mark & sweep. A GC cycle is guaranteed at least once every 2 minutes; otherwise a collection is triggered once total allocated memory exceeds the amount in use after the previous GC by a configurable ratio (100% by default, i.e. GOGC=100).
While marking is in progress all Gs are stopped (stop-the-world); the marking work itself is carried out in parallel by multiple workers.
Sweeping is spread out: spans are swept lazily, piece by piece, after the mark phase.
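Purely as a user-side illustration (not the runtime internals quoted below): the growth-ratio trigger can be observed with runtime.ReadMemStats and tuned with runtime/debug.SetGCPercent, and runtime.GC forces a blocking collection; GODEBUG=gctrace=1 prints a line per cycle.

package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

func main() {
	// GOGC / SetGCPercent controls the "heap grew by N% since the last GC" trigger.
	debug.SetGCPercent(100) // the default: collect when the live heap roughly doubles

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	fmt.Printf("heap in use: %d B, next GC when heap reaches: %d B\n", ms.HeapAlloc, ms.NextGC)

	// Allocate past the trigger; the collector runs automatically.
	var keep [][]byte
	for i := 0; i < 64; i++ {
		keep = append(keep, make([]byte, 1<<20))
	}

	runtime.GC() // an explicit, blocking collection is also possible
	runtime.ReadMemStats(&ms)
	fmt.Printf("GC cycles so far: %d (still holding %d MB)\n", ms.NumGC, len(keep))
}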
The memory blocks covered by the collector include .data, .bss, every G's stack, and the SpecialFinalizer records.
Each block of memory has a corresponding bitmap holding the flag bits for every word (8 bytes); each word needs 4 flag bits.
Marking is a recursive traversal of these memory blocks driven by their bitmaps.
A word's flag bits fall into the following 3 categories (BitsScalar, BitsPointer and BitsMultiWord in the code below):
Two consecutive words together describe one iface or eface (the BitsMultiWord case).
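The bitmap sits just below arena_start and grows downward; with 8-byte words, wordsPerBitmapByte = 2 and gcBits = 4, so one bitmap byte describes two heap words. A rough Go sketch of the address-to-bitmap arithmetic repeated in scanblock and MSpan_Sweep below (bitmapPos and its constants are illustrative, not runtime APIs):

// Sketch only: mirrors the off/bitp/shift arithmetic in the runtime code below.
package gcbitmap

const (
	ptrSize            = 8 // word size on amd64
	wordsPerBitmapByte = 2 // one bitmap byte describes two heap words
	gcBits             = 4 // flag bits per heap word
)

// bitmapPos reports which bitmap byte (counted downward from arena_start-1) and
// which bit shift inside that byte hold the flags for the heap word at addr.
func bitmapPos(arenaStart, addr uintptr) (bytesBelowArenaStart uintptr, shift uint) {
	off := (addr - arenaStart) / ptrSize              // word index within the arena
	bytesBelowArenaStart = off/wordsPerBitmapByte + 1 // bitp = arena_start - off/2 - 1
	shift = uint(off%wordsPerBitmapByte) * gcBits     // 0 or 4
	return
}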
// scanblock scans a block of n bytes starting at pointer b for references
// to other objects, scanning any it finds recursively until there are no
// unscanned objects left. Instead of using an explicit recursion, it keeps
// a work list in the Workbuf* structures and loops in the main function
// body. Keeping an explicit work list is easier on the stack allocator and
// more efficient.
static void
scanblock(byte *b, uintptr n, byte *ptrmask)
{
byte *obj, *obj0, *p, *arena_start, *arena_used, **wp, *scanbuf[8], *ptrbitp, *bitp;
uintptr i, j, nobj, size, idx, x, off, scanbufpos, bits, xbits, shift;
Workbuf *wbuf;
Iface *iface;
Eface *eface;
Type *typ;
MSpan *s;
PageID k;
bool keepworking;
// Cache memory arena parameters in local vars.
arena_start = runtime·mheap.arena_start;
arena_used = runtime·mheap.arena_used;
wbuf = getempty(nil);
nobj = wbuf->nobj;
wp = &wbuf->obj[nobj];
keepworking = b == nil;
scanbufpos = 0;
for(i = 0; i < nelem(scanbuf); i++)
scanbuf[i] = nil;
ptrbitp = nil;
// ptrmask can have 2 possible values:
// 1. nil - obtain pointer mask from GC bitmap.
// 2. pointer to a compact mask (for stacks and data).
if(b != nil)
goto scanobj;
for(;;) {
if(nobj == 0) {
// Out of work in workbuf.
// First, see if there is any work in scanbuf.
for(i = 0; i < nelem(scanbuf); i++) {
b = scanbuf[scanbufpos];
scanbuf[scanbufpos++] = nil;
scanbufpos %= nelem(scanbuf);
if(b != nil) {
n = arena_used - b; // scan until bitBoundary or BitsDead
ptrmask = nil; // use GC bitmap for pointer info
goto scanobj;
}
}
if(!keepworking) {
putempty(wbuf);
return;
}
// Refill workbuf from global queue.
wbuf = getfull(wbuf);
if(wbuf == nil)
return;
nobj = wbuf->nobj;
wp = &wbuf->obj[nobj];
}
// If another proc wants a pointer, give it some.
if(runtime·work.nwait > 0 && nobj > 4 && runtime·work.full == 0) {
wbuf->nobj = nobj;
wbuf = handoff(wbuf);
nobj = wbuf->nobj;
wp = &wbuf->obj[nobj];
}
wp--;
nobj--;
b = *wp;
n = arena_used - b; // scan until next bitBoundary or BitsDead
ptrmask = nil; // use GC bitmap for pointer info
scanobj:
if(DebugPtrs)
runtime·printf("scanblock %p %p %p\n", b, n, ptrmask);
// Find bits of the beginning of the object.
if(ptrmask == nil) {
off = (uintptr*)b - (uintptr*)arena_start;
ptrbitp = arena_start - off/wordsPerBitmapByte - 1;
}
for(i = 0; i < n; i += PtrSize) {
obj = nil;
// Find bits for this word.
......
if(bits <= BitsScalar) // BitsScalar || BitsDead
continue;
if(bits == BitsPointer) {
obj = *(byte**)(b+i);
obj0 = obj;
goto markobj;
}
// With those three out of the way, must be multi-word.
if(Debug && bits != BitsMultiWord)
runtime·throw("unexpected garbage collection bits");
// Find the next pair of bits.
if(ptrmask == nil) {
bits = *ptrbitp;
j = ((uintptr)b+i+PtrSize)/PtrSize & 1;
ptrbitp -= j;
bits >>= gcBits*j;
bits = (bits>>2)&BitsMask;
} else
bits = (ptrmask[((i+PtrSize)/PtrSize)/4]>>((((i+PtrSize)/PtrSize)%4)*BitsPerPointer))&BitsMask;
if(Debug && bits != BitsIface && bits != BitsEface)
runtime·throw("unexpected garbage collection bits");
if(bits == BitsIface) {
iface = (Iface*)(b+i);
if(iface->tab != nil) {
typ = iface->tab->type;
if(!(typ->kind&KindDirectIface) || !(typ->kind&KindNoPointers))
obj = iface->data;
}
} else {
eface = (Eface*)(b+i);
typ = eface->type;
if(typ != nil) {
if(!(typ->kind&KindDirectIface) || !(typ->kind&KindNoPointers))
obj = eface->data;
}
}
i += PtrSize;
obj0 = obj;
markobj:
// At this point we have extracted the next potential pointer.
// Check if it points into heap.
if(obj == nil)
continue;
if(obj < arena_start || obj >= arena_used) {
if((uintptr)obj < PhysPageSize && runtime·invalidptr) {
s = nil;
goto badobj;
}
continue;
}
// Mark the object.
obj = (byte*)((uintptr)obj & ~(PtrSize-1));
off = (uintptr*)obj - (uintptr*)arena_start;
bitp = arena_start - off/wordsPerBitmapByte - 1;
shift = (off % wordsPerBitmapByte) * gcBits;
xbits = *bitp;
bits = (xbits >> shift) & bitMask;
if((bits&bitBoundary) == 0) {
// Not a beginning of a block, consult span table to find the block beginning.
......
obj = p;
goto markobj;
}
if(DebugPtrs)
runtime·printf("scan *%p = %p => base %p\n", b i, obj0, obj);
if(nbadblock > 0 && (uintptr)obj == badblock[nbadblock-1]) {
// Running garbage collection again because
// we want to find the path from a root to a bad pointer.
// Found possible next step; extend or finish path.
for(j=0; j<nbadblock; j++)
if(badblock[j] == (uintptr)b)
goto AlreadyBad;
runtime·printf("runtime: found *(%p %p) = %p %p\n", b, i, obj0, (uintptr)(obj-obj0));
if(ptrmask != nil)
runtime·throw("bad pointer");
if(nbadblock >= nelem(badblock))
runtime·throw("badblock trace too long");
badblock[nbadblock++] = (uintptr)b;
AlreadyBad:;
}
// Now we have bits, bitp, and shift correct for
// obj pointing at the base of the object.
// Only care about not marked objects.
if((bits&bitMarked) != 0)
continue;
// If obj size is greater than 8, then each byte of GC bitmap
// contains info for at most one object. In such case we use
// non-atomic byte store to mark the object. This can lead
// to double enqueue of the object for scanning, but scanning
// is an idempotent operation, so it is OK. This cannot lead
// to bitmap corruption because the single marked bit is the
// only thing that can change in the byte.
// For 8-byte objects we use non-atomic store, if the other
// quadruple is already marked. Otherwise we resort to CAS
// loop for marking.
if((xbits&(bitMask|(bitMask<<gcBits))) != (bitBoundary|(bitBoundary<<gcBits)))
*bitp = xbits | (bitMarked<<shift);
else
runtime·atomicor8(bitp, bitMarked<<shift);
if(((xbits>>(shift+2))&BitsMask) == BitsDead)
continue; // noscan object
// Queue the obj for scanning.
PREFETCH(obj);
p = scanbuf[scanbufpos];
scanbuf[scanbufpos++] = obj;
scanbufpos %= nelem(scanbuf);
if(p == nil)
continue;
// If workbuf is full, obtain an empty one.
if(nobj >= nelem(wbuf->obj)) {
wbuf->nobj = nobj;
wbuf = getempty(wbuf);
nobj = wbuf->nobj;
wp = &wbuf->obj[nobj];
}
*wp = p;
wp++;
nobj++;
}
............
}
}
static void
markroot(ParFor *desc, uint32 i)
{
FinBlock *fb;
MSpan *s;
uint32 spanidx, sg;
G *gp;
void *p;
uint32 status;
bool restart;
USED(&desc);
// Note: if you add a case here, please also update heapdump.c:dumproots.
switch(i) {
case RootData:
scanblock(runtime·data, runtime·edata - runtime·data, runtime·gcdatamask.bytedata);
break;
case RootBss:
scanblock(runtime·bss, runtime·ebss - runtime·bss, runtime·gcbssmask.bytedata);
break;
case RootFinalizers:
for(fb=runtime·allfin; fb; fb=fb->alllink)
scanblock((byte*)fb->fin, fb->cnt*sizeof(fb->fin[0]), finptrmask);
break;
case RootSpans:
// mark MSpan.specials
......
break;
case RootFlushCaches:
flushallmcaches();
break;
default:
// the rest is scanning goroutine stacks
if(i - RootCount >= runtime·allglen)
runtime·throw("markroot: bad index");
gp = runtime·allg[i - RootCount];
// remember when we've first observed the G blocked
// needed only to output in traceback
status = runtime·readgstatus(gp);
if((status == Gwaiting || status == Gsyscall) && gp->waitsince == 0)
gp->waitsince = runtime·work.tstart;
// Shrink a stack if not much of it is being used.
runtime·shrinkstack(gp);
if(runtime·readgstatus(gp) == Gdead)
gp->gcworkdone = true;
else
gp->gcworkdone = false;
restart = runtime·stopg(gp);
scanstack(gp);
if(restart)
runtime·restartg(gp);
break;
}
}
Sweep checks every object in a span for the mark bit and pushes the unmarked ones onto the span's freelist.
If every object in the span ends up on the freelist, the span's memory is returned to the heap.
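A minimal Go sketch of that idea, using hypothetical span/object types in place of the runtime's MSpan and MLink (the real sweep below works on the GC bitmap rather than a per-slot bool array):

// Sketch only: hypothetical types, not the runtime's MSpan/MLink structures.
package sweepsketch

type object struct{ next *object }

type span struct {
	objects  []*object // object slots in this span
	marked   []bool    // mark bit per slot, set during the mark phase
	freelist *object
}

// sweepSpan pushes every unmarked object onto the span's freelist, clears the
// mark bits for the next cycle, and reports whether the whole span became free
// (in which case the caller could return the span's memory to the heap).
func sweepSpan(s *span) bool {
	nfree := 0
	for i, obj := range s.objects {
		if s.marked[i] {
			s.marked[i] = false // survivor: keep it, clear the mark for next GC
			continue
		}
		obj.next = s.freelist // garbage: splice onto the freelist
		s.freelist = obj
		nfree++
	}
	return nfree == len(s.objects)
}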
// sweeps one span
// returns number of pages returned to heap, or -1 if there is nothing to sweep
uintptr
runtime·sweepone(void)
{
MSpan *s;
uint32 idx, sg;
uintptr npages;
// increment locks to ensure that the goroutine is not preempted
// in the middle of sweep thus leaving the span in an inconsistent state for next GC
g->m->locks++;
sg = runtime·mheap.sweepgen;
for(;;) {
idx = runtime·xadd(&runtime·sweep.spanidx, 1) - 1;
if(idx >= runtime·work.nspan) {
runtime·mheap.sweepdone = true;
g->m->locks--;
return -1;
}
s = runtime·work.spans[idx];
if(s->state != MSpanInUse) {
s->sweepgen = sg;
continue;
}
if(s->sweepgen != sg-2 || !runtime·cas(&s->sweepgen, sg-2, sg-1))
continue;
npages = s->npages;
if(!runtime·MSpan_Sweep(s, false))
npages = 0;
g->m->locks--;
return npages;
}
}
// Sweep frees or collects finalizers for blocks not marked in the mark phase.
// It clears the mark bits in preparation for the next GC round.
// Returns true if the span was returned to heap.
// If preserve=true, don't return it to heap nor relink in MCentral lists;
// caller takes care of it.
bool
runtime·MSpan_Sweep(MSpan *s, bool preserve)
{
int32 cl, n, npages, nfree;
uintptr size, off, step;
uint32 sweepgen;
byte *p, *bitp, shift, xbits, bits;
MCache *c;
byte *arena_start;
MLink head, *end, *link;
Special *special, **specialp, *y;
bool res, sweepgenset;
// It's critical that we enter this function with preemption disabled,
// GC must not start while we are in the middle of this function.
if(g->m->locks == 0 && g->m->mallocing == 0 && g != g->m->g0)
runtime·throw("MSpan_Sweep: m is not locked");
sweepgen = runtime·mheap.sweepgen;
if(s->state != MSpanInUse || s->sweepgen != sweepgen-1) {
runtime·printf("MSpan_Sweep: state=%d sweepgen=%d mheap.sweepgen=%d\n",
s->state, s->sweepgen, sweepgen);
runtime·throw("MSpan_Sweep: bad span state");
}
arena_start = runtime·mheap.arena_start;
cl = s->sizeclass;
size = s->elemsize;
if(cl == 0) {
n = 1;
} else {
// Chunk full of small blocks.
npages = runtime·class_to_allocnpages[cl];
n = (npages << PageShift) / size;
}
res = false;
nfree = 0;
end = &head;
c = g->m->mcache;
sweepgenset = false;
// Mark any free objects in this span so we don't collect them.
for(link = s->freelist; link != nil; link = link->next) {
off = (uintptr*)link - (uintptr*)arena_start;
bitp = arena_start - off/wordsPerBitmapByte - 1;
shift = (off % wordsPerBitmapByte) * gcBits;
*bitp |= bitMarked<<shift;
}
// Unlink & free special records for any objects we're about to free.
specialp = &s->specials;
special = *specialp;
while(special != nil) {
// A finalizer can be set for an inner byte of an object, find object beginning.
p = (byte*)((s->start << PageShift) + special->offset/size*size);
off = (uintptr*)p - (uintptr*)arena_start;
bitp = arena_start - off/wordsPerBitmapByte - 1;
shift = (off % wordsPerBitmapByte) * gcBits;
bits = (*bitp>>shift) & bitMask;
if((bits&bitMarked) == 0) {
// Find the exact byte for which the special was setup
// (as opposed to object beginning).
p = (byte*)((s->start << PageShift) + special->offset);
// about to free object: splice out special record
y = special;
special = special->next;
*specialp = special;
if(!runtime·freespecial(y, p, size, false)) {
// stop freeing of object if it has a finalizer
*bitp |= bitMarked << shift;
}
} else {
// object is still live: keep special record
specialp = &special->next;
special = *specialp;
}
}
// Sweep through n objects of given size starting at p.
// This thread owns the span now, so it can manipulate
// the block bitmap without atomic operations.
p = (byte*)(s->start << PageShift);
off = (uintptr*)p - (uintptr*)arena_start;
bitp = arena_start - off/wordsPerBitmapByte - 1;
shift = 0;
step = size/(PtrSize*wordsPerBitmapByte);
// Rewind to the previous quadruple as we move to the next
// in the beginning of the loop.
bitp += step;
if(step == 0) {
// 8-byte objects.
bitp++;
shift = gcBits;
}
for(; n > 0; n--, p += size) {
bitp -= step;
if(step == 0) {
if(shift != 0)
bitp--;
shift = gcBits - shift;
}
xbits = *bitp;
bits = (xbits>>shift) & bitMask;
// Allocated and marked object, reset bits to allocated.
if((bits&bitMarked) != 0) {
*bitp &= ~(bitMarked<<shift);
continue;
}
// At this point we know that we are looking at garbage object
// that needs to be collected.
if(runtime·debug.allocfreetrace)
runtime·tracefree(p, size);
// Reset to allocated+noscan.
*bitp = (xbits & ~((bitMarked|(BitsMask<<2))<<shift)) | ((uintptr)BitsDead<<(shift+2));
if(cl == 0) {
// Free large span.
if(preserve)
runtime·throw("can't preserve large span");
runtime·unmarkspan(p, s->npages<<PageShift);
// important to set sweepgen before returning it to heap
runtime·atomicstore(&s->sweepgen, sweepgen);
sweepgenset = true;
// NOTE(rsc,dvyukov): The original implementation of efence
// in CL 22060046 used SysFree instead of SysFault, so that
// the operating system would eventually give the memory
// back to us again, so that an efence program could run
// longer without running out of memory. Unfortunately,
// calling SysFree here without any kind of adjustment of the
// heap data structures means that when the memory does
// come back to us, we have the wrong metadata for it, either in
// the MSpan structures or in the garbage collection bitmap.
// Using SysFault here means that the program will run out of
// memory fairly quickly in efence mode, but at least it won't
// have mysterious crashes due to confused memory reuse.
// It should be possible to switch back to SysFree if we also
// implement and then call some kind of MHeap_DeleteSpan.
if(runtime·debug.efence) {
s->limit = nil; // prevent mlookup from finding this span
runtime·SysFault(p, size);
} else
runtime·MHeap_Free(&runtime·mheap, s, 1);
c->local_nlargefree++;
c->local_largefree += size;
runtime·xadd64(&mstats.next_gc, -(uint64)(size * (runtime·gcpercent + 100)/100));
res = true;
} else {
// Free small object.
if(size > 2*sizeof(uintptr))
((uintptr*)p)[1] = (uintptr)0xdeaddeaddeaddeadll; // mark as "needs to be zeroed"
else if(size > sizeof(uintptr))
((uintptr*)p)[1] = 0;
end->next = (MLink*)p;
end = (MLink*)p;
nfree++;
}
}
// We need to set s->sweepgen = h->sweepgen only when all blocks are swept,
// because of the potential for a concurrent free/SetFinalizer.
// But we need to set it before we make the span available for allocation
// (return it to heap or mcentral), because allocation code assumes that a
// span is already swept if available for allocation.
if(!sweepgenset && nfree == 0) {
// The span must be in our exclusive ownership until we update sweepgen,
// check for potential races.
if(s->state != MSpanInUse || s->sweepgen != sweepgen-1) {
runtime·printf("MSpan_Sweep: state=%d sweepgen=%d mheap.sweepgen=%d\n",
s->state, s->sweepgen, sweepgen);
runtime·throw("MSpan_Sweep: bad span state after sweep");
}
runtime·atomicstore(&s->sweepgen, sweepgen);
}
if(nfree > 0) {
c->local_nsmallfree[cl] += nfree;
c->local_cachealloc -= nfree * size;
runtime·xadd64(&mstats.next_gc, -(uint64)(nfree * size * (runtime·gcpercent + 100)/100));
res = runtime·MCentral_FreeSpan(&runtime·mheap.central[cl].mcentral, s, nfree, head.next, end, preserve);
// MCentral_FreeSpan updates sweepgen
}
return res;
}