Bin

Fastbin

大多数程序经常会申请以及释放比较小的内存块。如果将一些较小的chunk释放之后发现存
在与之相邻的空闲chunk并将它们进行合并,那么当下一次再次申请相应大小的chunk时,
就需要对chunk进行分割,这样就大大降低了堆的利用效率。因为我们把大部分时间花在了
合并、分割以及中间检查的过程中。因此,ptmalloc中专门设计了fastbin,对应的变量就是
malloc state中的fastbinY

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
/*
Fastbins

An array of lists holding recently freed small chunks. Fastbins
are not doubly linked. It is faster to single-link them, and
since chunks are never removed from the middles of these lists,
double linking is not necessary. Also, unlike regular bins, they
are not even processed in FIFO order (they use faster LIFO) since
ordering doesn't much matter in the transient contexts in which
fastbins are normally used.

Chunks in fastbins keep their inuse bit set, so they cannot
be consolidated with other free chunks. malloc_consolidate
releases all chunks in fastbins and consolidates them with
other free chunks.
*/
typedef struct malloc_chunk *mfastbinptr;

/*
This is in malloc_state.
/* Fastbins */
mfastbinptr fastbinsY[ NFASTBINS ];
*/

为了更加高效地利用fastbin,glibc采用单向链表对其中的每一个Bin进行组织,并且每个
bin采取LIFO,最近释放的chunk会更早地被分配,所以会增加适用于局部性。也就是说,
当用户需要的chunk的大小小于fastbin的最大大小时,ptmalloc会首先判断fastbin中相应
的bin中是否有对应大小的空闲块,如果有的话,就去直接从这个bin中获取chunk。如果
没有的话,Ptmalloc才会做接下来的一系列操作。

默认情况下(32位系统为例),fastbin中默认支持最大的chunk的数据空间大小为64字节。
但是其可以支持的chunk的数据空间最大为80字节。除此之外,fastbin最多可以支持的Bin
的个数为10个,从数据空间为8个字节开始一直到80字节(主义这里说的是数据空间大小,
也即除去prev_size和size字段部分的大小)定义如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
#define NFASTBINS (fastbin_index(request2size(MAX_FAST_SIZE)) + 1)

#ifndef DEFAULT_MXFAST
#define DEFAULT_MXFAST (64 * SIZE_SZ / 4)
#endif

/* The maximum fastbin request size we support */
#define MAX_FAST_SIZE (80 * SIZE_SZ / 4)

/*
Since the lowest 2 bits in max_fast don't matter in size comparisons,
they are used as flags.
*/

/*
FASTCHUNKS_BIT held in max_fast indicates that there are probably
some fastbin chunks. It is set true on entering a chunk into any
fastbin, and cleared only in malloc_consolidate.

The truth value is inverted so that have_fastchunks will be true
upon startup (since statics are zero-filled), simplifying
initialization checks.
*/
//判断分配区是否有 fast bin chunk,1表示没有
#define FASTCHUNKS_BIT (1U)

#define have_fastchunks(M) (((M)->flags & FASTCHUNKS_BIT) == 0)
#define clear_fastchunks(M) catomic_or(&(M)->flags, FASTCHUNKS_BIT)
#define set_fastchunks(M) catomic_and(&(M)->flags, ~FASTCHUNKS_BIT)

/*
NONCONTIGUOUS_BIT indicates that MORECORE does not return contiguous
regions. Otherwise, contiguity is exploited in merging together,
when possible, results from consecutive MORECORE calls.

The initial value comes from MORECORE_CONTIGUOUS, but is
changed dynamically if mmap is ever used as an sbrk substitute.
*/
// MORECORE是否返回连续的内存区域。
// 主分配区中的MORECORE其实为sbr(),默认返回连续虚拟地址空间
// 非主分配区使用mmap()分配大块虚拟内存,然后进行切分来模拟主分配区的行为
// 而默认情况下mmap映射区域是不保证虚拟地址空间连续的,所以非主分配区默认分配非连续虚拟地址空间。
#define NONCONTIGUOUS_BIT (2U)

#define contiguous(M) (((M)->flags & NONCONTIGUOUS_BIT) == 0)
#define noncontiguous(M) (((M)->flags & NONCONTIGUOUS_BIT) != 0)
#define set_noncontiguous(M) ((M)->flags |= NONCONTIGUOUS_BIT)
#define set_contiguous(M) ((M)->flags &= ~NONCONTIGUOUS_BIT)

/* ARENA_CORRUPTION_BIT is set if a memory corruption was detected on the
arena. Such an arena is no longer used to allocate chunks. Chunks
allocated in that arena before detecting corruption are not freed. */

#define ARENA_CORRUPTION_BIT (4U)

#define arena_is_corrupt(A) (((A)->flags & ARENA_CORRUPTION_BIT))
#define set_arena_corrupt(A) ((A)->flags |= ARENA_CORRUPTION_BIT)

/*
Set value of max_fast.
Use impossibly small value if 0.
Precondition: there are no existing fastbin chunks.
Setting the value clears fastchunk bit but preserves noncontiguous bit.
*/

#define set_max_fast(s) \
global_max_fast = \
(((s) == 0) ? SMALLBIN_WIDTH : ((s + SIZE_SZ) & ~MALLOC_ALIGN_MASK))
#define get_max_fast() global_max_fast

ptmalloc默认情况下会调用set_max_fast(s)将全局变量global_max_fast设置为DEFAULT_
MXFAST,也就是设置fast bins 中chunk的最大值。当MAX_FAST_SIZE被设置为0时,系统
就不会支持fastbin。

fastbin的索引

1
2
3
4
5
6
7
#define fastbin(ar_ptr, idx) ((ar_ptr)->fastbinsY[ idx ])

/* offset 2 to use otherwise unindexable first 2 bins */
// chunk size=2*size_sz*(2+idx)
// 这里要减2,否则的话,前两个bin没有办法索引到。
#define fastbin_index(sz) \
((((unsigned int) (sz)) >> (SIZE_SZ == 8 ? 4 : 3)) - 2)

需要特别注意的是,fastbin范围的chunk的inuse始终被置为1。因此它们不会和其它被释放
的chunk合并。
但是当释放的chunk与该chunk相邻的空闲chunk合并后的大小大于FASTBIN_CONSOLIDATION_
THRESHOLD时,内存碎片可能比较多了,我们就需要把fastbins中的chunk都进行合并,
以减少内存碎片对系统的影响。

1
2
3
4
5
6
7
8
9
10
11
12
/*
FASTBIN_CONSOLIDATION_THRESHOLD is the size of a chunk in free()
that triggers automatic consolidation of possibly-surrounding
fastbin chunks. This is a heuristic, so the exact value should not
matter too much. It is defined at half the default trim threshold as a
compromise heuristic to only attempt consolidation if it is likely
to lead to trimming. However, it is not dynamically tunable, since
consolidation reduces fragmentation surrounding large chunks even
if trimming is not used.
*/

#define FASTBIN_CONSOLIDATION_THRESHOLD (65536UL)

malloc_consolidate函数可以将fastbin中所有能和其它合并的chunk合并在一起。
具体地参见后续的详细函数的分析。

1
2
3
4
5
6
7
/*
Chunks in fastbins keep their inuse bit set, so they cannot
be consolidated with other free chunks. malloc_consolidate
releases all chunks in fastbins and consolidates them with
other free chunks.
*/

small bin
small bins中每个chunk的大小与其所在的bin的index的关系为:chunk_size=2*SIZE_SZ
*index,具体如下


下标 SIZE_SZ=4(32位) SIZE_SZ=8(64位)
2 16 32
3 24 48
4 32 64
5 40 80
x 24x 28x
63 504 1008


small bins中一共有62个循环双向链表,每个链表中存储的chunk大小都一致。比如对于32
位系统来说,下标2对应的双向链表中存储的chunk大小均为16个字节。每个链表都有链表头
结点,这样可以方便对于链表内部结点的管理。此外,small bins中每个Bin对应的链表采用
FIFO的规则,所以同一个链表中先被释放的chunk会先被分配出去。

small bin相关的宏如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
#define NSMALLBINS 64
#define SMALLBIN_WIDTH MALLOC_ALIGNMENT
// 是否需要对small bin的下标进行纠正
#define SMALLBIN_CORRECTION (MALLOC_ALIGNMENT > 2 * SIZE_SZ)

#define MIN_LARGE_SIZE ((NSMALLBINS - SMALLBIN_CORRECTION) * SMALLBIN_WIDTH)
//判断chunk的大小是否在small bin范围内
#define in_smallbin_range(sz) \
((unsigned long) (sz) < (unsigned long) MIN_LARGE_SIZE)
// 根据chunk的大小得到small bin对应的索引。
#define smallbin_index(sz) \
((SMALLBIN_WIDTH == 16 ? (((unsigned) (sz)) >> 4) \
: (((unsigned) (sz)) >> 3)) + \
SMALLBIN_CORRECTION)

或许,大家会很疑惑,那fastbin与smallbin中的大小会有很大一部分重合啊,那smallbin中
对应大小的bin是不是就没有什么作用啊?其实不然,fastbin中的chunk是有可能被放到
smallbin中去的,我们在后面分析具体的源代码时会有深刻体会。

Largebin

largebins中一共包括63个bin,每一个bin中的chunk的大小不一致,而是处于一定区间范围
内。此外,这63个bin被分成了6组,每组bin中的chunk大小之间的公差一致,具体如下:


组 数量 公差
1 32 64B
2 16 512B
3 8 4096B
4 4 32768B
5 2 262144B
6 1 不限制


这里我们以32位平台的largebin为例,第一个largebin的起始chunk大小为512个字节,位于
第一组,所以该bin可以存储的chunk的大小范围为[512,512+64)

关于large bin的宏如下,这里我们以32位平台下,第一个largebin的起始chunk大小为例子,
为512字节,那么512>>6=8,所以其下标为56+8=64。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
#define largebin_index_32(sz)                                                  \
(((((unsigned long) (sz)) >> 6) <= 38) \
? 56 + (((unsigned long) (sz)) >> 6) \
: ((((unsigned long) (sz)) >> 9) <= 20) \
? 91 + (((unsigned long) (sz)) >> 9) \
: ((((unsigned long) (sz)) >> 12) <= 10) \
? 110 + (((unsigned long) (sz)) >> 12) \
: ((((unsigned long) (sz)) >> 15) <= 4) \
? 119 + (((unsigned long) (sz)) >> 15) \
: ((((unsigned long) (sz)) >> 18) <= 2) \
? 124 + (((unsigned long) (sz)) >> 18) \
: 126)

#define largebin_index_32_big(sz) \
(((((unsigned long) (sz)) >> 6) <= 45) \
? 49 + (((unsigned long) (sz)) >> 6) \
: ((((unsigned long) (sz)) >> 9) <= 20) \
? 91 + (((unsigned long) (sz)) >> 9) \
: ((((unsigned long) (sz)) >> 12) <= 10) \
? 110 + (((unsigned long) (sz)) >> 12) \
: ((((unsigned long) (sz)) >> 15) <= 4) \
? 119 + (((unsigned long) (sz)) >> 15) \
: ((((unsigned long) (sz)) >> 18) <= 2) \
? 124 + (((unsigned long) (sz)) >> 18) \
: 126)

// XXX It remains to be seen whether it is good to keep the widths of
// XXX the buckets the same or whether it should be scaled by a factor
// XXX of two as well.
#define largebin_index_64(sz) \
(((((unsigned long) (sz)) >> 6) <= 48) \
? 48 + (((unsigned long) (sz)) >> 6) \
: ((((unsigned long) (sz)) >> 9) <= 20) \
? 91 + (((unsigned long) (sz)) >> 9) \
: ((((unsigned long) (sz)) >> 12) <= 10) \
? 110 + (((unsigned long) (sz)) >> 12) \
: ((((unsigned long) (sz)) >> 15) <= 4) \
? 119 + (((unsigned long) (sz)) >> 15) \
: ((((unsigned long) (sz)) >> 18) <= 2) \
? 124 + (((unsigned long) (sz)) >> 18) \
: 126)

#define largebin_index(sz) \
(SIZE_SZ == 8 ? largebin_index_64(sz) : MALLOC_ALIGNMENT == 16 \
? largebin_index_32_big(sz) \
: largebin_index_32(sz))

unsorted bin

unsorted bin可以视为空闲chunk回归其所属bin之前的缓冲区。
其在glibc中具体的说明如下

1
2
3
4
5
6
7
8
9
10
11
12
13
/*
Unsorted chunks

All remainders from chunk splits, as well as all returned chunks,
are first placed in the "unsorted" bin. They are then placed
in regular bins after malloc gives them ONE chance to be used before
binning. So, basically, the unsorted_chunks list acts as a queue,
with chunks being placed on it in free (and malloc_consolidate),
and taken off (to be either used or placed in bins) in malloc.

The NON_MAIN_ARENA flag is never set for unsorted chunks, so it
does not have to be taken into account in size comparisons.
*/

从下面的宏我们可以看出

1
2
/* The otherwise unindexable 1-bin is used to hold unsorted chunks. */
#define unsorted_chunks(M) (bin_at(M, 1))

unsorted bin 处于我们之前所说的bin数组下标1处。故而unsortedbin只有一个链表。
unsorted bin中空闲的chunk处于乱序状态,主要有两个来源
①当一个较大的chunk被分割成两半后,如果剩下的部分大于MINSIZE,就会被放到unsorted
bin中。
②释放一个不属于fast bin的chunk,并且该chunk不和top chunk紧邻时,该chunk会被放到
unsorted bin中。关于top chunk的解释,请参考下面的介绍。

此外,Unsorted bin在使用的过程中,采用的遍历顺序是FIFO。

common macro

这里介绍一些通用的宏。
根据chunk的大小统一地获得chunk所在的索引

1
2
#define bin_index(sz)                                                          \
((in_smallbin_range(sz)) ? smallbin_index(sz) : largebin_index(sz))

本博客所有文章除特别声明外,均采用 CC BY-SA 4.0 协议 ,转载请注明出处!