12

A Deep dive into OpenBSD malloc(3) internals

 4 years ago
source link: https://bsdb0y.github.io/blog/deep-dive-into-the-OpenBSD-malloc-and-friends-internals-part-1.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Tale of OpenBSD secure memory allocator internals - malloc(3)

nEzYbye.png!web Fig.1 malloc debugging with gdb

Hi there,

It's been a very long time I haven't written anything after my last OpenBSD blogs, that is,

So, again I started reading OpenBSD source codes with debugger after reducing my sleep timings and managing to get some time after professional life. This time I have picked one of my favourite item from my wishlist to learn and share, that is, OpenBSDmalloc(3), secure allocator

I will try to keep it as n part series due to lengthy content and this series will be mostly focussed on user-space code ofmalloc(3) and friends

First of all, I would like to thanks Otto Moerbeek , Bryan Steele and Fabien Romano for helping me to understand the malloc(3) internals and cleared all my queries.

So, we should start now... :)

I have used the following sample code to start my journey with the OpenBSD 6.6-stable malloc(3)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char **argv) {
	char *buff1 = NULL;
	buff1 = (char *)malloc(8);
	strcpy(buff1, argv[1]);
	free(buff1);
	return 0;
}

sample code used for learning internals

Compiled the libc with debug symbols and switched off the optimization by compiling it with clang option "-O0 -g".

Followed the below steps to compile the libc with debug symbols

1. cd $openbsd_src_directory
2. cd lib/libc
3. CFLAGS="-g -O0" CXXFLAGS="-g -O0" make obj
4. CFLAGS="-g -O0" CXXFLAGS="-g -O0" make

I haven't installed it and used gdb-style debugging approach for code reading instead of debugging with printf style debugging.

For printf style debugging, one can use dprintf(3) or write(2) calls to print anything but keep in mind that after installation of libc with printf statements it will dump lots of information for each and every binary, so installation step is no use for malloc(3) debugging.

I would say one can either compile and use LD_PRELOAD to load the compiled libc and use debugger or one can compile with printf statements then use the same LD_PRELOAD for that specific sample code.

So, just after the malloc(3) from the sample code, it directly jumps to function malloc(size_t size) from file /usr/src/lib/libc/stdlib/malloc.c:1278

1277 void *
1278 malloc(size_t size)
1279 {
1280         void *r;
1281         struct dir_info *d;
1282         int saved_errno = errno;
1283
1284         PROLOGUE(getpool(), "malloc")
1285         r = omalloc(d, size, 0, CALLER);
1286         EPILOGUE()
1287         return r;
1288 }

code snippet #00

As explained by the developer Otto@ in the tweet on twitter that struct dir_info contains all the meta information malloc needs to keep track of what page regions have been allocated, which pages regions are in the free-cache and for pages of chunks which chunks are free and which are allocated.

So, as per the code given above, one can see after some basic initialization-declaration, there is some macro PROLOGUE and EPILOGUE. It means the same like how it sounds, usually these two means,

prologue:

Function prologue . In assembly language programming, the function prologue is a few lines of code at the beginning of a function, which prepare the stack and registers for use within the function. Here, instead of preparing the stack and registers, it initializes some other necessary malloc requirements that we will see later.

epilogue:

Function epilogue reverses the actions of the function prologue and returns control to the calling function.

PROLGOUE macro given below:

1256 #define PROLOGUE(p, fn)                 \
1257         d = (p);                        \
1258         if (d == NULL) {                \
1259                 _malloc_init(0);        \
1260                 d = (p);                \
1261         }                               \
1262         _MALLOC_LOCK(d->mutex);         \
1263         d->func = fn;                   \
1264         if (d->active++) {              \
1265                 malloc_recurse(d);      \
1266                 return NULL;            \
1267         }                               \

code snippet #01

As from the, we can see that p is getpool() and fn is the function string, that is, "malloc" and from the, we can see that it first calls getpool() function and the code for getpool() given below

270 static inline struct dir_info *
271 getpool(void)
272 {
273         if (!mopts.malloc_mt)
274                 return mopts.malloc_pool[1];
275         else    /* first one reserved for special pool */
276                 return mopts.malloc_pool[1 + TIB_GET()->tib_tid %
277                     (mopts.malloc_mutexes - 1)];
278 }

code snippet #02

In thewe can see that first it checks whether it has multi-threads or single-threads. And, in our case it is single threaded binary, so after if it returns mopts.malloc_pool[1] which is NULL .

Now, after that from theit checks whether d is NULL or not. In our case it is NULL as at first it will always be NULL after that it calls _malloc_init(0) but when d is not NULL then it assigns fn to d->func then check and increment the d->active then calls malloc_recurse(d) and returns NULL but for our situation it is not the flow.

Code for _malloc_init(0) given below

1207 void
1208 _malloc_init(int from_rthreads)
1209 {
1210         u_int i, nmutexes;
1211         struct dir_info *d;
1212 
1213         _MALLOC_LOCK(1);
1214         if (!from_rthreads && mopts.malloc_pool[1]) {
1215                 _MALLOC_UNLOCK(1);
1216                 return;
1217         }
1218         if (!mopts.malloc_canary)
1219                 omalloc_init();
1220 
1221         nmutexes = from_rthreads ? mopts.malloc_mutexes : 2;
1222         if (((uintptr_t)&malloc_readonly & MALLOC_PAGEMASK) == 0)
1223                 mprotect(&malloc_readonly, sizeof(malloc_readonly),
1224                     PROT_READ | PROT_WRITE);
1225         for (i = 0; i < nmutexes; i++) {
1226                 if (mopts.malloc_pool[i])
1227                         continue;
1228                 if (i == 0) {
1229                         omalloc_poolinit(&d, MAP_CONCEAL);
1230                         d->malloc_junk = 2;
1231                         d->malloc_cache = 0;
1232                 } else {
1233                         omalloc_poolinit(&d, 0);
1234                         d->malloc_junk = mopts.def_malloc_junk;
1235                         d->malloc_cache = mopts.def_malloc_cache;
1236                 }
1237                 d->mutex = i;
1238                 mopts.malloc_pool[i] = d;
1239         }
1240 
1241         if (from_rthreads)
1242                 mopts.malloc_mt = 1;
1243         else
1244                 mopts.internal_funcs = 1;
1245 
1246         /*
1247          * Options have been set and will never be reset.
1248          * Prevent further tampering with them.
1249          */
1250         if (((uintptr_t)&malloc_readonly & MALLOC_PAGEMASK) == 0)
1251                 mprotect(&malloc_readonly, sizeof(malloc_readonly), PROT_READ);
1252         _MALLOC_UNLOCK(1);
1253 }

code snippet #03

In the, we can see that first it does some MALLOC_LOCKing and checks for the from_rthreads and also for mopts.malloc_pool[1] as an outcome it is not true due to mopts.malloc_pool[1] which is NULL

Then, after that it checks for mopts.malloc_canary and it wil be always 0 first time due to structure initialized to zero (0) and then it calls function omalloc_init() and code for the same given below

405	static void
406	omalloc_init(void)
407	{
408		char *p, *q, b[16];
409		int i, j, mib[2];
410		size_t sb;
411
412		/*
413		 * Default options
414		 */
415		mopts.malloc_mutexes = 8;
416		mopts.def_malloc_junk = 1;
417		mopts.def_malloc_cache = MALLOC_DEFAULT_CACHE;
418
419		for (i = 0; i < 3; i++) {
420			switch (i) {
421			case 0:
422				mib[0] = CTL_VM;
423				mib[1] = VM_MALLOC_CONF;
424				sb = sizeof(b);
425				j = sysctl(mib, 2, b, &sb, NULL, 0);
426				if (j != 0)
427					continue;
428				p = b;
429				break;
430			case 1:
431				if (issetugid() == 0)
432					p = getenv("MALLOC_OPTIONS");
433				else
434					continue;
435				break;
436			case 2:
437				p = malloc_options;
438				break;
439			default:
440				p = NULL;
441			}
442
443			for (; p != NULL && *p != '\0'; p++) {
444				switch (*p) {
445				case 'S':
446					for (q = "CFGJ"; *q != '\0'; q++)
447						omalloc_parseopt(*q);
448					mopts.def_malloc_cache = 0;
449					break;
450				case 's':
451					for (q = "cfgj"; *q != '\0'; q++)
452						omalloc_parseopt(*q);
453					mopts.def_malloc_cache = MALLOC_DEFAULT_CACHE;
454					break;
455				default:
456					omalloc_parseopt(*p);
457					break;
458				}
459			}
460		}
461
462	#ifdef MALLOC_STATS
463		if (mopts.malloc_stats && (atexit(malloc_exit) == -1)) {
464			dprintf(STDERR_FILENO, "malloc() warning: atexit(2) failed."
465			    " Will not be able to dump stats on exit\n");
466		}
467	#endif /* MALLOC_STATS */
468
469		while ((mopts.malloc_canary = arc4random()) == 0)
470			;
471	}

code snippet #04

OpenBSD malloc(3) has lots of feature and they are configurable using systcl(8) , environment variable MALLOC_OPTIONS and compile-time option malloc_options

So, let's suppose one want to use the canary option then one has to set the option using systcl(8), like I have used it from the command: systcl vm.malloc_conf=C

omalloc_init() does the following things:

  • Initializes some variables of malloc_readonly structure to default values like mopts.malloc_mutexes is the default number of mutexes, mopts.def_malloc_junk is the default number of junk filling and mopts.def_malloc_cache is the default number of free page cache

  • Then, it checks one by one for all 3 ways that is mentioned above for setting a malloc option. In our case, I have used systcl(8) option, so for sysctl it goes to case 0: and get the value to char b[16] and then it assigns to pointer to character variable p then after looping for other two it goes for extracting the value that is there in p

  • First it checks for the malloc option S which enables all the security auditing features of OpenBSD malloc(3)

    • If it is there then it calls omalloc_parseopt(*q); function then after that it sets mopts.def_malloc_cache to 0 or to MALLOC_DEFAULT_CACHE , depends on whether it is S or s
    • If it is not there, then it simply calls the function omalloc_parseopt(*p);

function omalloc_parseopt(char opt) extracts the character and in our case it is C for malloc canary, so after parsing, it goes to case 'C' and sets mopts.chunk_canaries to 1 , it does the same for other characters and sets or enables/initializes their variables. Code for the same function given below:

322	static void
323	omalloc_parseopt(char opt)
324	{
325		switch (opt) {
326		case '+':
327			mopts.malloc_mutexes <<= 1;
328			if (mopts.malloc_mutexes > _MALLOC_MUTEXES)
329				mopts.malloc_mutexes = _MALLOC_MUTEXES;
330			break;
331		case '-':
332			mopts.malloc_mutexes >>= 1;
333			if (mopts.malloc_mutexes < 2)
334				mopts.malloc_mutexes = 2;
335			break;
336		case '>':
337			mopts.def_malloc_cache <<= 1;
338			if (mopts.def_malloc_cache > MALLOC_MAXCACHE)
339				mopts.def_malloc_cache = MALLOC_MAXCACHE;
340			break;
341		case '<':
342			mopts.def_malloc_cache >>= 1;
343			break;
344		case 'c':
345			mopts.chunk_canaries = 0;
346			break;
347		case 'C':
348			mopts.chunk_canaries = 1;
349			break;
350	#ifdef MALLOC_STATS
351		case 'd':
352			mopts.malloc_stats = 0;
353			break;
354		case 'D':
355			mopts.malloc_stats = 1;
356			break;
357	#endif /* MALLOC_STATS */
358		case 'f':
359			mopts.malloc_freecheck = 0;
360			mopts.malloc_freeunmap = 0;
361			break;
362		case 'F':
363			mopts.malloc_freecheck = 1;
364			mopts.malloc_freeunmap = 1;
365			break;
366		case 'g':
367			mopts.malloc_guard = 0;
368			break;
369		case 'G':
370			mopts.malloc_guard = MALLOC_PAGESIZE;
371			break;
372		case 'j':
373			if (mopts.def_malloc_junk > 0)
374				mopts.def_malloc_junk--;
375			break;
376		case 'J':
377			if (mopts.def_malloc_junk < 2)
378				mopts.def_malloc_junk++;
379			break;
380		case 'r':
381			mopts.malloc_realloc = 0;
382			break;
383		case 'R':
384			mopts.malloc_realloc = 1;
385			break;
386		case 'u':
387			mopts.malloc_freeunmap = 0;
388			break;
389		case 'U':
390			mopts.malloc_freeunmap = 1;
391			break;
392		case 'x':
393			mopts.malloc_xmalloc = 0;
394			break;
395		case 'X':
396			mopts.malloc_xmalloc = 1;
397			break;
398		default:
399			dprintf(STDERR_FILENO, "malloc() warning: "
400	                    "unknown char in MALLOC_OPTIONS\n");
401			break;
402		}
403	}

code snippet #05

There are some ifdef MALLOC_STATS in the entire source, which means it dumps extra malloc stats information whenever requires. So, as from the, after completing omalloc_parseopt(char opt) , there is some MALLOC_STATS condition then after them, a random cookie assigns to mopts.malloc_canary using arc4random()

After the completion of omalloc_init() , coming again to, there is a variable named nmutexes which sets bydefault to 2 for single-threaded programs, we have already seen the initalization of mopts.malloc_mutexes default to 8 for multi-threaded programs in the function omalloc_init() . Here, for single-threaded, the value of variable from_rthread is 0

Then, it checks for MALLOC_PAGEMASK bit for malloc_readonly address or simply we can say that it is checking the last 12 bits of the malloc_readonly structure address and all 12 bits must be 0 to apply perm bit PROT_READ and PROT_WRITE . I would also say it is also checking for even address

After that, there is a loop till the nmutexes which is 2 for our case, single-threaded. For single-threaded programs, it maintains two malloc dir_info pools as indicates from the variable nmutexes . One for MAP_CONCEALED memory (#0) and one for regular (#1). For multi-threaded porgram more pools are created. This is to avoid contention, accesses to diffrent pools can run concurently [ref. [1] related to multi-pools - mailing list [2] github diff for more pools ]

omalloc_poolinit(&d, MAP_CONCEAL); when if i == 0 for MAP_CONCEALED memory and else , omalloc_poolinit(&d, 0); for regular memory pools.

omalloc_poolinit()code given below:

473 static void
474 omalloc_poolinit(struct dir_info **dp, int mmap_flag)
475 {
476         char *p;
477         size_t d_avail, regioninfo_size;
478         struct dir_info *d;
479         int i, j;
480
481         /*
482          * Allocate dir_info with a guard page on either side. Also
483          * randomise offset inside the page at which the dir_info
484          * lies (subject to alignment by 1 << MALLOC_MINSHIFT)
485          */
486         if ((p = MMAPNONE(DIR_INFO_RSZ + (MALLOC_PAGESIZE * 2), mmap_flag)) ==
487             MAP_FAILED)
488                 wrterror(NULL, "malloc init mmap failed");
489         mprotect(p + MALLOC_PAGESIZE, DIR_INFO_RSZ, PROT_READ | PROT_WRITE);
490         d_avail = (DIR_INFO_RSZ - sizeof(*d)) >> MALLOC_MINSHIFT;
491         d = (struct dir_info *)(p + MALLOC_PAGESIZE +
492             (arc4random_uniform(d_avail) << MALLOC_MINSHIFT));
493
494         rbytes_init(d);
495         d->regions_free = d->regions_total = MALLOC_INITIAL_REGIONS;
496         regioninfo_size = d->regions_total * sizeof(struct region_info);
497         d->r = MMAP(regioninfo_size, mmap_flag);
498         if (d->r == MAP_FAILED) {
499                 d->regions_total = 0;
500                 wrterror(NULL, "malloc init mmap failed");
501         }
502         for (i = 0; i <= MALLOC_MAXSHIFT; i++) {
503                 LIST_INIT(&d->chunk_info_list[i]);
504                 for (j = 0; j < MALLOC_CHUNK_LISTS; j++)
505                         LIST_INIT(&d->chunk_dir[i][j]);
506         }
507         STATS_ADD(d->malloc_used, regioninfo_size + 3 * MALLOC_PAGESIZE);
508         d->mmap_flag = mmap_flag;
509         d->malloc_junk = mopts.def_malloc_junk;
510         d->malloc_cache = mopts.def_malloc_cache;
511         d->canary1 = mopts.malloc_canary ^ (u_int32_t)(uintptr_t)d;
512         d->canary2 = ~d->canary1;
513
514         *dp = d;
515 }

code snippet #06

omalloc_poolinit(struct dir_inf **dp, int mmap_flag) does the following things:

After some basic variable declarations, first it maps page into memory using mmap(2) in MMAPNONE(sz, f) where sz is the size and f is the flag

96 #define MMAPNONE(sz,f)  mmap(NULL, (sz), PROT_NONE, \
97     MAP_ANON | MAP_PRIVATE | (f), -1, 0)

MMAPNONE() macro defination

So, as from the above macro defination, we can see that the first parameter of mmap(2), that is, addr is NULL

The mmap() function causes the contents of fd, starting at offset, to be
     mapped in memory at the given addr.  The mapping will extend at least len
     bytes, subject to page alignment restrictions.

     The addr argument describes the address where the system should place the
     mapping.  If the MAP_FIXED flag is specified, the allocation will happen
     at the specified address, replacing any previously established mappings
     in its range.  Otherwise, the mapping will be placed at the available
     spot at addr; failing that it will be placed "close by".  If addr is NULL
     the system can pick any address.  Except for MAP_FIXED mappings, the
     system will never replace existing mappings.

     The len argument describes the minimum amount of bytes the mapping will
     span.  Since mmap() maps pages into memory, len may be rounded up to hit
     a page boundary.  If len is 0, the mapping will fail with EINVAL.

mmap(2) man page

So below are the figures that I got it from gdb during debugging

sz = DIR_INFO_RSZ + (MALLOC_PAGESIZE * 2)
where DIR_INFO_RSZ = ((sizeof(struct dir_info) + MALLOC_PAGEMASK) & ~MALLOC_PAGEMASK)
MALLOC_PAGEMASK = 4095 bytes
MALLOC_PAGESIZE = 4096 bytes
So, DIR_INFO_RSZ becomes (4284 + 4095) & ~4095 = 8192 bytes
and sz becomes 8192 + (4096 * 2) = 16384 bytes
and, f = MMAP_CONCEAL; which means it prevents from leaking the information and also it excludes the mapping from core dumps

page calculations

So, as from the above page mapping calculations, DIR_INFO_RSZ indicates that to store the dir_info structure we need 2 pages and allocate 4 pages PROT_NONE using MMAPNONE with flag as "MAP_CONCEAL", which can be seen in

Then, making two middle pages R/W which can be seen from the macro value of DIR_INFO_RSZ, 2 pages

then, calculate the d_avail to randomise offset inside the page at which the dir_info lies (subject to alignment by 1<<MALLOC_MINSHIFT)

d_avail = (8192 - 4824) >> 4 = 3368 >> 4 = 210

d_avail variable

the d_avail is 210, now calculate offset, d

d = (struct dir_info *)(p + MALLOC_PAGESIZE + (arc4random_uniform(d_avail) << MALLOC_MINSHIFT));

d offset calculation

So, as mentioned by the developer Otto@ in the mailing list that the dir_info structure ends up on an aligned address somewhere in the middle pages on an offset between 0 and (210<<4) = 0..3360, counting from the start of the two middle pages.So, it becomes like d = [p + MALLOC_PAGESIZE + (0..3360)], where (0..3360) is the offset value from 0 to 3360 max

As from the code, in my opinion, it looks like there is no guard page bydefault on the both the sides, however, it is there only one side but after discussing with the developer Otto@ . He said that dir_info is special and having a guard page on both sides for regular allocation can be done, but would waste more pages. He mentioned to note that allocations are already spread throughout the address space, so it is very likely that an allocation is surrounded by unmapped pages . So, finally he mentioned that there will be guard page on each side

Then it initializes random bytes using the rbytes_init(d) function. Code for the same give below:

303 static void
304 rbytes_init(struct dir_info *d)
305 {
306         arc4random_buf(d->rbytes, sizeof(d->rbytes));
307         /* add 1 to account for using d->rbytes[0] */
308         d->rbytesused = 1 + d->rbytes[0] % (sizeof(d->rbytes) / 2);
309 }

rbytes_init(struct dir_info *d);

It is clearly seen from the above code that it is filling the d->rbytes with the random bytes using the arc4random_buf() . Then, keeps the tracks/count of the rbytes used with d->rbytesused

After that as from the, it is initializing the default values for some of the structure members, d->regions_free = dir->regions_total = MALLOC_INTIAL_REGIONS; where the value of macro MALLOC_INITIAL_REGIONS is 512 . Also, calculates the total region_info size, regioninfo_size = d->regions_total * sizeof(struct region_info);

105 struct region_info {
106         void *p;                /* page; low bits used to mark chunks */
107         uintptr_t size;         /* size for pages, or chunk_info pointer */
108 #ifdef MALLOC_STATS
109         void *f;                /* where allocated from */
110 #endif
111 };

struct region_info

The structure struct region_info is used to keep track of mmap’ed regions by storing their address and size into a hash table as mentioned in the slides of Otto@

Then, it maps pages of size regioninfo_size for regions with the flag MAP_CONCEAL and assigns it to d->r and then checks if mapping failed

And, one more query that I had during understanding the source code of malloc(3) is about the code snippet given below [taken from the code snippet# 06]

...
...
...
502         for (i = 0; i <= MALLOC_MAXSHIFT; i++) {
503                 LIST_INIT(&d->chunk_info_list[i]);
504                 for (j = 0; j < MALLOC_CHUNK_LISTS; j++)
505                         LIST_INIT(&d->chunk_dir[i][j]);
506         }
...
...
...

code snippet - lists creation to allow randomization

What is the purpose of creating these many linked lists? So, after discussing with the Otto@ , I came to know that these many lists is used to allow more randomization. As Otto@ said "More than one list of free chunk pages per chunk size is maintained to allow for more randomization."

So, in short the above nested for loops will create 12 chunk_info_list where i is 0 to 11 and for each and every ith index there is j, so as per that,

chunk_info_list[0]
	chunk_dir[0][0]
	chunk_dir[0][1]
	chunk_dir[0][2]
	chunk_dir[0][3]
...
...
...
chunk_info_list[11]
	chunk_dir[11][0]
	chunk_dir[11][1]
	chunk_dir[11][2]
	chunk_dir[11][3]

...
...
...

list creation sample example to visualize

One can also allow more randomization by increasing the MALLOC_CHUNK_LISTS but at the cost of overhead. Increasing the MALLOC_MAXSHIFT is not possible because it is the shift of the max chunk size that fits in a page. We will see later how these lists helps to achieve randomization

Then, adds and updated the d->malloc_used for dumping the malloc stats information

Then, updated the default initialization of variables given below:
...
...
...
508         d->mmap_flag = mmap_flag;
509         d->malloc_junk = mopts.def_malloc_junk;
510         d->malloc_cache = mopts.def_malloc_cache;
511         d->canary1 = mopts.malloc_canary ^ (u_int32_t)(uintptr_t)d;
512         d->canary2 = ~d->canary1;
513
514         *dp = d;
515 }
	/* where it is already mentioned above code snippets that mopts.def_malloc_junk is 1 and
	 *  mopts.def_malloc_cache is 64, already initialized default values in omalloc_init() 
	 */

code snippet #07

Then, dir_info uses two canaries. There calculation is very easy. Whatever the address of d ( struct dir_info *d) just typecast it to (u_int32_t)(uintptr_t) and XOR it with mopts.malloc_canary and assign the result to d->canary1 and then compute the not (~) of d->canary1 then assign the result to d->canary2 or in simpler words, convert into binary then take each bit and flips them to their logical opposite. convert 1's to 0's and vice versa then convert it to hex and assign it to d->canary2

Finally assign the initialized dir_info structure d to *dp

We are still in function _malloc_init(int from_rthreads) , so, as per the, for i == 0 , we have completed the function omalloc_poolinit(&d, MAP_CONCEAL); , now after that, it assigns d->malloc_junk = 2; and d->malloc_cache = 0;

Then, it saves the index i to d->mutex and stored the initialized MAP_CONCEALED pool to mopts.malloc_pool[i] = d;

The value of nmutexes is 2, so, now loop will again do the whole same operations for index i == 1 or we can say for i != 0 .So, this time also it invokes omalloc_poolinit(&d, 0) but with no flag, remember, last time it was MAP_CONCEAL, this time, it is 0 . So, omalloc_poolinit(&d, 0) performs the same operations but only the diff is with no flag value or with flag value 0

After completion of function omalloc_poolinit(&d, 0) , it sets junk value to default junk value and same for malloc_cache, that is, d->malloc_junk = mopts.def_malloc_junk; and d->malloc_cache = mopts.def_malloc_cache; . Now, again for i = 1 it does the same thing, that is, saving the new index 1 to d->mutex and stores the pool to mopts.malloc_pool[i] = d

So, now mopts.malloc_pool has two pools, mopts.malloc_pool[0] and mopts.malloc_pool[1]

Next, it checks for multi-threading from variable from_rthreads , if the value is non-zero then it sets mopts.malloc_mt = 1 which shows that program is multi-threaded but for our case it is zero (0), so, it doesn't go in that code flow and then the else code flow executes and sets mopts.internal_funcs = 1 , from the structure malloc_readonly , it shows that setting internal_funcs means to use better function like recallocarray/freezero but I haven't seen them for our case, maybe they have used it for some other scenarios like in case of calloc, etc. recallocarray(3) - mailing list

Finally it sets the perms to readonly, to prevent further tampering with them, again it checks last 12 bits and if they are 0 then it sets perms to PROT_READ, and _malloc_init(0) completed

Now, we go back to the PROGLOGUE macro code in the, I have pasted the code again:

1256 #define PROLOGUE(p, fn)                 \
1257         d = (p);                        \
1258         if (d == NULL) {                \
1259                 _malloc_init(0);        \
1260                 d = (p);                \
1261         }                               \
1262         _MALLOC_LOCK(d->mutex);         \
1263         d->func = fn;                   \
1264         if (d->active++) {              \
1265                 malloc_recurse(d);      \
1266                 return NULL;            \
1267         }                               \

revisited PROLOGUE macro code

So, as from the PROLOGUE code, we can see that we have completed the malloc_init(0) .Now, if we remember correctly, here p refers to the function getpool() , calls getpool() function again and this time also due to single-threaded program, it returns mopts.malloc_pool[1] , which has the regular pool, not MAP_CONCEALED one, return the same to d then malloc_lock and assigns fn to d->fn , here fn means func string "malloc". It checks and incremented the d->active and doesn't go inside the if code-flow due to 0 value of d->active

As one can see from the, I have pasted it again below

1277 void *
1278 malloc(size_t size)
1279 {
1280         void *r;
1281         struct dir_info *d;
1282         int saved_errno = errno;
1283
1284         PROLOGUE(getpool(), "malloc")
1285         r = omalloc(d, size, 0, CALLER);
1286         EPILOGUE()
1287         return r;
1288 }

revisited malloc code

From line no. 1285, we can see that after PROLOGUE it calls omalloc(d, size, CALLER) . Code for omalloc(d, size, CALLER) given below:

1126	static void *
1127	omalloc(struct dir_info *pool, size_t sz, int zero_fill, void *f)
1128	{
1129		void *p;
1130		size_t psz;
1131
1132		if (sz > MALLOC_MAXCHUNK) {
1133			if (sz >= SIZE_MAX - mopts.malloc_guard - MALLOC_PAGESIZE) {
1134				errno = ENOMEM;
1135				return NULL;
1136			}
1137			sz += mopts.malloc_guard;
1138			psz = PAGEROUND(sz);
1139			p = map(pool, NULL, psz, zero_fill);
1140			if (p == MAP_FAILED) {
1141				errno = ENOMEM;
1142				return NULL;
1143			}
1144			if (insert(pool, p, sz, f)) {
1145				unmap(pool, p, psz, 0, 0);
1146				errno = ENOMEM;
1147				return NULL;
1148			}
1149			if (mopts.malloc_guard) {
1150				if (mprotect((char *)p + psz - mopts.malloc_guard,
1151				    mopts.malloc_guard, PROT_NONE))
1152					wrterror(pool, "mprotect");
1153				STATS_ADD(pool->malloc_guarded, mopts.malloc_guard);
1154			}
1155
1156			if (MALLOC_MOVE_COND(sz)) {
1157				/* fill whole allocation */
1158				if (pool->malloc_junk == 2)
1159					memset(p, SOME_JUNK, psz - mopts.malloc_guard);
1160				/* shift towards the end */
1161				p = MALLOC_MOVE(p, sz);
1162				/* fill zeros if needed and overwritten above */
1163				if (zero_fill && pool->malloc_junk == 2)
1164					memset(p, 0, sz - mopts.malloc_guard);
1165			} else {
1166				if (pool->malloc_junk == 2) {
1167					if (zero_fill)
1168						memset((char *)p + sz - mopts.malloc_guard,
1169						    SOME_JUNK, psz - sz);
1170					else
1171						memset(p, SOME_JUNK,
1172						    psz - mopts.malloc_guard);
1173				} else if (mopts.chunk_canaries)
1174					fill_canary(p, sz - mopts.malloc_guard,
1175					    psz - mopts.malloc_guard);
1176			}
1177
1178		} else {
1179			/* takes care of SOME_JUNK */
1180			p = malloc_bytes(pool, sz, f);
1181			if (zero_fill && p != NULL && sz > 0)
1182				memset(p, 0, sz);
1183		}
1184
1185		return p;
1186	}

code snippet #08

omalloc(struct dir_info *pool, size_t sz, int zero_fill, void *f); does the following things...

There are two code flow paths, 1st , if ( sz > MALLOC_MAXCHUNK) . I have used size 8 in the sample code, so it won't go to this code flow path, but still I am going to explain but not in-depth. So, first it checks if the size is greater than the MALLOC_MAXCHUNK , which is 2048. If size is less, then it uses 2nd code flow, that is, it calls malloc_bytes(pool, sz, f); which we will see later, for now I am going to cover the case when size of greater, in the following points

After checking the size with MALLOC_MAXCHUNK then it checks if the requested size is bigger than SIZE_MAX (excluding mopts.malloc_guard and MALLOC_PAGESIZE), if yes, then it sets errno to ENOMEM

Then, it adds the malloc_guard value to requested size, then it rounds the page to multiple of pagesize, that is, if requested size is<= 4096 then it rounds it to 4096 but if requested size is 4097 then it rounds it to 8192

Then, it maps the page to size psz with hint address is equals to NULL then check if it fails

Then, it uses insert(pool, p, sz, f); function call to keep track of mmap’ed regions by storing their address and size into a hash table, we will se more in-depth code flow later, if it fails to insert then it unmaps the page and return ENOMEM

Then, it checks for malloc_guard page option, if it sets then it adds a guarded page as per the size mentioned by malloc_guard variable with PROT_NONE perms bit and then it adds the information for stats

Then, it checks for MALLOC_MOVE_COND , if the condition is true then it shifts the allocations towards the end, as mentioned in the source code comment,

70 /*
71  * We move allocations between half a page and a whole page towards the end,
72  * subject to alignment constraints. This is the extra headroom we allow.
73  * Set to zero to be the most strict.
74  */
75 #define MALLOC_LEEWAY           0
76 #define MALLOC_MOVE_COND(sz)    ((sz) - mopts.malloc_guard <            \
77                                     MALLOC_PAGESIZE - MALLOC_LEEWAY)
78 #define MALLOC_MOVE(p, sz)      (((char *)(p)) +                        \
79                                     ((MALLOC_PAGESIZE - MALLOC_LEEWAY - \
80                                     ((sz) - mopts.malloc_guard)) &      \
81                                     ~(MALLOC_MINSIZE - 1)))

macro definitions for moving allocations

After checking for condition, first, it checks for junking value, that is, pool->malloc_junk == 2 if it equals then it fills the mmap'ed page p with SOME_JUNK which has bits 0xdb , then as previously mentioned that it shifts the allocation towards end with macro MALLOC_MOVE(p, sz) , then later on it checks if zero-filling requires, if it requires then it fills the same JUNKed filled page with zeroes

But, if the MALLOC_MOVE_COND , condition is not true then it simple checks for junking value, that is, pool->malloc_junk == 2 , if it equals to 2 then it checks for zero filling and if it requires then it fills from the address as mentioned in the code snippet given below if { ... }, and if zero filling not requires then it simply fills with SOME_JUNK as mentioned in the else { ... } code flow

...
		...
		...
1166				if (pool->malloc_junk == 2) {
1167					if (zero_fill)
1168						memset((char *)p + sz - mopts.malloc_guard,
1169						    SOME_JUNK, psz - sz);
1170					else
1171						memset(p, SOME_JUNK,
1172						    psz - mopts.malloc_guard);

junking checks

So, if the junking value is equals to 2, that is, pool->malloc_junk == 2 , this code flow is covered in the previous point but if it fails or not equals to 2 then it checks for canaries, that is, else if (mopts.chunk_canaries) then it calls the fill_canary function to filling the canaries

1st code flow completed, now 2nd code flow , that is, if (sz<MALLOC_MAXCHUNK) means if requested size is less than the MALLOC_MAXCHUNK, previously in 1st code flow it was greater than the MALLOC_MAXCHUNK

It calls the function malloc_bytes(pool, sz, f); and the source code for the same given below

948	/*
949	 * Allocate a chunk
950	 */
951	static void *
952	malloc_bytes(struct dir_info *d, size_t size, void *f)
953	{
954		u_int i, r;
955		int j, listnum;
956		size_t k;
957		u_short	*lp;
958		struct chunk_info *bp;
959		void *p;
960	
961		if (mopts.malloc_canary != (d->canary1 ^ (u_int32_t)(uintptr_t)d) ||
962		    d->canary1 != ~d->canary2)
963			wrterror(d, "internal struct corrupt");
964	
965		j = find_chunksize(size);
966	
967		r = ((u_int)getrbyte(d) << 8) | getrbyte(d);
968		listnum = r % MALLOC_CHUNK_LISTS;
969		/* If it's empty, make a page more of that size chunks */
970		if ((bp = LIST_FIRST(&d->chunk_dir[j][listnum])) == NULL) {
971			bp = omalloc_make_chunks(d, j, listnum);
972			if (bp == NULL)
973				return NULL;
974		}
975	
976		if (bp->canary != (u_short)d->canary1)
977			wrterror(d, "chunk info corrupted");
978	
979		i = (r / MALLOC_CHUNK_LISTS) & (bp->total - 1);
980	
981		/* start somewhere in a short */
982		lp = &bp->bits[i / MALLOC_BITS];
983		if (*lp) {
984			j = i % MALLOC_BITS;
985			k = ffs(*lp >> j);
986			if (k != 0) {
987				k += j - 1;
988				goto found;
989			}
990		}
991		/* no bit halfway, go to next full short */
992		i /= MALLOC_BITS;
993		for (;;) {
994			if (++i >= bp->total / MALLOC_BITS)
995				i = 0;
996			lp = &bp->bits[i];
997			if (*lp) {
998				k = ffs(*lp) - 1;
999				break;
1000			}
1001		}
1002	found:
1003	#ifdef MALLOC_STATS
1004		if (i == 0 && k == 0) {
1005			struct region_info *r = find(d, bp->page);
1006			r->f = f;
1007		}
1008	#endif
1009	
1010		*lp ^= 1 << k;
1011	
1012		/* If there are no more free, remove from free-list */
1013		if (--bp->free == 0)
1014			LIST_REMOVE(bp, entries);
1015	
1016		/* Adjust to the real offset of that chunk */
1017		k += (lp - bp->bits) * MALLOC_BITS;
1018	
1019		if (mopts.chunk_canaries && size > 0)
1020			bp->bits[bp->offset + k] = size;
1021	
1022		k <<= bp->shift;
1023	
1024		p = (char *)bp->page + k;
1025		if (bp->size > 0) {
1026			if (d->malloc_junk == 2)
1027				memset(p, SOME_JUNK, bp->size);
1028			else if (mopts.chunk_canaries)
1029				fill_canary(p, size, bp->size);
1030		}
1031		return p;
1032	}

code snippet #09

malloc_bytes(pool, sz, f); is used to allocate a chunk. After some basic variable declaration, it checks for internal struct corruption with canaries of d and mopts.malloc_canary as mentioned and calculated in the

Then, it finds the chunk size with the function find_chunksize(size); . Code given below

919 static int
 920 find_chunksize(size_t size)
 921 {
 922         int r;
 923
 924         /* malloc(0) is special */
 925         if (size == 0)
 926                 return 0;
 927
 928         if (size < MALLOC_MINSIZE)
 929                 size = MALLOC_MINSIZE;
 930         size--;
 931
 932         r = MALLOC_MINSHIFT;
 933         while (size >> r)
 934                 r++;
 935         return r;
 936 }

code snippet #10

As from the, first it will check if given size is 0 then it returns 0 , then it checks if size is less than MALLOC_MINSIZE then it sets size to MALLOC_MINSIZE, else, it decrements the size and sets r = MALLOC_MINSHIFT; and the value of MALLOC_MINSHIFT is 4 . Then it right shifts the size by r and it keeps incrementing the r till right shifting results becomes zero, then returns r . In our case, the size is 8, which is less than MALLOC_MINSIZE and then due to it is less than MALLOC_MISIZE, it sets the size to MALLOC_MINSIZE, which is 16, then it decrements and it becomes 15 and 15 >> 4 , r is MALLOC_MINSHIFT, which is 4. So, output of 15 >>4 is 0. So, it returns r, which is 4

As we can see from the, after find_chunksize(size), it calls ((u_int)getrbyte(d)<<8) | getrbyte(d); , so, getrbyte(d) given below:

311 static inline u_char
 312 getrbyte(struct dir_info *d)
 313 {
 314         u_char x;
 315 
 316         if (d->rbytesused >= sizeof(d->rbytes))
 317                 rbytes_init(d);
 318         x = d->rbytes[d->rbytesused++];
 319         return x;
 320 }

code snippet #11

So, in the above code, first it checks whether the rbytesused is more than the sizeof(rbytes), if yes, then it initializes the random bytes using rbytes_init(d) and then assigns the value of d->rbytes[d->rbytesused] to x and also increments the d->rbytesused count and returns the same. But if the condition in line no. 316 fails then it will directly assigns the rbytesused count to x and increment the same and return the same x

So as per the name indicates, I think it as getrbyte() means get-random-byte(), basically the whole line r = ((u_int)getrbyte(d)<<8) | getrbyte(d); does the following stuff...

get-random-byte then left shits by 8 then OR'ed with again some random byte
		let's suppose,
		1) a =  get-random-bytes
		2) left shift a by 8 
		3) b = get-random-bytes
		4) r = a | b
		for example, 0xd4 -> 0xd400 -> 0xd4cb, so r = 0xd4cb

explanation snippet

Then it finds the listnum, listnum = r % MALLOC_CHUNK_LISTS; , where MALLOC_CHUNK_LISTS is 4

As from the[code snippet - lists creation to allow randmization], we have seen that there is nested for loops which creates lists, helpful to allow randomization. So, here we can see that it randomly chooses the created lists for allocation of chunk_info

Then, it chooses some random list and makes it first list, now if the list is empty, that is, head is NULL then it calls function omalloc_make_chunks(d, j, listnum); to allocate page of chunks. Code for omalloc_make_chunks given below

885 /*
 886  * Allocate a page of chunks
 887  */
 888 static struct chunk_info *
 889 omalloc_make_chunks(struct dir_info *d, int bits, int listnum)
 890 {
 891         struct chunk_info *bp;
 892         void *pp;
 893
 894         /* Allocate a new bucket */
 895         pp = map(d, NULL, MALLOC_PAGESIZE, 0);
 896         if (pp == MAP_FAILED)
 897                 return NULL;
 898
 899         /* memory protect the page allocated in the malloc(0) case */
 900         if (bits == 0 && mprotect(pp, MALLOC_PAGESIZE, PROT_NONE) == -1)
 901                 goto err;
 902
 903         bp = alloc_chunk_info(d, bits);
 904         if (bp == NULL)
 905                 goto err;
 906         bp->page = pp;
 907
 908         if (insert(d, (void *)((uintptr_t)pp | (bits + 1)), (uintptr_t)bp,
 909             NULL))
 910                 goto err;
 911         LIST_INSERT_HEAD(&d->chunk_dir[bits][listnum], bp, entries);
 912         return bp;
 913
 914 err:
 915         unmap(d, pp, MALLOC_PAGESIZE, 0, d->malloc_junk);
 916         return NULL;
 917 }

code snippet #12

The above code is used to allocate a page of chunks, which is already mentioned in the comments

First, it maps/allocates page up to len MALLOC_PAGESIZE with hint addr as NULL using map(d, NULL, MALLOC_PAGESIZE, 0); , then it checks for its failure and returns NULL if fails

Code for map given below

751 static void *
752 map(struct dir_info *d, void *hint, size_t sz, int zero_fill)
753 {
754	size_t psz = sz >> MALLOC_PAGESHIFT;
755	struct region_info *r, *big = NULL;
756	u_int i;
757	void *p;
758 
759	if (mopts.malloc_canary != (d->canary1 ^ (u_int32_t)(uintptr_t)d) ||
760	    d->canary1 != ~d->canary2)
761		wrterror(d, "internal struct corrupt");
762	if (sz != PAGEROUND(sz))
763		wrterror(d, "map round");
764 
765	if (hint == NULL && psz > d->free_regions_size) {
766		_MALLOC_LEAVE(d);
767		p = MMAP(sz, d->mmap_flag);
768		_MALLOC_ENTER(d);
769		if (p != MAP_FAILED)
770			STATS_ADD(d->malloc_used, sz);
771		/* zero fill not needed */
772		return p;
773	}
774	for (i = 0; i < d->malloc_cache; i++) {
775		r = &d->free_regions[(i + d->rotor) & (d->malloc_cache - 1)];
776		if (r->p != NULL) {
777			if (hint != NULL && r->p != hint)
778				continue;
779			if (r->size == psz) {
780				p = r->p;
781				r->p = NULL;
782				d->free_regions_size -= psz;
783				if (mopts.malloc_freeunmap)
784					mprotect(p, sz, PROT_READ | PROT_WRITE);
785				if (zero_fill)
786					memset(p, 0, sz);
787				else if (d->malloc_junk == 2 &&
788				    mopts.malloc_freeunmap)
789					memset(p, SOME_FREEJUNK, sz);
790				d->rotor += i + 1;
791				return p;
792			} else if (r->size > psz)
793				big = r;
794		}
795	}
796	if (big != NULL) {
797		r = big;
798		p = r->p;
799		r->p = (char *)r->p + (psz << MALLOC_PAGESHIFT);
800		if (mopts.malloc_freeunmap)
801			mprotect(p, sz, PROT_READ | PROT_WRITE);
802		r->size -= psz;
803		d->free_regions_size -= psz;
804		if (zero_fill)
805			memset(p, 0, sz);
806		else if (d->malloc_junk == 2 && mopts.malloc_freeunmap)
807			memset(p, SOME_FREEJUNK, sz);
808		return p;
809	}
810	if (hint != NULL)
811		return MAP_FAILED;
812	if (d->free_regions_size > d->malloc_cache)
813		wrterror(d, "malloc cache");
814	_MALLOC_LEAVE(d);
815	p = MMAP(sz, d->mmap_flag);
816	_MALLOC_ENTER(d);
817	if (p != MAP_FAILED)
818		STATS_ADD(d->malloc_used, sz);
819	/* zero fill not needed */
820	return p;
821 }

code for map(struct dir_info *d, void *hint, size_t sz, int zero_fill)

From the, we can see that the parameters for the map() function given below:

calls map(d, NULL, MALLOC_PAGESIZE, 0);
		After corelating the map() calling from <a href="#code_snippet_12">[code snippet #12]</a> and <a href="#code_map">map() function defination</a>, 
		→ first parameter, d belongs to struct dir_info *d
		→ second parameter, NULL belongs to *hint
		→ third parameter, MALLOC_PAGESIZE belongs to sz
		→ fourth parameter, 0 indicates zero_filling requirements

corelation of map() paramters

First, map() assigns the right shifted value of sz by MALLOC_PAGESHIFT bytes, which is 4096 >> 12 => 0x1. So, psz = 0x1. Then, it defines some pointer objects for the structure region_info and some other pointer variables, later, it checks for internal struct corruption through canaries. Then, it verifies the roundness of size sz. if it passes, then it checks whether the provided hint is NULL and psz is greater than free_pages_cache. So, for our case, the d->free_regions_size is 0x0 and psz is 0x1, so, the condition satisfies and it comes under if{...} code flow, where if{...} code flow first does MALLOC_LEAVE, which means it checks if it is multi-threaded program then decrement the d->active count and unlock malloc and returns but if it is single-threaded, like for our case, then it simply returns without doing anything

Then, it maps pages with provided flag mmap_flag and size sz. mmap_flag is 0x0, as if we remember correctly then second time getpool() returns d with mmap_flag is 0x0. Then, it does MALLOC_ENTER which is the opposite of MALLOC_LEAVE. It means check if the program is multi-threaded then lock malloc and increment the d->active count but if the program is single-threaded then it does nothing and simply returns. Then , it checks whether p is failed. If not then it adds stats information, that is, increments the d->malloc_used to sz, that is, d->malloc_used += sz then it returns

For our case, map() completed, but other code flows may be explained later in the upcoming malloc series, as the blog has already become lengthy

Then, coming back to, if the bits is 0 means for malloc(0) case, it memory protects the allocated page with PROT_NONE but for our case, bits is 4 and we have used malloc(8) Then, it allocates the chunk info by calling the function alloc_chunk_info(d, bits); and code for the same given below:
847 static struct chunk_info *
 848 alloc_chunk_info(struct dir_info *d, int bits)
 849 {
 850         struct chunk_info *p;
 851
 852         if (LIST_EMPTY(&d->chunk_info_list[bits])) {
 853                 size_t size, count, i;
 854                 char *q;
 855
 856                 if (bits == 0)
 857                         count = MALLOC_PAGESIZE / MALLOC_MINSIZE;
 858                 else
 859                         count = MALLOC_PAGESIZE >> bits;
 860
 861                 size = howmany(count, MALLOC_BITS);
 862                 size = sizeof(struct chunk_info) + (size - 1) * sizeof(u_short);
 863                 if (mopts.chunk_canaries)
 864                         size += count * sizeof(u_short);
 865                 size = _ALIGN(size);
 866
 867                 q = MMAP(MALLOC_PAGESIZE, d->mmap_flag);
 868                 if (q == MAP_FAILED)
 869                         return NULL;
 870                 STATS_ADD(d->malloc_used, MALLOC_PAGESIZE);
 871                 count = MALLOC_PAGESIZE / size;
 872
 873                 for (i = 0; i < count; i++, q += size) {
 874                         p = (struct chunk_info *)q;
 875                         LIST_INSERT_HEAD(&d->chunk_info_list[bits], p, entries);
 876                 }
 877         }
 878         p = LIST_FIRST(&d->chunk_info_list[bits]);
 879         LIST_REMOVE(p, entries);
 880         if (p->shift == 0)
 881                 init_chunk_info(d, p, bits);
 882         return p;
 883 }

code snippet #13

First it checks if the list is empty where bits is 4 for our case, and if it is, then, it checks if bits is 0, if yes, then it calculates the count by dividing the MALLOC_PAGESIZE with MALLOC_MINSIZE but in our case bits is equals to 4, so, it calculates count for all cases when bit != 0 by MALLOC_PAGESIZE >> bits , which comes as 256 for our case

Then, howmany(x, y) is ((x) + ((y) - 1) / y), so as per that, x = count and y MALLOC_BITS, which is 256 and 16 respectively for our case and the result is 16

Then, it calculates size from adding the sizeof(struct chunk_info) + (size - 1) * sizeof(u_short), for our case, it came as 40 + (16 - 1) * 2 = 70 and then it checks for mopt.chunk_canaries and if it is there then it calculates size += count * sizeof(u_short); so, I have compiled the sample code with malloc option C , so, chunk_canaries is there, for our case size became, size = 70 + 256 * 2 = 582 then it aligns the size to _ALIGN(size), which make it from 582 to 584

  • In simpler words, the above point explains the size calculations for a chunk. It includes the size = [size of structure chunk_info] + [ size of bitmap] and as we already know that the bits will be different for different chunk size, so, the size of bitmap varies as per the different bits. It also adds the size of canary, if they are enable

Then, it maps page using mmap(2) upto MALLOC_PAGESIZE with mmap_flag value as 0 due to the mopts.malloc_pool[1], mopts.malloc_pool[0] has MAP_CONCEALed memory.Then, it adds information about malloc usage count from adding MALLOC_PAGESIZE to variable d->malloc_used. Then, it calculates count from count = MALLOC_PAGESIZE / size where it is 4096 / 584 = 7, for our case

Then, it creates count number of lists for chunk_info. For our case, count is 7 so, it creates 7 lists with the list head updated with p. Then, it chooses first list or in other words we can say it chooses the last inserted head list then removes it from the lists. Then it checks for p->shift and here shift means how may shifts it will take to get the p->size, if it is zero (0) then it calls function init_chunk_info(d, p, bits); then after the initializaton it returns p . Code for the same given below:

823 static void
 824 init_chunk_info(struct dir_info *d, struct chunk_info *p, int bits)
 825 {
 826         int i;
 827
 828         if (bits == 0) {
 829                 p->shift = MALLOC_MINSHIFT;
 830                 p->total = p->free = MALLOC_PAGESIZE >> p->shift;
 831                 p->size = 0;
 832                 p->offset = 0xdead;
 833         } else {
 834                 p->shift = bits;
 835                 p->total = p->free = MALLOC_PAGESIZE >> p->shift;
 836                 p->size = 1U << bits;
 837                 p->offset = howmany(p->total, MALLOC_BITS);
 838         }
 839         p->canary = (u_short)d->canary1;
 840
 841         /* set all valid bits in the bitmap */
 842         i = p->total - 1;
 843         memset(p->bits, 0xff, sizeof(p->bits[0]) * (i / MALLOC_BITS));
 844         p->bits[i / MALLOC_BITS] = (2U << (i % MALLOC_BITS)) - 1;
 845 }

code snippet #14

From the above code snippet, it is used to initialize the chunk information. There it is checking for whether the value of bits is 0 or not 0. If 0 then it initializes some default value as mentioned in the above code. And, it seems that bit == 0, is the case of malloc(0). But, if it is not 0 then it initializes some other values as mentioned in the above code. Then, it assigns dir_info canary1 to chunk canary with typecasts in u_short, that is p->canary = (u_short)d->canary1;

Then, set all the valid bits in the bitmap. For our case, i = p->total - 1; , where p->total = 256 and i = 255 then it copies the 0xff to p->bits upto sizeof(p->bits[0]) * (i / MALLOC_BITS)) - 1 and it came 30 for our case.

Then it assigns some value to p->bits, where a bitmap specifing which chunks are still free to use, for better explanation on p->bits , please read the text mentioned, for our case, i = 255, MALLOC_BITS = 16, so, it became,

p->bits[255 / 16] = (2U << ( 255 % 16)) - 1
		then, p->bits[15] = (2U << ( 15 )) - 1
		then, p->bits[15] = 65536 - 1
		then, p->bits[15] = 65535

bitmap calculations explanation

Now, the chunk initializaton is completed. As from the, we can see that after the function init_chunk_info(d, p, bits); , it returns the initialized chunk p

Now, after the completion of both the functions init_chunk_info() and alloc_chunk_info() , it returns the allocated and initialized chunk to the function omalloc_make_chunks() , so, now we will proceed the same but in reverse direction, just like how stack works, nestedly creating of stacks from 0..n now going back from n..0, in other words, simply we can say now, we proceed by returning from all the functions that we came from. Returning the new initialized and allocated chunk p to bp , which we can see from the. So, after line no. 903 it checks bp for NULL and it bp is not NULL then it assigns the new page that it creates on line no. 895 , that is, bp->page = pp; . Then, it uses insert(d, (void *)((uintptr_t)pp | (bits + 1)), (uintptr_t)bp, NULL) and the code for insert(struct dir_info *d, void *p, size_t sz, void *f) given below:

564 /*
565  * The hashtable uses the assumption that p is never NULL. This holds since
566  * non-MAP_FIXED mappings with hint 0 start at BRKSIZ.
567  */
568 static int
569 insert(struct dir_info *d, void *p, size_t sz, void *f)
570 {
571         size_t index;
572         size_t mask;
573         void *q;
574
575         if (d->regions_free * 4 < d->regions_total) {
576                 if (omalloc_grow(d))
577                         return 1;
578         }
579         mask = d->regions_total - 1;
580         index = hash(p) & mask;
581         q = d->r[index].p;
582         STATS_INC(d->inserts);
583         while (q != NULL) {
584                 index = (index - 1) & mask;
585                 q = d->r[index].p;
586                 STATS_INC(d->insert_collisions);
587         }
588         d->r[index].p = p;
589         d->r[index].size = sz;
590 #ifdef MALLOC_STATS
591         d->r[index].f = f;
592 #endif
593         d->regions_free--;
594         return 0;
595 }

code snippet #15

As we have already seen above aboutthat this function uses to keep track of mmap’ed regions by storing their address and size into a hash table, in-depth explanation of the code given below

From the, we can see that initially it checks for whether the slots is too filled, which means more than 75% (>75%), if yes, then it calls omalloc_grow(d); and returns 1. If no, means it is less 75% (<75%) then it decrements d->regions_total by 1 and assign it to mask variable. Then it calculates, index, through anding (&) a hash(p) function and mask . hash(p) function given below:

254 static inline size_t
 255 hash(void *p)
 256 {
 257         size_t sum;
 258         uintptr_t u;
 259
 260         u = (uintptr_t)p >> MALLOC_PAGESHIFT;
 261         sum = u;
 262         sum = (sum << 7) - sum + (u >> 16);
 263 #ifdef __LP64__
 264         sum = (sum << 7) - sum + (u >> 32);
 265         sum = (sum << 7) - sum + (u >> 48);
 266 #endif
 267         return sum;
 268 }

hash(p) function

hash(p) does some shifting calculations as one can see from the above mentioned code. Then, it returns that sum and '&' with mask that is calculated previously. Here, d->r refers to the structure struct region_info , which has address and size to maintain mmap'ed regions into hash table. After that it assigns the already existing address to q, that is, q = d->r[index].p . If q is not NULL then it again recalulcates index and then finds the q and increment the collisions information for stats. if q is NULL, then it simply stores the new addr, p, to struct region_info, same for size, then it decrements the free slots count

Then, as from the, it inserts list head LIST_INSERT_HEAD(&d->chunk_dir[bits][listnum], bp, entries); , where bits is 4, for our case and listnum is random for each invocaton then returns the associated meta-info, bp with the allocated-initialized chunk information and page of chunks, bp->page

.

In my understanding as from the, the bp seems to be the allocated chunks in a page, and pp seems to be the bucket of chunks, or in other words, a page of chunks as we can see from theon line no. 895

Now, coming back to the main code, that is, malloc_bytes in. Then, it checks for chunk canary corruption on line no. 976, if(bp->canary != (u_short)d->canary1) , if not equal then calls wrterror() with the string "chunk info corrupted" and abort()

Then, it calculates, i , i = (r / MALLOC_CHUNK_LISTS) & (bp->total - 1); , where, r is the random byte, MALLOC_CHUNK_LISTS is 4 and bp->total is the total no. of chunks. Then, after calculating the i it calculates the index of the chunk which is free, as we know that bp->bits is responsible for tracking the free chunks to use

Also, in support of the code, I have found one of the interesting old paper to provided some details about the phkmalloc internas. So, as per the paper it is mentioned that

struct pginfo {
    struct pginfo       *next;  /* next on the free list */
    void                *page;  /* Pointer to the page */
    u_short             size;   /* size of this page's chunks */
    u_short             shift;  /* How far to shift for this size chunks */
    u_short             free;   /* How many free chunks */
    u_short             total;  /* How many chunk */
    u_int               bits[1]; /* Which chunks are free */
};

The next field is a pointer to the next structure in the list, or 0 if the
element is the last of the list.

The page field points to the beginning of the page (it is always a multiple of
malloc_pagesize).

The size field is set to the size in bytes of the chunks contained in this page.

The free field is the number of free chunks in the page.

The total field is the sum of the number of free chunks in the page and of the
number of allocated chunks in the page. The page is freed (ie. returned to the
lower layer) whenever ( free == total ) becomes true.

<b>At last, the bits field is a variable size array indicating which chunks in the
page are free chunks. Each chunk is associated with a unique bit in the field,
namely the bit obtained when applying the ( 1 << n ) mask to bits[i] where:

  ( i * MALLOC_BITS ) + n = ( chunk_address & malloc_pagemask ) / chunk_size

MALLOC_BITS is the number of bits a u_int has on the specific architecture phk
malloc is running on. It is defined as follows:

#define MALLOC_BITS     ((int)(8*sizeof(u_int)))

This bit is set to one when the chunk is free, and to 0 when the chunk is in
use. Of course, the effective size of the bits field will depend on the number
of chunks in the page.</b>

some common details from phkmalloc and omalloc

Now, the calculations given below computes the offset of the chunk in the page

...
...
...
981		/* start somewhere in a short */
982		lp = &bp->bits[i / MALLOC_BITS];
983		if (*lp) {
984			j = i % MALLOC_BITS;
985			k = ffs(*lp >> j);
986			if (k != 0) {
987				k += j - 1;
988				goto found;
989			}
990		}
991		/* no bit halfway, go to next full short */
992		i /= MALLOC_BITS;
993		for (;;) {
994			if (++i >= bp->total / MALLOC_BITS)
995				i = 0;
996			lp = &bp->bits[i];
997			if (*lp) {
998				k = ffs(*lp) - 1;
999				break;
1000			}
1001		}
1002	found:
1003	#ifdef MALLOC_STATS
1004		if (i == 0 && k == 0) {
1005			struct region_info *r = find(d, bp->page);
1006			r->f = f;
1007		}
1008	#endif
1009	
1010		*lp ^= 1 << k;
1011	
1012		/* If there are no more free, remove from free-list */
1013		if (--bp->free == 0)
1014			LIST_REMOVE(bp, entries);
1015	
1016		/* Adjust to the real offset of that chunk */
1017		k += (lp - bp->bits) * MALLOC_BITS;
1018	
1019		if (mopts.chunk_canaries && size > 0)
1020			bp->bits[bp->offset + k] = size;
1021	
1022		k <<= bp->shift;
1023	
1024		p = (char *)bp->page + k;
1025		if (bp->size > 0) {
1026			if (d->malloc_junk == 2)
1027				memset(p, SOME_JUNK, bp->size);
1028			else if (mopts.chunk_canaries)
1029				fill_canary(p, size, bp->size);
1030		}
1031		return p;
1032	}

code for offset computation in the page

It assigns the address of bp->bits[13] to lp, for our case, i = 0xd3 (or 211) and MALLOC_BITS is 16. then , if *lp is not null then it assigns i % MALLOC_BITS to j, which is j = 3, then, it goes to find the first set bit and assign it to k, k = ffs(*lp >> j); , code given below for demostration

1 /* ffs -- Find the first bit set in the parameter
 2
 3 @deftypefn Supplemental int ffs (int @var{valu})
 4
 5 Find the first (least significant) bit set in @var{valu}.  Bits are
 6 numbered from right to left, starting with bit 1 (corresponding to the
 7 value 1).  If @var{valu} is zero, zero is returned.
 8
 9 @end deftypefn
10
11 */
12
13 int
14 ffs (register int valu)
15 {
16   register int bit;
17
18   if (valu == 0)
19     return 0;
20
21   for (bit = 1; !(valu & 1); bit++)
22         valu >>= 1;
23
24   return bit;
25 }

ffs(register int valu)

For our case, *lp is 0xffff and j is 3, so *lp >> j is 8191 and ffs(8191) is 1 and return to k. And, if k != 0 then k = k + j - 1 => k = 1 + 3 - 1 => k = 3, then goto found

*lp = *lp ^ 1<<k, then 0xffff ^ 1<<0x3 => *lp = 0xfff7. Here, it sets the bits of the bitmap. Then, if bp->free is 0 then remove lists bp and decrement the bp->free value, if not 0 then also decrement the bp->free and k = k + (lp - bp->bits) * MALLOC_BITS; => k = 0xd3 (or 211). Then, check if mopts.chunk_canaries is enable and size > 0, if yes, then bp->bits[bp->offset + k] = size => bp->bits[16 + 211] = 0x8

Then, k<<= bp->shift. It means k = k<<bp->shift => k = 0xd3 (or 211) * 16 => k = 0xd30 (or 3376). Then, calculate offset in the page, p = (char *)bp->page + k; , for our case, bp->page is 0xcde2d918000 and k is 0xd30 => p = 0xcde2d918d30. Then it checks if the bp->size > 0, if yes, then check for more junking through, d->malloc_junk == 2 , if more junking set, then it copies SOME_JUNK to p till the bp->size, but if no more_junking option set then it checks for mopt.chunk_canaries, and if canary is enable then, fill_canary functions calls with p, size and bp->size

Here, in short, information about the offset calculations

1. bp->page or pp is the page of chunks and bp is the associated meta-info
		2. k is the chunk number, before shiting, which means k is the index for the chunk in the page bp->page
		3. After shifting, k becomes the byte offset of the chunk inside the bp->page page

inshort summary of the page calculations

Code for fill_canary() function given below

938 static void
939 fill_canary(char *ptr, size_t sz, size_t allocated)
940 {
941         size_t check_sz = allocated - sz;
942
943         if (check_sz > CHUNK_CHECK_LENGTH)
944                 check_sz = CHUNK_CHECK_LENGTH;
945         memset(ptr + sz, SOME_JUNK, check_sz);
946 }

fill_canary - canary filling code

For our case, check_sz = 16 - 8, then, check_sz = 8, then check for maximum chunk length and if greater than CHUNK_CHECK_LENGTH then sets the same. Then, copy the SOME_JUNK to ptr + sz till size check_sz , for our case, ptr + sz = 0xcde2d918d38, where ptr=0xcde2d918d30 and sz=0x8 copies SOME_JUNK till check_sz which has 0x10 length. Now, if in the case of heap-overflow, if any buffer tries to write more than size sz, that is, 0x8 then it corrupts the SOME_JUNK value and that detects as heap-overflow attack and abort(). Rest about canary validation we will see on further series of malloc friends

Then, finally it returns the page + offset address filled with canary

Now, the malloc_bytes function is completed and as we can see further in omalloc() ,. After the malloc_bytes call, it checks for zero_filling and also checks if p != NULL and sz > 0. If all conditions meet then it copies zero to entire address p till size sz and returns p

So, now as from the, After omalloc() function, it calls for.Following the code explains about the EPILOGUE() macro

1269 #define EPILOGUE()                              \
1270         d->active--;                            \
1271         _MALLOC_UNLOCK(d->mutex);               \
1272         if (r == NULL && mopts.malloc_xmalloc)  \
1273                 wrterror(d, "out of memory");   \
1274         if (r != NULL)                          \
1275                 errno = saved_errno;            \

EPILOGUE() macro - code snippet #16

So, as from the above code, we can see that EPILOGUE() does the following things

  • It decrements the active count

  • Unlock the locked stuff

  • Then, it checks if r is NULL and also checks for mopts.malloc_xmalloc then it "prints out of memory" as error and abort(). But if r is not NULL then it saves saved_errno to errno variable and completes the epilogue() process

Now, we have reached the end of the malloc(3) internals. So, if we recall the, After EPILOGUE(), it returns the r , which has the address value, 0xcde2d918d30 , (char *)bp->page + k as mentioned in theline no. 1024

Now coming back to the[sample code for malloc(3) internals], as we can see that after buff1 = (char *)malloc(8); , it copies the user controlled input to buff1 and then calls free(buff1)

Following is the gdb window to show canary and canary overwrite, from the same sample code

openbsd# LD_PRELOAD=/usr/src/lib/libc/obj/libc.so.95.1 gdb -q sample
(gdb) br main
Breakpoint 1 at 0x1363: file sample.c, line 7.
(gdb) r AAAABBBBCCCCDDDD
Starting program: /root/test/sample AAAABBBBCCCCDDDD
Breakpoint 1 at 0x4517aff7363: file sample.c, line 7.
Error while reading shared library symbols:
Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module /usr/libexec/ld.so]

Breakpoint 1, main (argc=2, argv=0x7f7ffffee678) at sample.c:7
7		char *buff1 = NULL;
Current language:  auto; currently minimal
(gdb) n
8		buff1 = (char *)malloc(8);
(gdb) n
9		strcpy(buff1, argv[1]);
(gdb) x/16x buff1
0x453ec9b6fe0:	0x00000000	0x00000000	0xdbdbdbdb	0xdbdbdbdb
0x453ec9b6ff0:	0x00000000	0x00000000	0x00000000	0x00000000
0x453ec9b7000:	Cannot access memory at address 0x453ec9b7000
(gdb) n
10		free(buff1);
(gdb) x/16x buff1
0x453ec9b6fe0:	0x41414141	0x42424242	0x43434343	0x44444444
0x453ec9b6ff0:	0x00000000	0x00000000	0x00000000	0x00000000
0x453ec9b7000:	Cannot access memory at address 0x453ec9b7000
(gdb) c
Continuing.
sample(3165) in free(): chunk canary corrupted 0x453ec9b6fe0 0x8@0x8

Program received signal SIGABRT, Aborted.
thrkill () at -:3
3	-: No such file or directory.
	in -
Current language:  auto; currently asm
(gdb)

gdb showing the canary corruption and malloc aborting the process

In the above debugging window, one can see that just after the malloc(3) allocation, there are canaries with 0xdb bytes, which is filled by the fill_canary() as explained in the code snippet -fill_canary - canary filling code

Then, after the memory allocation that is explained above, it copies the user controlled input to that memory area. And, after giving string AAAABBBBCCCCDDDD , one can see that the canaries are corrupted as it has overwritten the 0xdb bytes with the 0x43 and 0x44 which is C and D , then later it calls free(buff1) , which is responsible to validate canary corruption and calls abort() if corrupted with the information like chunk address and size

So, validation of canaries in user-related chunk is validated in free() . But due to the lengthy content, I would like to cover the malloc(3) friends library calls in the upcoming blogs

References:

I have covered all stuff that I have learned while reading the code of malloc(3). In case, if I have forgotten or missed something, please feel free to update me.

If I have explained something wrongly or incorrectly then please feel free to update or correct me, I am also a learner and beginner so that may happen. :)

Again, a huge thanks to all OpenBSD community and especially Otto@ for helping to understand the malloc internals and patiently listening my all queries :). I am still learning and having dicussions on u_short bit[1] variable for bitmap understanding.

Next, I will share about free(3) internals, its ability to detect the memory corruption bugs and also a bit more about bitmap understanding.

Any doubts/questions, feel free to drop mail @bsdb0y or tweet @twitter

Happy Kernel Hacking


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK