Fragmentation; meeting rooms and memory allocators
At QNX we have a problem. We have many meeting rooms, but we have many more developers who want to have meetings and discussions in those meeting rooms. While a certain amount of hallway development is always going to happen, the really big whiteboards are all in the meeting rooms … so that is where we have to gather. Our admin folks use Microsoft Exchange, so if you want to book a room then you need to play along (or call in a lot of favours!) and book through the Exchange server. An interesting trend I’ve noticed is that every few months we tend to end up with a mysterious meeting room stall: a time when it is not possible to use the system to book any meeting room in the building, but if you wander around you find that most of the rooms are not being used.
Mysterious … and incredibly annoying!
I’ve come to the conclusion that the Exchange meeting rooms are suffering from fragmentation. When the system is cleared, large groups book recurring meetings. As time goes on, those recurring meetings get shifted and bumped, rooms get borrowed (or stolen), and when a meeting organizer changes the room, the old room doesn’t officially get freed up until all participants clear their bookings of the room. Over time things get so fragmented that you can’t book anything until everyone clears their calendars.
Fragmentation is a pain. There are lots of schemes to avoid it in filesystems and in memory allocators, but eventually it comes down to the fact that you need to make arbitrary decisions about how to carve up a resource into pieces to satisfy a particular usage pattern. In the case of operating system memory, that granularity is the page, which is generally* a 4K chunk.
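To make the meeting room stall concrete, here is a minimal sketch in C of the same effect on a generic resource. The slot map and the function names (total_free, largest_free_run, can_allocate) are made up for illustration and have nothing to do with Exchange or the QNX allocator; the point is only that plenty of free slots in total does not guarantee one contiguous run big enough for a request.

```c
#include <assert.h>
#include <stddef.h>

/* Count every free slot in the map (1 = in use, 0 = free). */
size_t total_free(const unsigned char *used, size_t n)
{
    assert(used != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (!used[i])
            count++;
    }
    return count;
}

/* Length of the longest run of consecutive free slots. */
size_t largest_free_run(const unsigned char *used, size_t n)
{
    assert(used != NULL || n == 0);
    size_t best = 0, run = 0;
    for (size_t i = 0; i < n; i++) {
        if (used[i]) {
            run = 0;            /* a used slot ends the current run */
        } else if (++run > best) {
            best = run;
        }
    }
    return best;
}

/* A contiguous request fits only if a single free run is long enough. */
int can_allocate(const unsigned char *used, size_t n, size_t want)
{
    return largest_free_run(used, n) >= want;
}
```

With a map like {1,0,1,0,1,0,1,0}, half of the slots are free, yet a request for two adjacent slots fails. Pack the same four bookings together as {1,1,1,1,0,0,0,0} and a request for four adjacent slots succeeds.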
For most embedded applications 4K of memory is an awful lot, so there is a secondary, application-level allocator that sits behind the malloc/free/realloc calls that applications use to get at the memory. This allocator takes the pages provided by the operating system and carves them further into bands of smaller, fixed-size blocks, plus a list of “bigger blocks” of non-uniform size to satisfy the abnormal requests. A great, more detailed description of what the QNX memory allocator is doing can be found in the Memory Analysis section of the QNX IDE User’s Guide.
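As a rough illustration of the band idea, here is a small sketch in C of how a request size might be routed to a fixed-size band or to the big block list. The band sizes (16, 32, 64 and 128 bytes) and the names pick_band and band_waste are assumptions chosen for the example; they are not the QNX allocator's actual configuration or code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical band block sizes in bytes, smallest first.
 * These are illustrative values, not QNX defaults. */
static const size_t band_sizes[] = { 16, 32, 64, 128 };
#define NUM_BANDS (sizeof band_sizes / sizeof band_sizes[0])

/* Return the index of the smallest band whose block size fits the
 * request, or -1 if the request must go to the big block list. */
int pick_band(size_t request)
{
    for (size_t i = 0; i < NUM_BANDS; i++) {
        if (request <= band_sizes[i])
            return (int)i;
    }
    return -1;
}

/* Bytes left unused inside the chosen band block. Big block requests
 * are treated as exact fits in this sketch, so they report 0. */
size_t band_waste(size_t request)
{
    int band = pick_band(request);
    if (band < 0)
        return 0;
    assert(band_sizes[band] >= request);
    return band_sizes[band] - request;
}
```

In this sketch a 20 byte request lands in the 32 byte band and leaves 12 bytes unused, which is the price paid for fixed block sizes. A 4096 byte request falls through every band and would be served from the big block list instead.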
If you snoop the header file then you can see the struct malloc_stats structure, which can provide you with information about how your system is using application-allocated memory. You will need to reference the _malloc_stats external global variable to get at the numbers.
While not a direct measure of fragmentation, these statistics give you some insight into the overhead that the memory allocator is taking, and into how much memory has been handed out to your application (and thus shows up in a pidin mem listing as application heap) but may not be used directly. Note, however, that these statistics don’t tell the whole story: in the allocator it is possible for “big blocks” to be borrowed for “small block” use. For now we’ll pretend that those values tell you most of what you need to know to get started.
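Here is a hedged sketch in C of the kind of arithmetic you might do once you have such numbers in hand. The struct heap_snapshot and its fields are hypothetical stand-ins invented for this example; they are not the fields of struct malloc_stats, so you would have to copy the real counters into something like it yourself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of allocator counters (NOT struct malloc_stats). */
struct heap_snapshot {
    size_t heap_bytes;      /* total memory obtained from the OS */
    size_t in_use_bytes;    /* bytes currently handed to the application */
    size_t overhead_bytes;  /* bytes spent on allocator bookkeeping */
};

/* Heap the process holds that is neither in use nor bookkeeping. */
size_t idle_bytes(const struct heap_snapshot *s)
{
    assert(s != NULL);
    size_t accounted = s->in_use_bytes + s->overhead_bytes;
    return accounted >= s->heap_bytes ? 0 : s->heap_bytes - accounted;
}

/* Allocator overhead as a whole-number percentage of the heap,
 * rounded down; an empty heap reports 0. */
unsigned overhead_percent(const struct heap_snapshot *s)
{
    assert(s != NULL);
    if (s->heap_bytes == 0)
        return 0;
    return (unsigned)((s->overhead_bytes * 100) / s->heap_bytes);
}
```

For a 65536 byte heap with 40000 bytes in use and 4096 bytes of overhead, this reports 21440 idle bytes and 6 percent overhead. A large idle figure is only a hint that fragmentation might be at work, not proof of it.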
So now that you can see a bit of detail of how the application allocator is carving up and managing memory … what can you do when you find out that the allocator’s decision about small blocks or big blocks isn’t the right one for you but you don’t want to write your own memory allocator? We’ll talk about that kind of customization (and how you know you need to do it) in my next post!
In the meantime, I need to go and find out which meeting room is suffering from fragmentation and grab it for a meeting!
* I say generally here because with the introduction of large page support in new versions of Neutrino, a 4K page may be the general page size, but it isn’t the only page size =;-)