Archive for the ‘Debugging’ Category

Move along, nothing to see here.

As some of you may know, Thomas and I have been working on a port of dtrace, mostly in our spare time. I was intrigued and enthused to see that it had also been ported to FreeBSD and Mac OS X Leopard.

I was dismayed, however, to read this blog entry by one of dtrace’s inventors, Adam Leventhal, on a discovery he had made about the Leopard port.

I won’t repeat the details of it here, but the gist of it is that Apple allows for certain processes to be hidden from any debug/tracing systems. While you may or may not agree on whether that is Truly Evil Behaviour or not, it doesn’t really matter – what it means for sure is that the Leopard dtrace port is severely broken.

When a system wide tracer cannot trace the whole system, it pretty much ceases to be of any use!

You might as well pack up, and go home, folks!

The really sad thing is the way they broke it! They basically disable any probes from running at the same time as a protected process, the prime example being iTunes.

This means, for example, that anything based on the timer tick is useless, since you lose the regular heartbeat, and that any kernel activity that may have been deferred (say disk io) is lost, unnoticed by your dtrace probes!

It really means that you cannot trust any of the data that dtrace is supplying, as it may be at best incomplete, at worst incorrect.

I shake my head, I really do.

Colin

Can you wait a little bit faster?

I was having a conversation with a customer the other day who was very concerned about a pidin listing where he was seeing something like the following:

# pidin 
     pid tid name               prio STATE       Blocked 
... 
       20   1 server_app             255o RECEIVE    1 
       20   1 server_app             10o  RUNNING 
       20   1 server_app             255o RECEIVE    1 
... 
       32   1 client_app              10o REPLY         20 
       32   2 client_app              10f NANOSLEEP 
...

He was concerned because most of his system generally operated at priority 10 and if there were tasks that were at priority 255, then wouldn’t those other tasks be starved out and never get a chance to run!?

Of course if you take a few minutes to reflect on what is really being displayed then peace and calm return quite quickly to the world. 

The threads that are all marked as being high priority are all in blocking states.  That is to say that they are not RUNNING, nor are they wanting to run and in the READY state.  In fact they are all blocked waiting for some other sort of event to occur or non-CPU resource to become available.  While their reported priorities are high they are not going to interfere with the operation of the system … at least not until they transition to a RUNNING or READY state.   In the case of threads which are RECEIVE blocked such as is the case here, unless the communication channels were created specifically to avoid priority inheritance, the receive thread will end up READY/RUNNING at the priority of the sending client.

If that is the case you may wonder, why have pidin report the priority information on blocking states at all if it is just going to cause people this sense of panic and trigger these types of false alarms?

Well the information can be insightfull since in most blocking scenarios, the thread was READY or RUNNING prior to being blocked and as such the priority gives a hint to what they might have been doing.  Also in the case of priority inheritance states (MUTEX, SEND) then you can better see who is causing priorities to be automatically shifted in the system.

In this case, 255 was a curious number to be seeing and we eventually traced it back to their code where they were generating asynchronous pulse events to a server, but the pulses were not being properly initialized and were picking up junk off the stack and setting bad priorities … hence the 255 value. 

QNX Coding Tip of the Day:  Don’t initialize sigevent‘s using the member values, use the SIGEV_* macros!

 Thomas

Grabbing cores

I love debuggers. I don’t know what I would do without having a good capable debugger as part of my development toolbox. The ability to see right into the core operation of a piece of software and quickly get and overview of the whole state of the program beats random printf markers anyday.

Oddly enough, my first memory of using a debugger had nothing to do with traditional debugging. I was a co-op student working on some speech processing software using the now defunct Mac CodeWarrior development tools.

I was trying to figure out how to insert timing measurements into a large piece of algorithmic software to determine how to best optimize it.

“Simple” said one of my colleagues at the time. He fired up the debugger and started stepping through the project:

step step, step ………… step

“That’s the function you should look at” indicating the one with the long step time delay.

A true KISS principle that I’ve used over and over to do basic analysis of where a program is spending its time. Simple and effective when you are considering software where delays and latency is measured in seconds.

The cool thing about working with QNX systems is that you can use the same debugger and techniques on just about any piece of system software. Services, drivers and applications are all outside the kernel so there are very few occasions where you can’t attack a problem using a process level debugger.

Neutrino uses gdb as its main debugger (the IDE drives gdb under the covers) and one of the great facilities that gdb enables is the ability to do post-mortem analysis on crashed binaries using core files.

 % ntox86-gdb my-x86-binary my-x86-binary-corefile

or if you are in the IDE then create a post-mortem debug session and select the binary and the core file as part of the launch configuration.

Capturing core files on a target Neutrino system requires that you run the dumper program. If (when?) your program abnormally terminates, assuming dumper is running, a core file will be generated and a diagnostic message logged. You can then load up the debugger with the core file and get right to the root of the problem.

The dumper program has a bunch of usefull options, but one of the most usefull is the -p option that allows you to generate a core file for any running process without terminating that process.

 % dumper -p process_id -d dump_directory

This is a great way to capture program snapshots for those situations where you can’t attach to the running program due to potential latency
or you just want to get an idea of the program state (ie the program seems to be hung up or burning CPU) and don’t have a local debugger connection.

Once you have the core file, you load up into the debugger the same way as if you had received a core file from a crash.

Simple, easy and totally pain free debugging!

Browsing trumps searching

I was doing some maintenance work (bug fixing) on some code that was using the strncopy(src,dst,n)function to limit the number of bytes that were being copied.  Even though I’ve used this function hundreds of times, I never remember the exact details of what happens  with respect to the length and the terminating null character is it included as part of the length or isn’t it (it isn’t).   As a result I end up taking a trip into our QNX documentation and the C library reference to double check the behaviour.   Here is how the reading went this time for me:

 copy no more than n bytes over from dst to src. … yeah yeah …

then whammo the next line from the docs jumped right out at me:

If the string pointed to by src is shorter than n characters, null characters are appended to the copy in the array pointed to by dst, until n characters in all have been written.

Wow! So strncpy() is always padding out the full size of the buffer with null characters. I don’t know why this hadn’t struck me before, but this time I was looking at some code that was working with a large maximum buffer (4K) but was most often dealing with strings that were only 10 or 20 characters long.  Since this was being done repeatedly, it seemed like a code change was going to be in order, something along the lines of (using the same src,dst and n):

int len = strlen(src) + 1;  

if(len > n) { 
   len = n - 1; 
   dst[len] = ''; 
}   

memcpy(dst, src, len);

This was kind of a drag since it was more code to write, but it was going to be a bit more efficient since we weren’t going to iterate over all 4K of the buffer padding it in with bytes we didn’t need … although we were passing twice through the string. Wanting to double check the order of the src and dst arguments for memcpy I headed back to the docs. I use the HTML index instead of the search so I jumped to the ‘M’ functions. Looking through the list I came across a library function I had never heard of before … memccpy so I thought I’d take a quick read of it:

Copy bytes between buffers until a given byte is found

Cool! That is exactly what I was looking for. The call does an efficient one pass iteration copying only the data bytes until it hit the matching character or it runs out of space to copy to.  I still needed to deal with ensuring the destination buffer was going to be null terminated (as with strncpy() ) but at least now I wasn’t spending any time filling in extra bytes as padding:

if(memcchr(dst, src, '', n) == NULL) { 
dst[n-1] = ''; 
}

So, now I get the security of a bounded string buffer copy and efficiency to boot and I picked up a new standard C API for my toolbox … all because I like to browse the API instead of using the indexed search!  Of course, perhaps at some point this will get even more straightforward if the strlcpy() function becomes part of POSIX/ANSI C standard.

Follow

Get every new post delivered to your Inbox.