Sunday, June 16, 2013

snprintf() is not Foolproof

Something should have been done about the symbol-to-letter ratio of this post.

Presenting snprintf():

int snprintf(char * restrict str, size_t size, const char * restrict format, ...);

For those who are not familiar with snprintf(), it is used to create formatted strings in a buffer, just like the similarly named sprintf(). However, the man page for snprintf() and sprintf() indicates some significant differences:
  • snprintf() “will write at most [size]-1 of the characters printed into the output string (the [size]th character then gets the terminating '\0'); if the return value is greater than or equal to the [size] argument, the string was too short and some of the printed characters were discarded. The output is always null-terminated.”
  • snprintf() returns “the number of characters that would have been printed if the [size] were unlimited... not including the final '\0'”
What all this means is that snprintf() acts like a somewhat safer version of sprintf() from the point of view of preventing buffer overflows. I’ve used it a lot in sections of code where performance is not a major concern but buffer overflows are (in fact, buffer overflows are always a major concern).

But “safer” is one thing, and “foolproof” is another.

I was reminded of the difference the other day when creating some unit tests for code that uses snprintf(). For no particular reason other than to get the test to compile, I just picked some random values and ran it, and was surprised to see a section of code something like this...

    char someBuf[1];
    int nBufUsed = 2;
    int nBufAvailable = 1;
    if(snprintf(someBuf + nBufUsed, nBufAvailable - nBufUsed, "Hello") !=
       strlen(someBuf + nBufUsed))
    {
        return -1;
    }
    return 0;

...happily run without complaining at all, when I was (incorrectly) expecting it to catch the fact that "Hello" was being written to a 1-byte buffer. (That’s obviously not the exact code, and for other reasons those catastrophically buggy values would not have cropped up in the real program, but it illustrates the point.)

So what happened? The size parameter to snprintf() is a size_t, which is unsigned, so if you’ve got a bug that causes a negative value to be passed in (in this case nBufUsed was greater than nBufAvailable, so nBufAvailable - nBufUsed came out to -1), it gets interpreted as a very large positive buffer size. This can result in the very buffer overflow you were trying to avoid by using snprintf() in the first place.
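
If I were going to guard against this, it might look something like the sketch below. This isn’t from any real program (the names are made up); the idea is just to refuse to compute a size from values that would go negative, and to check the return value for truncation while we’re at it.

    #include <stdio.h>

    /* Append text to buf, which has 'available' bytes total and 'used' bytes
       already occupied. Returns the new 'used' count, or -1 on any problem. */
    static int safe_append(char *buf, int used, int available, const char *text)
    {
        if(used < 0 || available <= used)   /* the size would come out <= 0... */
            return -1;                      /* ...so refuse instead of overflowing */

        size_t remaining = (size_t)(available - used);
        int written = snprintf(buf + used, remaining, "%s", text);
        if(written < 0 || (size_t)written >= remaining)
            return -1;                      /* encoding error, or output was truncated */
        return used + written;
    }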

You can never be too careful.

PS: I'm not in the habit of passing negative size values around except in testing.

Summary: snprintf() is helpful, but not foolproof.

Score: Either -1 or 18446744073709551615.

Moral: Mind that size_t size parameter to snprintf(), for it is not immune to “garbage in, garbage out.”


Sunday, June 9, 2013

The Case of the Noisy Codec

Load this up with jokes on the re-write!

While I was working on a song-crafting application called SongDasher_Light (https://github.com/wilsonmichaelpatrick/SongDasher_Light) for the iPhone and iPad, I ran across a very intermittent, but absolutely horrible, problem where bursts of noise were being introduced into the AAC encoding of the audio. I really hated it for being intermittent, and I really hated it for being horrible. You do not want to hear this:



[Waveform screenshot: the tall part toward the right is 2048 bytes of very loud noise.]

After a great deal of wrangling, I narrowed things down to a difference between the hardware codec and the software codec. Given certain sets of legitimate (i.e. not contrived) input data, converting audio from PCM to AAC seems to result in junk data in the output. In this case the input was an actual drum beat, not just arbitrary 1’s and 0’s. The problem also occurred with other kinds of legitimate input.

In order to rule out any weird memory corruption or other defect in my project as the cause, I created a separate demonstration program, whose only purpose is to convert the bug-demonstration data to AAC; it is available at https://github.com/wilsonmichaelpatrick/HWCodecGlitch. (Note: This program only demonstrates the problem on a real device, because the hardware codec is not available in the simulator.) When the software codec (kAppleSoftwareAudioCodecManufacturer) is used in the conversion, the AAC output sounds nice. When the hardware codec is used, as it is by default on a real device, the burst of noise is present.

That’s about as much as I was able to find out about the issue, which isn’t perfectly convincing, but convincing enough: using the hardware codec with certain inputs results in noise, even in a very simple, stand-alone program. To fix the problem, I just use the software codec even when the hardware codec is available, and I have not seen the noise since.
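
For what it’s worth, requesting the software codec explicitly looks roughly like the sketch below. This isn’t the actual code from either repository; it assumes the PCM-to-AAC conversion goes through an AudioConverter, and the source and destination formats are placeholders supplied by the caller.

    #include <AudioToolbox/AudioToolbox.h>

    /* Create an AAC encoder, restricting the search to Apple's *software*
       codec instead of letting the system pick the hardware one. */
    static OSStatus MakeSoftwareAACConverter(const AudioStreamBasicDescription *pcmFormat,
                                             const AudioStreamBasicDescription *aacFormat,
                                             AudioConverterRef *outConverter)
    {
        AudioClassDescription softwareAAC;
        softwareAAC.mType         = kAudioEncoderComponentType;
        softwareAAC.mSubType      = kAudioFormatMPEG4AAC;
        softwareAAC.mManufacturer = kAppleSoftwareAudioCodecManufacturer;

        /* AudioConverterNewSpecific only considers the codecs described in the
           list passed in, so the hardware codec is never chosen. */
        return AudioConverterNewSpecific(pcmFormat, aacFormat, 1, &softwareAAC, outConverter);
    }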

I was glad that my own code didn’t seem to be the cause, but it was a lot of work to identify and find a way around this problem.

PS: Turn your volume way down if you plan on listening to the burst of noise; it is drastic.

Summary: I’m convinced enough that it was the hardware codec making noise, not my code.

Score: Software codec 1, hardware codec 0.

Moral: Ruling out your own code as the cause of a problem can be somewhat, but not adequately, satisfying.


Sunday, June 2, 2013

Eventual Optimization

You don’t know until you measure!

Let me just throw this out there:

It’s not enough to optimize the code in the areas that obviously require optimization if the program is still hobbled by things that weren’t expected to be consuming resources at all.

As such, I think it’s important to profile running code; it’s very difficult to know what’s using up your resources until you measure. Without actual measurements, there’s more risk of missing significant performance issues by honing all of the smart stuff, and ignoring all of the silly stuff.

Silly stuff like the methods highlighted below:


[Xcode profiler screenshot]

This is a portion of the results from Xcode profiling SongDasher_Light (https://github.com/wilsonmichaelpatrick/SongDasher_Light) recording a two-minute audio track on an iPhone. Drilling down from the highest-CPU subroutine to some code that I’ve written (i.e. code with high CPU consumption that I can actually do something about), it turns out GetDBLevel_16Bit is the “winner.” What does GetDBLevel_16Bit do? It makes the little level meter light up when someone makes a noise into the microphone. That’s it. (As far as I know CoreAudio only calculates decibels for output, not input, which is why I’m doing this the hard way here.)

It was very surprising to see something so non-essential taking up 25.2% of the total CPU for the whole program. It’s somewhat less surprising when you look at the code in this method, though:

    for(int i = 0; i < nNumFrames; i++)
    {
        // Low-pass filter the absolute value of each sample
        (*pflInputDBCur) = flLowpassAlpha * (abs(arrSamples[i])) +
                           (1.0 - flLowpassAlpha) * (*pflInputDBPrev);
        (*pflInputDBPrev) = (*pflInputDBCur);
        
        // DB is 20 log10
        Float32 flCurDB;
        flCurDB = 20.0 * log10(*pflInputDBCur);
        
        // Skip NaN (flCurDB != flCurDB) and degenerate values, and keep
        // the loudest level seen so far in this buffer
        if((flCurDB == flCurDB) && (flCurDB != -DBL_MAX))
        {
            if(flCurDB > (*pflInputDB))
                (*pflInputDB) = flCurDB;
        }
    }

Basically, this code is correct. (Please let me know if it’s not!) But it’s a pretty expensive computation to be doing on every single sample of input while recording. (I must have really been avoiding premature optimization on the day I wrote that!) All I’m trying to do here is get a level meter to pass an “eyeball” test by jumping to an appropriate height when it hears a noise. 25.2% of total CPU! What if it just looked at, say, every 8th sample by setting the loop increment to 8, like this:

    for(int i = 0; i < nNumFrames; (i += nIncrement))
    {
        // etc...
    }

Would that help? In fact it helps a lot, which I’ll get to shortly. The point here is that I would never have known to optimize this if the profiler had not put it toward the top of the list. After taking great pains to pre-calculate all the little x-axis crosses where the waveforms can be joined without “popping,” and pre-calculate the precise drum-machine trigger times, and all the other corner-cutting I did to keep things quick, I almost gave it all back (and more) with the input level meter.

After GetDBLevel_16Bit, there’s WriteBufferToCaptureFile, which is used during recording to save the sound to a file. Why is that so high in the list? As it turns out, it was calling this method:

extern OSStatus
ExtAudioFileWrite(ExtAudioFileRef inExtAudioFile,
                  UInt32 inNumberFrames,
                  const AudioBufferList *ioData);

Instead of this method:

extern OSStatus
ExtAudioFileWriteAsync(ExtAudioFileRef inExtAudioFile,
                       UInt32 inNumberFrames,
                       const AudioBufferList *ioData);

ExtAudioFileWrite writes to the file right away, and is not meant for time-critical operations like capturing audio. ExtAudioFileWriteAsync batches writes to avoid interrupting the thread it’s called on. The insidious thing is that both of them produce the same result (the audio correctly saved to a file) but have a significant difference in performance. I might never have known I’d accidentally put the wrong one in there if I hadn’t measured the performance.
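
In the capture path the fix amounts to something like the sketch below (a made-up function shape, not the actual SongDasher_Light code): the audio thread hands the buffers to ExtAudioFileWriteAsync and returns immediately, and the file I/O happens somewhere less time-critical. If I remember the header comments correctly, the async call should also be primed once, from a non-real-time thread, with zero frames and a NULL buffer before the first real write.

    #include <AudioToolbox/AudioToolbox.h>

    /* Called from the time-critical capture/render thread. ExtAudioFileWriteAsync
       queues the data and performs the actual writing on its own, so this call
       returns without blocking on file I/O. */
    static OSStatus WriteCapturedAudio(ExtAudioFileRef captureFile,
                                       UInt32 numFrames,
                                       const AudioBufferList *bufferList)
    {
        return ExtAudioFileWriteAsync(captureFile, numFrames, bufferList);
    }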

Having addressed these issues with GetDBLevel_16Bit and WriteBufferToCaptureFile, I ran the profiler again, and got the following results:



[Xcode profiler screenshot after the fixes]

Here are some ballpark figures for the improvements, keeping in mind that the two test runs are not exactly identical, and that there is whatever other noise comes from running on an actual device:
  • GetDBLevel_16Bit has gone from 7891.0ms to 961.0ms, making it about 8.2 times faster,
  • WriteBufferToCaptureFile has gone from 2676.0ms to 233.0ms, making it about 11.5 times faster.
Put together, the CPU used by the audio thread as a whole went from 20995.0ms to 10109.0ms, making it about 2.1 times faster. That’s a pretty huge gain for two small fixes; in fact it helped wring out the last bits of “glitchiness” I was seeing running on older devices. And to come back to my original point, the fixes were in code I wouldn’t have thought needed optimizing until I measured.

Summary: After making my best effort to keep the audio processing fast, I found two things I would never have guessed were taking up a large percentage of total processing time, and would not have found them without profiling the running code.

Score: About 2.1 times faster.

Moral: Profiling can help identify performance issues in unexpected places.


Sunday, May 26, 2013

The Reentrant Observer Pitfall

This may be your only warning!

The Observer pattern “is a software design pattern in which an object, called the subject, maintains a list of its dependents, called Observers, and notifies them automatically of any state changes, usually by calling one of their methods” (http://en.wikipedia.org/wiki/Observer_pattern). It’s a pattern I use in SongDasher_Light (https://github.com/wilsonmichaelpatrick/SongDasher_Light) to update the various user-interface views.

Here’s the snag: what happens when one of the Observers, in the course of responding to a state change, does something that results in another state change? Consider this case where two Observers, Observer_1 and Observer_2, respond to two different state changes, State_Change_A and State_Change_B:
  • For whatever reason, the Subject has to tell its Observers about State_Change_A;
  • Observer_1 is told about State_Change_A, and does something that results in some other state change;
    • Now the Subject has to tell its Observers about State_Change_B;
    • Observer_1 is told about State_Change_B, and does whatever;
    • Observer_2 is told about State_Change_B, and does whatever;
  • Observer_2 is told about State_Change_A.
See what happened to Observer_2 there? Observer_2 is told about State_Change_B before it’s told about State_Change_A. This is because the Subject has experienced “reentrancy” - it has to do the State_Change_B notifications in the middle of doing the State_Change_A notifications. As a result, some of the Observers (i.e. Observer_1 and anyone in the notification list prior to it) see State_Change_A, followed by State_Change_B.  The rest of the Observers see the reverse.

This gives the distinct impression to Observer_2 that State_Change_B happened before State_Change_A. That’s not actually true. So it’s probably something to try to avoid.

Now, the astute reader can probably see that I’m assuming everything here is “synchronous.” That is, when a state change occurs, all Observers are informed of it right away. This is the reason that the notifications for State_Change_B are “nested” within the notifications for State_Change_A. I’m assuming “synchronous” here because it’s conceptually simple, and probably the approach that occurs to people first when thinking about the Observer pattern. Also, it’s the way I implemented the Observer pattern in SongDasher_Light. Why not? Keep it simple.

Up next: the pitfall is realized, attempts at mitigation, and dealing with bad consequences.


Mitigating Against the Reentrant Observer Pitfall

An asynchronous approach to state change notifications is one way to mitigate this reentrancy problem: put the notifications for State_Change_B into a queue, after all of the State_Change_A notifications, instead of sending them immediately. Then the notifications arrive at the Observers after the state change has happened, perhaps much later, but at least they’re in the right order.
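
Here’s a rough sketch of that queued approach in plain C (made-up names, and not the Objective-C code from SongDasher_Light): a state change gets appended to a pending list, and the list is drained only when no notification pass is already in progress, so a reentrant change is delivered to every Observer after the current change, in the same order.

    #include <stdbool.h>
    #include <stddef.h>

    typedef void (*ObserverFn)(void *context, int whatChanged);

    typedef struct {
        ObserverFn observers[8];
        void      *contexts[8];
        size_t     observerCount;
        int        pendingChanges[16];
        size_t     pendingCount;
        bool       draining;          /* true while a notification pass is running */
    } Subject;

    static void Subject_NotifyQueued(Subject *s, int whatChanged)
    {
        /* Queue the change rather than delivering it immediately. */
        if(s->pendingCount < 16)
            s->pendingChanges[s->pendingCount++] = whatChanged;

        if(s->draining)
            return;  /* the pass further up the stack will deliver it, in order */

        s->draining = true;
        /* pendingCount is re-checked each iteration, so changes queued by the
           Observers themselves are delivered too, after the current one. */
        for(size_t n = 0; n < s->pendingCount; n++)
            for(size_t i = 0; i < s->observerCount; i++)
                s->observers[i](s->contexts[i], s->pendingChanges[n]);
        s->pendingCount = 0;
        s->draining = false;
    }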

Another way is to simply not allow the reentrancy. Specifically, in every method that might result in state change notifications, have a little check to see if we’re already in a state-changing method further up the stack. It sure seems simpler than putting in a more complicated notification mechanism. So in SongDasher_Light, I took the easy way out and decided to just disallow reentrancy. Now, here’s why the word “pitfall” (which someone, someplace once defined as “a trap you walk into even though you know it’s there”) is in the title:

It’s pretty difficult to know all the paths through your code that might become reentrant in event-driven user interface programming.

SongDasher_Light fell into this pit because of all of the rigamarole involved in editing text fields. After I put in some reentrancy blocks to catch any inadvertent reentrancy, I found it was happening all the time when, for instance, a user starts editing the name of a region in the song, then taps a little square in the drum pattern view. The drum pattern tap results in a call into the Subject, which then informs all the Observers that the drum pattern has changed, which causes the text field being edited to lose focus, which causes a reentrant call into the Subject to complete the name change. So the name change notifications are nested into the drum change notifications, and my easy-way-out blocking of all reentrancy didn’t work in this case.

So I took the easy way out again and just allowed reentrancy in the specific case of editing the name of a section, having established that getting an out-of-order name update is not a big deal.
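
For reference, here’s roughly what that “little check” amounts to, again as a plain C sketch with made-up names rather than the app’s actual Objective-C code:

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    typedef void (*ObserverFn)(void *context, int whatChanged);

    typedef struct {
        ObserverFn observers[8];
        void      *contexts[8];
        size_t     observerCount;
        bool       isNotifying;   /* set while a notification pass is in progress */
    } Subject;

    static void Subject_Notify(Subject *s, int whatChanged)
    {
        /* If an Observer's callback winds up back in here, the notifications
           are reentrant and some Observers will see changes out of order. */
        assert(!s->isNotifying && "reentrant state-change notification");

        s->isNotifying = true;
        for(size_t i = 0; i < s->observerCount; i++)
            s->observers[i](s->contexts[i], whatChanged);
        s->isNotifying = false;
    }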

Summary: Even knowing about the reentrancy pitfall, and that its consequences are bad, and taking steps to avoid them, I ran directly into some bad consequences anyway, and swept them under the rug by deciding it’s not so bad, this time.

Score: Pragmatism 1, Everything Else 0.

Moral: The highly popular Observer pattern is trickier than it looks.


© 2013 wilsonmichaelpatrick





