The Story of O

In addition to my day job, I run a small business writing image processing software. Sales have been pretty shitty for the past few months, so I've kind of lost interest in writing new stuff for it. It's no fun writing new features when you don't expect anyone to care - I work much better when people ask for things. But over the Christmas break, I found a research paper that got me interested in programming again; it describes a magic beast: an O(1) Median Filter.

What?

Come with me. I will tell you.

The median filter. Maybe you've used one in PhotoShop. It's fun.

This is how it works: For every pixel in your input image, look at the values of the 3x3 square of pixels that surround it:

... ... ... ... ...
... 14 3 7 ...
... 3 7 17 ...
... 14 13 77 ...
... ... ... ... ...

(Our target pixel is the 7 in the middle)

Find the median value: sorted, those nine pixels are 3, 3, 7, 7, 13, 14, 14, 17, 77. The median (the middle value after you've sorted the list of numbers) is 13. So, set the corresponding pixel in your output image to 13, and repeat the process for all the other pixels in the image. Very simple. And the result is a blurred version of the original.
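That one-pixel step can be sketched in a couple of lines of Python (an illustrative sketch, not anything from my actual product):

```python
# The 3x3 window from the example above, flattened into one list.
window = [14, 3, 7,
          3, 7, 17,
          14, 13, 77]

values = sorted(window)            # [3, 3, 7, 7, 13, 14, 14, 17, 77]
median = values[len(values) // 2]  # middle element of the sorted list
print(median)                      # 13
```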

Instead of 3x3, you can use 5x5, or 9x9 or 25x25, or any number, really - though odd numbers and square regions are the most common. The number of pixels you use is known as the "size" of the filter (also the "kernel size"). As you increase the size of the filter, the blur effect gets stronger and stronger because adjacent output pixels start sharing more and more input pixels: with a 3x3 filter, two adjacent output pixels share 6 of 9 input pixels:

If this was our first pixel:

... ... ... ... ...
... 14 3 7 ...
... 3 7 17 ...
... 14 13 77 ...
... ... ... ... ...

... and we moved horizontally to the next pixel...

... ... ... ... ...
... ... 3 7 12
... ... 7 17 4
... ... 13 77 2
... ... ... ... ...

... then our new center is the 17, and the 3, 7, 7, 13, 17 and 77 are used in both.

And if our filter was 5x5, we'd share 20 of 25; and 600 of 625 with a 25x25. With so much overlap in the data, the median tends to stay the same.

But the image is not just blurred, it's filtered. See that '77'? If the rest of the pixels in the image have low-ish values, that 77 will never show up in the output, no matter the filter size. The 77 will always end up on the high end of the sorted values and will never be the median itself. So, rather than just blurring the image, what the median filter is really good at is removing noise (either bright or dark) from photographic images. And, even better, you can tweak it subtly to get things like PhotoShop's "Dust And Scratches" filter, which only pays attention to the noise, and tries not to touch the image itself.

So that's how it works.

Now think about the calculations required. With a 3x3 filter, you have to fetch and sort 9 pixels. With a 5x5, you fetch and sort 25. With a 7x7: 49. With a 25x25: 625. With a 101x101: 10,201. Etc. The amount of work you have to do is directly tied to the size of the filter. More accurately, the work you have to do increases with the square of the size of the filter. So, processing a 7x7 filter requires more than 5 times the amount of work that a 3x3 filter does (49 pixels vs 9 pixels). And a 25x25 filter requires almost 70 times the amount of work that a 3x3 filter does. Assume your computer takes one second to process a 3x3 filter - you could probably sit through that without complaining - but a 25x25 filter will take 70 seconds. That gets tedious.
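Here's what the brute-force version looks like as a whole filter, in a minimal Python sketch (the function name is my own, and I skip the edge pixels rather than padding, purely for brevity):

```python
# Brute-force median filter over a grayscale image (a list of lists).
# For a (2r+1)x(2r+1) kernel, every output pixel fetches and sorts
# (2r+1)^2 values -- which is why the work grows with the square of
# the filter size. Edge pixels are left untouched here; a real filter
# would pad or clamp at the borders.
def median_filter_brute(image, radius):
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            window = [image[y + dy][x + dx]
                      for dy in range(-radius, radius + 1)
                      for dx in range(-radius, radius + 1)]
            window.sort()
            out[y][x] = window[len(window) // 2]
    return out
```

Running it on the 3x3 example neighborhood above, with radius 1, turns the center 7 into 13, just as in the worked example.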

Note that we're ignoring the fact that the size of the image itself is clearly a factor in how long the process takes. Larger images will take longer than smaller images - there are simply more pixels to process. But we ignore image size, because when writing a filter, the image size is out of our control; a filter is expected to be used on images of all shapes and sizes, not the other way around.

Now, let's get our O face on.

Because we are smart, we can look at processes and classify them according to how each process will behave as we change the size of its input. We just did that with the median filter: by looking at the number of calculations required for various filter sizes, we determined that its performance is based on the square of the size of the filter. Or rather, its complexity is based on the square of the size of the input. We just assume performance is related to the complexity. We could probably come up with a filter with performance tied to the cube of the size of the input, if we wanted, and it would have cubic complexity. And if we did something simple, like multiplying every pixel in the image by an arbitrary value, we'd say that process had "constant" complexity: it will always take the same amount of time, no matter what value we give it. Doing "X x 12" a thousand times is just as complex as doing "X x 4" a thousand times.

And because saying "performance is based on the square of the size of the input" takes too long, we just say the process has "n-squared" or "quadratic" complexity. And we write it like this: "O(n²)". Which is shorthand for "the complexity of the process is on the order of n squared". A constant complexity process is "O(1)"; a process with cubic complexity is "O(n³)", etc. You can have O(log n) processes where the time increases with the logarithm of the size of the input, or O(cⁿ) for exponential complexity, or O(n log n), or O(n!), etc. This is called "Big O Notation".
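A quick way to get a feel for these classes is to print raw operation counts as the input size grows (purely illustrative numbers, not timings):

```python
import math

# How the common complexity classes grow with input size n.
# The shapes are the point: O(1) is flat, O(n^2) explodes.
for n in (2, 16, 256, 4096):
    print(f"n={n:>5}  O(1)=1  O(log n)={math.log2(n):.0f}  O(n)={n}  "
          f"O(n log n)={n * math.log2(n):.0f}  O(n^2)={n * n}")
```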

Great. So, that's a quick introduction to complexity theory. It can get a lot more complex, but I won't.

Anyway, up above, I said I found a paper online that describes a way to do a median filter in constant time (S. Perreault and P. Hébert, "Median Filtering in Constant Time"). If you've made it this far, I assume you can see why that would be interesting. Yes?

Clearly the straightforward (a.k.a. "brute force") way of doing a median filter gets unacceptably slow as the filter size increases. And luckily, there are O(n) methods out there to do this, which greatly speed things up - PhotoShop uses one of these, and so does my median filter. But these two guys found a way to do median filtering where a 2001x2001 filter runs just as fast as a 3x3 filter - in the traditional way, that would take 444,889 times as long (which is a little over 5 days, in the example above). In their scheme, it doesn't matter what size filter you ask for, the processing takes the same amount of time. Before, huge filter sizes were prohibitively time-consuming. Now, with their scheme, filter size is irrelevant to processing time. Very cool.

Exactly how they did this is fairly complex, but it is related to the fact that as you move from one pixel to the next you end up looking at many of the same pixels you just looked at, again - and you've already sorted them and the median is probably close to where it was for the last pixel. So, you don't really have to go back and re-visit and re-sort all those source pixels each time, you just maintain some summary information about what you've already seen, and only pay attention to the relatively few pixels that are new, or newly-unused.
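The "summary information" idea can be sketched with a sliding histogram - this is the classic approach (Huang's, not the paper's exact constant-time scheme, which builds on it with per-column histograms), simplified here to one dimension, with the function name my own:

```python
# Sliding-histogram median over one row of 8-bit pixels. Instead of
# re-sorting the whole window at each step, keep a 256-bin histogram
# of the current window: remove the pixel that just left, add the one
# that just entered, and walk the bins to find the median.
def sliding_median_1d(row, radius):
    k = 2 * radius + 1              # window size
    hist = [0] * 256
    for v in row[:k]:               # histogram of the first window
        hist[v] += 1
    out = []
    for i in range(radius, len(row) - radius):
        # find the median: walk the bins until we pass the middle count
        count, median = 0, 0
        for v in range(256):
            count += hist[v]
            if count > k // 2:
                median = v
                break
        out.append(median)
        # slide: drop the leftmost pixel, admit the next one on the right
        if i + radius + 1 < len(row):
            hist[row[i - radius]] -= 1
            hist[row[i + radius + 1]] += 1
    return out
```

Each step only touches the two pixels that changed, plus a bounded walk over the bins - which is the flavor of trick that makes the fast methods fast.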

And that's why I didn't leave the house for three days.

12 thoughts on “The Story of O”

  1. Rob Caldecott

    Version 6 is pretty good. The whole thing went quiet for a few years but it’s active again.

    Fascinating post BTW Cleek.

  2. Rob Caldecott

    Hope my CxImage comment didn’t piss you off mate. With free libraries like this (and FreeImage), it can’t be easy rolling your own toolkit.

  3. cleek

    nah.

    IMO, GDI+ is the one that really killed the low-price image toolkit market. we could compete with the free amateur toolkits by offering support, better performance, more features, stability, etc.. which does your boss like better: a free but complex toolkit you found on the web which has no support, or one that costs him less than an hour of your time and comes with free support and updates ? that’s easy. but you can’t compete with something that’s free and has Microsoft’s name on it.

    our stuff kills GDI+ in terms of advanced image processing functions and straight up performance. but most people don’t care about that. they just want to read and resize JPGs.

  4. russell

    That’s cool.

    This is good stuff. If you are ever interested in busting a move to the Boston area, lemme know. I used to work in some shops with some heavy graphic aspects (weather visualization) and am still in touch with folks doing that and also medical imaging.

    Pretty good biotech industry up here, too.

    The weather sucks for about half the year, but you can’t have everything.

    Just putting it out there.

    Thanks –

  5. cleek

    we love Boston. my wife goes up there for work quite often, actually. her company has a site in Cambridge. the weather doesn’t bother me – and i like a good gray, sloppy winter.

    we’ve looked into moving up there a couple of times, but it’s just so much more expensive than NC. housing is a good 50% higher in the Boston area than it is in Raleigh.

  6. russell

    we’ve looked into moving up there a couple of times, but it’s just so much more expensive than NC. housing is a good 50% higher in the Boston area than it is in Raleigh.

    Tell me about it. :(

    In any case, if you’re ever interested let me know, I’d be happy to make a couple of introductions.

  7. Ugh

    and i like a good gray, sloppy winter

    Not in Boston you won’t (though my 15 months living there were clouded by not being able to find a job, so I may be slightly biased against; better restaurants than DC though).
