Fast log2

Michael Herf

I think that it's really good to have a fast log base 2 approximation.

So I made one.

I use these functions for mipmapping, accumulation buffers, all sorts of things where you need to do a log per pixel. If you call the standard C library, you'll generally end up spending well over 100 cycles doing logs. Especially when you only need integer precision -- if you do:

  (int)(log(x) / log(2))
you'll spend about 200 cycles.

I have a couple of functions that run in "about 4" cycles, which means that it depends a lot on what register use is like when you call them. In well-pipelined situations, you might get 1 cycle.

If you need small amounts fractional precision, you can raise the input to a power before passing it in (using multiplies, not pow!) and then scale the result it gives you.

There are two implementations below, one using the FPU in a portable way (assuming you're on an IEEE floating-point machine.) The second implementation uses the x86's "bsr" instruction in a very non-portable way.

However, most CPUs do have this kind of call at the assembly level. But if you don't want to rewrite the function in assembly on your platform, I recommend simply writing an integer version using the floating-point version of the function. e.g.,

typedef unsigned long uint32;
typedef long int32;

static inline int32 ilog2(uint32 x) {
	return ilog2((float)x);

// integer log2 of a float
static inline int32 ilog2(float x)
	uint32 ix = (uint32&)x;
	uint32 exp = (ix >> 23) & 0xFF;
	int32 log2 = int32(exp) - 127;

	return log2;

static inline int32 ilog2_x86(uint32 x) 
	int32 retval;
	__asm {
		bsr eax, x
		mov retval, eax
	return retval;