Quake 3 was probably the most famous first-person shooter of 1999. It had impressive graphics and very high responsiveness, the result of performance optimization and high-quality code written by the id Software team. One of its most famous optimization tricks is a function that computes an approximation of the inverse (reciprocal) square root through some clever bit hacking. This function is still the subject of investigation by mathematicians and programmers today. In this presentation we try to understand how it works, and we also try to find its author.
12. Fast Approximate Inverse Square Root
float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                       // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );               // what the f☀✿k?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

    return y;
}
13. float Q_rsqrt( float number )
{
     long i;
     float x2, y;
     const float threehalfs = 1.5F;

     x2 = number * 0.5F;
     y  = number;
(1)  i  = * ( long * ) &y;                       // evil floating point bit level hacking
(2)  i  = 0x5f3759df - ( i >> 1 );               // what the f☀✿k?
(1)  y  = * ( float * ) &i;
(3)  y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//   y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

     return y;
}
(1) Interpret float as integer
(2) Good initial guess with magic number 0x5f3759df
(3) One iteration of Newton’s approximation
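The three steps can also be written without the pointer casts. The sketch below is not id Software's code: the function name rsqrt_portable is hypothetical, and it assumes a 32-bit IEEE 754 float. It replaces `long` with `uint32_t` (on many 64-bit platforms `long` is 8 bytes, so the original cast would read past the float) and uses `memcpy` for the reinterpretation, which avoids the aliasing problem of `* ( long * ) &y`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical portable variant of Q_rsqrt; the name is ours, not id Software's.
   Assumes float is a 32-bit IEEE 754 value. */
float rsqrt_portable(float number)
{
    uint32_t i;
    float y = number;
    const float x2 = number * 0.5f;

    memcpy(&i, &y, sizeof i);        /* (1) read the float's bits as an integer */
    i = 0x5f3759dfu - (i >> 1);      /* (2) initial guess from the magic number */
    memcpy(&y, &i, sizeof y);        /* (1) read the integer's bits as a float  */
    y = y * (1.5f - x2 * y * y);     /* (3) one Newton iteration                */
    return y;
}
```

For example, rsqrt_portable(4.0f) returns a value close to 0.5, slightly below it.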
14. (1) Interpret float as integer
32-bit float:
0 | 0 1 1 1 1 1 0 0 | 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sign | E (8 bits) | M (23 bits)
0.15625, which is 1.01 × 2^-3 in binary
E = -3 + 127 = 124, or 01111100 in binary
M = .01
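The same decomposition can be checked in code. This is a minimal sketch, assuming a 32-bit IEEE 754 float; the type float_fields and the function split_float are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three bit fields of a 32-bit float. */
typedef struct {
    uint32_t sign;      /* 1 bit                                          */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127              */
    uint32_t mantissa;  /* 23 bits, fraction without the implicit leading 1 */
} float_fields;

/* Split a float into sign, biased exponent and mantissa bits. */
float_fields split_float(float value)
{
    uint32_t bits;
    float_fields f;

    memcpy(&bits, &value, sizeof bits);   /* reinterpret without casting pointers */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For 0.15625f this yields sign 0, exponent 124 and mantissa 0x200000, which is the bit pattern .01 followed by zeros.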
21. (2) Magic Number: 0x5f3759df
• Gives a good initial guess.
• Nearly minimizes the maximum relative error.
• Trying to find a better number that minimizes the error of the initial guess, we come up with: 0x5f37642f [4]
Did we find a better magic number? ;)
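One common way to see why a shift and a subtraction give a good guess is sketched below. The notation is assumed here and does not come from the slides: $I_x$ is the bit pattern of $x$ read as an integer, $L = 2^{23}$, $B = 127$ is the exponent bias, and $\sigma$ is a small correction for the approximation $\log_2(1 + m) \approx m + \sigma$.

```latex
\begin{aligned}
x &= 2^{e_x}\,(1 + m_x), \qquad 0 \le m_x < 1 \\
\log_2 x &= e_x + \log_2(1 + m_x) \approx e_x + m_x + \sigma \\
I_x &= L\,(e_x + B + m_x) \approx L \log_2 x + L\,(B - \sigma) \\
\log_2 y &= -\tfrac{1}{2} \log_2 x
  \;\Longrightarrow\;
  I_y \approx \tfrac{3}{2}\,L\,(B - \sigma) - \tfrac{1}{2}\,I_x
\end{aligned}
```

Under these assumptions the constant term $\tfrac{3}{2} L (B - \sigma)$ plays the role of the magic number and `i >> 1` supplies $\tfrac{1}{2} I_x$. A value of $\sigma \approx 0.045$ reproduces 0x5f3759df.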
23. (3) One iteration of Newton’s method
Newton’s method:
Given a suitable approximation $y_n$ to the root of $f(y)$, it gives a better one, $y_{n+1}$, using
$$y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)}$$
In our case:
y = y * ( 1.5f - ( 0.5f * x * y * y ) );
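The line of code follows from Newton's update $y_{n+1} = y_n - f(y_n)/f'(y_n)$ once a function is chosen whose root is $1/\sqrt{x}$. The choice $f(y) = y^{-2} - x$ used below is the usual one, stated here as an assumption since the slide does not show it.

```latex
\begin{aligned}
f(y) &= \frac{1}{y^{2}} - x, \qquad f'(y) = -\frac{2}{y^{3}} \\
y_{n+1} &= y_n - \frac{y_n^{-2} - x}{-2\,y_n^{-3}}
         = y_n + \frac{y_n - x\,y_n^{3}}{2} \\
        &= y_n \left( \frac{3}{2} - \frac{x}{2}\,y_n^{2} \right)
\end{aligned}
```

The update needs only multiplications and one subtraction, with no division or square root, which is what makes the refinement step cheap.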
24. (3) One iteration of Newton’s method
After one iteration of Newton’s method, our magic number 0x5f37642f gives a worse approximation than the original magic number 0x5f3759df! [4]
Open Question:
How was the original magic number derived?
25. Open Question:
How was the original magic number 0x5f3759df derived?
• In 2003, Lomont numerically found a slightly better magic number, 0x5f375a86 [4]
• In 2012, Robertson analytically found the same better magic number, 0x5f375a86 [3]
26. How good?
Max relative error: 0.177% [3]
With the 2nd iteration of Newton’s method: 0.00047% [3]
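Error figures of this kind can be estimated by brute force. The sketch below is an illustration under stated assumptions, not the method used in [3] or [4]: the helper names are hypothetical, floats are assumed to be 32-bit IEEE 754, and only every 64th float in [1, 4) is sampled, which is enough because the relative error repeats whenever x is multiplied by 4. To stay free of the math library, the reference value is obtained by refining the guess with Newton steps in double precision.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: approximate 1/sqrt(x) with a given magic constant
   and a given number of Newton iterations (0, 1, 2, ...). */
static float rsqrt_with(float x, uint32_t magic, int iterations)
{
    uint32_t i;
    float y;
    const float x2 = x * 0.5f;

    memcpy(&i, &x, sizeof i);
    i = magic - (i >> 1);
    memcpy(&y, &i, sizeof y);
    while (iterations-- > 0)
        y = y * (1.5f - x2 * y * y);
    return y;
}

/* Reference 1/sqrt(x) in double precision. Starting from the magic-number
   guess, six Newton steps are more than enough to converge. */
static double rsqrt_reference(float x)
{
    double r = (double)rsqrt_with(x, 0x5f3759dfu, 0);
    for (int k = 0; k < 6; ++k)
        r = r * (1.5 - 0.5 * (double)x * r * r);
    return r;
}

/* Largest relative error over the sampled floats in [1, 4). */
double max_relative_error(uint32_t magic, int iterations)
{
    double worst = 0.0;

    /* 0x3f800000 is the bit pattern of 1.0f, 0x40800000 that of 4.0f. */
    for (uint32_t bits = 0x3f800000u; bits < 0x40800000u; bits += 64u) {
        float x;
        memcpy(&x, &bits, sizeof x);
        double exact = rsqrt_reference(x);
        double err = ((double)rsqrt_with(x, magic, iterations) - exact) / exact;
        if (err < 0.0)
            err = -err;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Calling max_relative_error(0x5f3759dfu, 1) and max_relative_error(0x5f37642fu, 1) side by side is one way to compare the magic numbers after a single Newton iteration; passing 0 as the iteration count compares the raw initial guesses instead.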
27. How fast?
In 1999: ???
Today: on CPUs, 3-4 times faster than the straightforward code
With the 2nd iteration of Newton’s method: 2-2.5 times faster [3]
30. Who?
John Carmack?
Lead Programmer of Quake, Doom, Wolfenstein 3D
“...Not me, and I don’t think it is Michael (Abrash). Terje Mathison perhaps?...” [8]
Michael Abrash?
Author of:
Zen of Assembly Language
Zen of Graphics Programming
31. Who?
Terje Mathisen?
Expert in assembly language optimization for x86 microprocessors.
“... I wrote fast & accurate invsqrt()... for a computational fluid chemistry problem... The code is not the same as I wrote...” [8]
34. Who?
Gary Tarolli?
Co-founder of 3dfx (whose assets were later acquired by Nvidia)
“It did pass by my keyboard many many years ago, I may have tweaked the hex constant a bit or so, but other than that I can’t take credit for it, except that I used it a lot and probably contributed to its popularity and longevity.” [8]
This hack is older than 1990!
35. Who?
Cleve Moler (inspiration)
Author of the first MATLAB, one of the founders of MathWorks, and currently its Chief Mathematician. [9]
Greg Walsh (most probably the author)
Has been working on Internet and distributed computing technologies since before it was even the Internet, and helped engineer the first WYSIWYG word processor at Xerox PARC while at Stanford University. [9]
36. Who?
Inspired by Cleve Moler, who knew the trick from code written by Velvel Kahan and K.C. Ng at Berkeley around 1986! [10]
http://www.netlib.org/fdlibm/e_sqrt.c
37. Finally
It is fast: 3-4 times faster than the straightforward code
It is good: 0.17% maximum relative error
It can be improved
It dates back to 1986
39. Some literature here
Quake 1, 3 Architecture
1) Fabien Sanglard, Quake 3 Source Code Review, 2012. http://fabiensanglard.net/quake3/
2) Michael Abrash, Ramblings in Realtime. http://www.bluesnews.com/abrash/
Inverse Square Root
3) Matthew Robertson, A Brief History of InvSqrt, Bachelor’s thesis, University of New Brunswick, 2012
4) Chris Lomont, Fast Inverse Square Root, Purdue University, Indiana, 2003
5) Jim Blinn, Floating-Point Tricks, IEEE Computer Graphics and Applications 17, no. 4, 1997
6) David Eberly, Fast Inverse Square Root (Revisited), Geometric Tools, LLC, 2010
7) Charles McEniry, The Mathematics Behind the Fast Inverse Square Root Function Code, 2007
Investigation of the Authorship
8) Rys Sommefeldt, Origin of Quake3’s Fast InvSqrt(), 2006. http://www.beyond3d.com/content/articles/8/
9) Rys Sommefeldt, Origin of Quake3’s Fast InvSqrt() - Part Two, 2007. http://www.beyond3d.com/content/articles/15/
10) http://blogs.mathworks.com/cleve/2012/06/19/symplectic-spacewar/#comment-13
Additional
11) http://en.wikipedia.org/wiki/Fast_inverse_square_root
12) https://github.com/id-Software/Quake-III-Arena