Author: CL_LC的小屋花_344 | Source: Internet | 2024-12-16 14:16
I'm reading this document: http://software.intel.com/en-us/articles/interactive-ray-tracing
and I stumbled upon these three lines of code:
The SIMD version is already quite a bit faster, but we can do better. Intel has added a fast 1/sqrt(x) function to the SSE2 instruction set. The only drawback is that its precision is limited. We need the precision, so we refine it using Newton-Raphson:
__m128 nr = _mm_rsqrt_ps( x );
__m128 muls = _mm_mul_ps( _mm_mul_ps( x, nr ), nr );
result = _mm_mul_ps( _mm_mul_ps( half, nr ), _mm_sub_ps( three, muls ) );
This code assumes the existence of a __m128 variable named 'half' (four times 0.5f) and a variable 'three' (four times 3.0f).
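For context, here is one way the three quoted lines could be wrapped into compilable functions. This is only a sketch under stated assumptions, not code from the Intel article: the function names rsqrt_estimate4 and rsqrt_refined4 and the array-based interface are hypothetical, chosen for illustration. The intrinsics themselves are the standard SSE ones declared in <xmmintrin.h>. The comments relate each line to the usual Newton-Raphson update for f(y) = 1/y^2 - x, which simplifies to y1 = 0.5 * y0 * (3 - x * y0 * y0).

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_rsqrt_ps, _mm_mul_ps, _mm_sub_ps, ... */

/* Hypothetical helper: raw hardware estimate of 1/sqrt(x) for four floats. */
void rsqrt_estimate4(const float in[4], float out[4])
{
    _mm_storeu_ps(out, _mm_rsqrt_ps(_mm_loadu_ps(in)));
}

/* Hypothetical helper: estimate plus one Newton-Raphson step,
 * mirroring the three quoted lines.
 *
 * With f(y) = 1/y^2 - x and f'(y) = -2/y^3, one step is
 *   y1 = y0 - f(y0)/f'(y0) = 0.5 * y0 * (3 - x * y0 * y0).
 */
void rsqrt_refined4(const float in[4], float out[4])
{
    const __m128 half  = _mm_set1_ps(0.5f);  /* 0.5f in all four lanes */
    const __m128 three = _mm_set1_ps(3.0f);  /* 3.0f in all four lanes */

    __m128 x    = _mm_loadu_ps(in);
    __m128 nr   = _mm_rsqrt_ps(x);                       /* y0: rough 1/sqrt(x) */
    __m128 muls = _mm_mul_ps(_mm_mul_ps(x, nr), nr);     /* x * y0 * y0         */
    __m128 result = _mm_mul_ps(_mm_mul_ps(half, nr),     /* 0.5 * y0            */
                               _mm_sub_ps(three, muls)); /* * (3 - x * y0 * y0) */

    _mm_storeu_ps(out, result);
}
```

Calling rsqrt_refined4 on the inputs {1, 4, 16, 100} should give values close to 1, 0.5, 0.25 and 0.1. Intel documents the raw _mm_rsqrt_ps estimate as having a relative error of at most 1.5 * 2^-12, and a single Newton-Raphson step roughly squares that error, which is why one refinement is usually enough for single precision.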
I know how to use Newton-Raphson to find a zero of a function, and I know how to use it to compute the square root of a number, but I just can't see how this code performs it.
Can someone explain it to me please?
2 answers