Working with DirectXMath and D3DXMath

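In D3DXMath we were able to multiply, add, subtract, and even divide the vector types, which were the D3DXVECTOR2, D3DXVECTOR3, and D3DXVECTOR4 structures. Now, in the DirectXMath incarnation, we have XMFLOAT2, XMFLOAT3, XMFLOAT4, and XMVECTOR. If I want to do any math operation I have to convert from XMFLOAT to XMVECTOR; otherwise Visual Studio reports the error "no user-defined conversion". Why is that? It seems that in the new versions (Windows 8.1, 10) the vector operations of the DirectX math library have changed somewhat. Am I doing something wrong?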

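P.S. There is a separate question about matrices, but for now let's discuss only vectors. These changes are pushing third-party developers to create their own math libraries, and some already have... :)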

This is actually explained in detail in the DirectXMath Programmer's Guide on MSDN:

The XMVECTOR and XMMATRIX types are the work horses for the DirectXMath Library. Every operation consumes or produces data of these types. Working with them is key to using the library. However, since DirectXMath makes use of the SIMD instruction sets, these data types are subject to a number of restrictions. It is critical that you understand these restrictions if you want to make good use of the DirectXMath functions.

You should think of XMVECTOR as a proxy for a SIMD hardware register, and XMMATRIX as a proxy for a logical grouping of four SIMD hardware registers. These types are annotated to indicate they require 16-byte alignment to work correctly. The compiler will automatically place them correctly on the stack when they are used as a local variable, or place them in the data segment when they are used as a global variable. With proper conventions, they can also be passed safely as parameters to a function (see Calling Conventions for details).

Allocations from the heap, however, are more complicated. As such, you need to be careful whenever you use either XMVECTOR or XMMATRIX as a member of a class or structure to be allocated from the heap. On Windows x64, all heap allocations are 16-byte aligned, but for Windows x86, they are only 8-byte aligned. There are options for allocating structures from the heap with 16-byte alignment (see Properly Align Allocations). For C++ programs, you can use operator new/delete/new[]/delete[] overloads (either globally or class-specific) to enforce optimal alignment if desired.

However, often it is easier and more compact to avoid using XMVECTOR or XMMATRIX directly in a class or structure. Instead, make use of the XMFLOAT3, XMFLOAT4, XMFLOAT4X3, XMFLOAT4X4, and so on, as members of your structure. Further, you can use the Vector Loading and Vector Storage functions to move the data efficiently into XMVECTOR or XMMATRIX local variables, perform computations, and store the results. There are also streaming functions (XMVector3TransformStream, XMVector4TransformStream, and so on) that efficiently operate directly on arrays of these data types.
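For the cases where a 16-byte-aligned member really does have to live in a heap-allocated class, the class-specific operator new/delete overloads mentioned in the guide might look roughly like the sketch below. It assumes a C++17 compiler (for `std::align_val_t`); `Particle` is a hypothetical stand-in for a class with an XMVECTOR member, and plain `alignas(16)` float arrays are used so the snippet does not depend on the DirectXMath headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical class that needs 16-byte alignment, as a class holding an
// XMVECTOR member would. The float arrays stand in for the SIMD data.
struct alignas(16) Particle {
    float position[4];
    float velocity[4];

    // Class-specific overloads: always request 16-byte-aligned storage,
    // whatever the default heap alignment of the platform is.
    static void* operator new(std::size_t size) {
        return ::operator new(size, std::align_val_t(16));
    }
    static void operator delete(void* p) noexcept {
        ::operator delete(p, std::align_val_t(16));
    }
    static void* operator new[](std::size_t size) {
        return ::operator new[](size, std::align_val_t(16));
    }
    static void operator delete[](void* p) noexcept {
        ::operator delete[](p, std::align_val_t(16));
    }
};

// True when the address is a multiple of 16 bytes.
inline bool IsAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Usage sketch: both single objects and arrays come back aligned.
inline void DemoAlignedAllocation() {
    Particle* one = new Particle{};
    assert(IsAligned16(one));
    delete one;

    Particle* many = new Particle[4];
    assert(IsAligned16(many));
    delete[] many;
}
```

On toolchains without C++17 aligned new, `_aligned_malloc` and `_aligned_free` are the usual Windows alternative inside the same overloads. Storing XMFLOAT3 or XMFLOAT4 members instead, as the guide suggests, avoids the issue entirely.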

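By design, DirectXMath encourages you to write efficient, SIMD-friendly code. Loading or storing a vector is expensive, so you should try to work in a 'stream' model: load the data, work with it in registers as much as possible, then write out the results.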

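That said, I fully understand that this usage is a little complicated for people new to SIMD math or to DirectX in general, and a bit verbose even for professional developers. That's why I also wrote the SimpleMath wrapper for DirectXMath, which makes it work more like the classic math library you are looking for, in the style of XNA Game Studio: Vector2, Vector3, and Matrix classes, with 'C++ magic' covering up all the explicit loads and stores. SimpleMath types interoperate neatly with DirectXMath, so you can mix and match as needed.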

See also this blog post and GitHub.

DirectXMath is purposely an 'inline' library: in optimized code you shouldn't be passing vector variables around much, but rather computing the value inside your larger function. The D3DXMath library in the deprecated D3DX9, D3DX10, and D3DX11 libraries is more old-school: it relies on function-pointer tables, and its performance is heavily bound by calling-convention overhead.

These of course represent different engineering trade-offs. D3DXMath was able to do more substitution at runtime of specialized processor code paths, but pays for this flexibility with the calling-convention and indirection overhead. DirectXMath, on the other hand, assumes a SIMD baseline of SSE/SSE2 (or AVX on Xbox One) so you avoid the need for runtime detection or indirection and instead aggressively utilize inlining.
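The trade-off can be illustrated with a small sketch. This is not the actual D3DX or DirectXMath implementation; `Vec3`, `MathTable`, `SelectTable`, and the `Dot*` functions are hypothetical names that only show the two dispatch styles side by side.

```cpp
// Hypothetical plain vector type used by both styles.
struct Vec3 { float x, y, z; };

// --- Style 1: runtime dispatch through a function-pointer table ---
using DotFn = float (*)(const Vec3*, const Vec3*);

inline float DotGeneric(const Vec3* a, const Vec3* b) {
    return a->x * b->x + a->y * b->y + a->z * b->z;
}

// Placeholder for a processor-specific path (for example hand-written
// SSE). It simply forwards to the generic version in this sketch.
inline float DotSpecialized(const Vec3* a, const Vec3* b) {
    return DotGeneric(a, b);
}

struct MathTable {
    DotFn dot;  // every call goes through this pointer
};

// Chosen once at startup, e.g. after a CPU feature check.
inline MathTable SelectTable(bool hasFastPath) {
    return MathTable{ hasFastPath ? &DotSpecialized : &DotGeneric };
}

// --- Style 2: header-only inline function with a fixed baseline ---
// The compiler can inline this into the caller, so there is no indirect
// call and no argument passing across a call boundary.
inline float DotInline(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

With the table, the code path can be swapped at runtime, but each call pays for an indirect call the optimizer generally cannot see through. With the inline version, the instruction set is fixed at compile time and the call can disappear into the surrounding function.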