Performance difference between generic and non generic method

Question

For the sake of example, suppose we want to do linear algebra with different kinds of matrices. We have a custom Matrix class that implements:

interface IMatrix
{
    double this[int i, int j] { get; set; }
    int Size { get; }
}

I want to implement matrix multiplication. I was under the impression that these two methods:

static void Multiply<TMatrix>(TMatrix a, TMatrix b, TMatrix result) where TMatrix : IMatrix

and

static void Multiply(Matrix a, Matrix b, Matrix result)

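(with similar implementations, of course) would produce exactly the same IL internally and therefore have the same performance. That is not the case: the first is four times slower than the second. Looking at the IL, the generic version seems to behave like a call through the interface: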

static void Multiply(IMatrix a, IMatrix b, IMatrix result)

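Am I missing something? Is there any way to get the same performance from the generic version as from the direct call?

Installed Framework 4.8, target Framework 4.7.2 (also tested with .NET Core 3)

Method implementation: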

static void Multiply(Matrix a, Matrix b, Matrix result)
{
    for (int i = 0; i < a.Size; i++)
    {
        for (int j = 0; j < a.Size; j++)
        {
            double temp = 0;
            for (int k = 0; k < a.Size; k++)
            {
                temp += a[i, k] * b[k, j];
            }
            result[i, j] = temp;
        }
    }
}

Minimal reproducible example

Answer 1

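.NET generates the code for a generic method only once for all reference types. That shared code has to call through the IMatrix interface, because different implementing types may implement the interface with different methods. So it is simply an interface call.

However, if you make Matrix a struct instead of a class, the JITter generates a type-specific implementation of the generic method for it, and the interface calls can then be optimized away.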
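To illustrate the struct approach, here is a minimal sketch. The names StructMatrix, MatrixOps and Demo are hypothetical and not from the question, and the extra `struct` constraint is my addition to make the intent explicit. Because the struct only wraps a reference to a backing array, copies passed by value still write into the same storage; a struct that held its elements inline would need `ref` or `in` parameters instead.

```csharp
using System;

interface IMatrix
{
    double this[int i, int j] { get; set; }
    int Size { get; }
}

// Hypothetical struct implementation (assumption, not the asker's Matrix class).
// Copies of this struct share the same backing array.
struct StructMatrix : IMatrix
{
    private readonly double[,] _data;
    private readonly int _size;

    public StructMatrix(int size)
    {
        _size = size;
        _data = new double[size, size];
    }

    public int Size { get { return _size; } }

    public double this[int i, int j]
    {
        get { return _data[i, j]; }
        set { _data[i, j] = value; }
    }
}

static class MatrixOps
{
    // For a value-type TMatrix the JIT compiles a separate body per type,
    // so the indexer and Size calls can be bound directly instead of
    // being dispatched through the interface.
    public static void Multiply<TMatrix>(TMatrix a, TMatrix b, TMatrix result)
        where TMatrix : struct, IMatrix
    {
        int n = a.Size;
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < n; j++)
            {
                double temp = 0;
                for (int k = 0; k < n; k++)
                {
                    temp += a[i, k] * b[k, j];
                }
                result[i, j] = temp;
            }
        }
    }
}

static class Demo
{
    // Multiplies a 2x2 matrix by the identity and returns the product.
    public static StructMatrix Run()
    {
        var a = new StructMatrix(2);
        a[0, 0] = 1; a[0, 1] = 2;
        a[1, 0] = 3; a[1, 1] = 4;

        var identity = new StructMatrix(2);
        identity[0, 0] = 1; identity[1, 1] = 1;

        var result = new StructMatrix(2);
        MatrixOps.Multiply(a, identity, result);
        return result;
    }
}
```

Whether this actually closes the gap to the non-generic version depends on the runtime and JIT in use, so it is worth measuring it with the same benchmark as the original two methods.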

Tags: .net, c#, generics, .net-core, .net-4.7.2