Why is it faster to sort JavaScript strings than numbers?

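I was investigating a preconception I had that sorting strings in JavaScript would be slower than sorting integers. This is based on something I read (and now can't find), which appears to have been wrong, stating that JavaScript stores strings as Array<Array<int>> instead of just Array<int>. The MDN documentation seems to contradict this: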

JavaScript's String type is used to represent textual data. It is a set of "elements" of 16-bit unsigned integer values. Each element in the String occupies a position in the String. The first element is at index 0, the next at index 1, and so on. The length of a String is the number of elements in it.

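If we define the "size" of an element (number or string) as the length of its text representation (so size = String(x).length for either a numeric element or a string element), then for two large arrays of elements of the same size (one of numbers, one of strings), I expected sorting the strings to take the same time as, or slightly longer than, sorting the numbers. But when I ran a simple test (code below), the strings turned out to be about twice as fast to sort.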

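I want to know what it is about strings and numbers, and about how JavaScript sorts, that makes string sorting faster than numeric sorting. It's possible that I'm misunderstanding something.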

Results:

~/sandbox > node strings-vs-ints.js 10000 16
Sorting 10000 numbers of magnitude 10^16
Sorting 10000 strings of length 16
Numbers: 18
Strings: 9
~/sandbox > node strings-vs-ints.js 1000000 16
Sorting 1000000 numbers of magnitude 10^16
Sorting 1000000 strings of length 16
Numbers: 3418
Strings: 1529
~/sandbox > node strings-vs-ints.js 1000000 32
Sorting 1000000 numbers of magnitude 10^32
Sorting 1000000 strings of length 32
Numbers: 3634
Strings: 1474

Source:

"use strict";
const CHARSET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghjijklmnopqrstuvwxyz0123456789:.";

function generateString(L) {
    const chars = [];
    while(chars.length < L) {
        chars.push(CHARSET[Math.floor(Math.random() * CHARSET.length)]);
    }
    return chars.join("");
}

function generateNumber(L) {
    return Math.floor(Math.random() * Math.pow(10, (L - 1))) + Math.pow(10, L - 1);
}

function generateList(generator, L, N) {
    const elements = [];
    while(elements.length < N) {
        elements.push(generator.call(null, L));
    }
    return elements;
}

function now() {
    return Date.now();
}

function getTime(baseTime) {
    return now() - baseTime;
}

function main(count, size) {
    console.log(`Sorting ${count} numbers of magnitude 10^${size}`);
    const numbers = generateList(generateNumber, size, count);
    const numBaseTime = now();
    numbers.sort();
    const numTime = getTime(numBaseTime);

    console.log(`Sorting ${count} strings of length ${size}`);
    const strings = generateList(generateString, size, count);
    const strBaseTime = now();
    strings.sort();
    const strTime = getTime(strBaseTime);

    console.log(`Numbers: ${numTime}\nStrings: ${strTime}`);
}

main(process.argv[2], process.argv[3]);

I was investigating a preconception I had that sorting strings in javascript would be slower than sorting integers.

That's true: string comparison is more costly than number comparison.
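To see why, here is a rough sketch of what comparing two strings involves: walking both strings one 16-bit element at a time until they differ. The helper name compareByCodeUnits is made up for illustration; real engines use optimized native code, so this only shows the shape of the work, not how any engine actually implements it.

```javascript
// Hypothetical helper (not a built-in): orders two strings by their
// 16-bit code units, the same ordering the < operator uses for strings.
function compareByCodeUnits(a, b) {
    const shared = Math.min(a.length, b.length);
    for (let i = 0; i < shared; i++) {
        const diff = a.charCodeAt(i) - b.charCodeAt(i);
        if (diff !== 0) {
            return diff; // first differing element decides the order
        }
    }
    // One string is a prefix of the other: the shorter one sorts first.
    return a.length - b.length;
}
```

A number comparison, by contrast, is a single operation on two values, with no loop over elements.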

This is based on something I read which stated that javascript stores strings as Array<Array<int>> instead of just Array<int>. The MDN documentation seems to contradict this.

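Yes, what you read does appear to have been wrong. Strings are just sequences of characters (each character being a 16-bit value), so they're typically stored as arrays of integers, or rather as pointers to them. Your array of strings can indeed be considered an array of arrays.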
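A small sketch of that view, using the standard charCodeAt method to read each 16-bit element; the helper name codeUnits is only illustrative:

```javascript
// Hypothetical helper: returns the 16-bit elements (UTF-16 code units)
// that make up a string, as a plain array of integers.
function codeUnits(str) {
    const units = [];
    for (let i = 0; i < str.length; i++) {
        units.push(str.charCodeAt(i));
    }
    return units;
}

console.log(codeUnits("Abc")); // [ 65, 98, 99 ]
```

How an engine lays strings out in memory is an implementation detail; this only shows the integer sequence that the language exposes.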

When I ran a simple test, it turned out that the strings were about twice as fast to sort.

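The problem with your code is that you are sorting the numbers as strings: with no comparison function, sort() converts each number to a string and then compares those strings. See "How to sort an array of integers correctly". Once you fix that, note that calling a comparison function still carries quite a bit of overhead compared with the built-in string comparison, so if you are really benchmarking the relational operators (<, ==, >) on different types, I would expect numbers to perform better.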
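A minimal sketch of the difference, assuming you want ascending numeric order. The names compareNumbers and timeSort are made up for this example, and timeSort follows the Date.now() approach from your script, so its results are only coarse, millisecond-level estimates.

```javascript
"use strict";

// With no comparison function, elements are converted to strings first.
const asStrings = [10, 9, 1, 100].sort();
console.log(asStrings); // [ 1, 10, 100, 9 ]

// Hypothetical comparison function for ascending numeric order.
function compareNumbers(a, b) {
    return a - b;
}

const asNumbers = [10, 9, 1, 100].sort(compareNumbers);
console.log(asNumbers); // [ 1, 9, 10, 100 ]

// Hypothetical timing helper: sorts a copy of the list and returns the
// elapsed milliseconds. Passing undefined uses the default string ordering.
function timeSort(list, comparator) {
    const copy = list.slice();
    const start = Date.now();
    copy.sort(comparator);
    return Date.now() - start;
}
```

In your main function, the numeric run would then be timed with timeSort(numbers, compareNumbers) and the string run with timeSort(strings), which keeps each array sorted the way its type calls for.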