Convert a tokenized string into integers using JavaScript

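I have looked at this, but since it is in Python, I wanted to ask a similar question. Without using a library, how would I take an array of tokenized strings in the following format: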

[["hi","how","are", "you"], ["how", "are", "you", "doing"]] 

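Given the dictionary shown below, how would I create an array with the same shape as the tokenized array, but with each string replaced by an integer representing its position in the dictionary?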

["how","hi","doing"]

So the output would look like this (positions are 1-based, and 0 marks a word that is not in the dictionary):

[[2,1,0,0],[1,0,0,3]]

I would first convert the second array into an object, so that each lookup takes constant time:

function translate(input, reference) {
    let map = Object.fromEntries(reference.map((ref, i) => [ref, i+1]));
    return input.map(phrase => phrase.map(word => map[word] || 0));
}

// Demo
let res = translate([["hi","how","are","you"], ["how","are","you","doing"]], 
                    ["how","hi","doing"]);
console.log(res);
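
One caveat with a plain object as the lookup table: a token that matches an inherited property name, such as "constructor" or "toString", is found on the prototype and is not mapped to 0. As a minimal sketch of one way around that, assuming a Map is acceptable, the same idea could look like this (translateWithMap, tokens, and dictionary are hypothetical names used only for illustration):

```javascript
// Sketch: same lookup idea, but backed by a Map instead of a plain object.
// translateWithMap is a hypothetical name, not part of the answer above.
function translateWithMap(input, reference) {
    // Map each dictionary word to its 1-based position.
    const positions = new Map(reference.map((ref, i) => [ref, i + 1]));
    // Words missing from the dictionary fall back to 0.
    return input.map(phrase => phrase.map(word => positions.get(word) ?? 0));
}

const tokens = [["hi", "how", "are", "you"], ["how", "are", "you", "doing"]];
const dictionary = ["how", "hi", "doing"];
console.log(translateWithMap(tokens, dictionary)); // [[2, 1, 0, 0], [1, 0, 0, 3]]

// A token that matches an inherited property name is still treated as unknown.
console.log(translateWithMap([["constructor", "hi"]], dictionary)); // [[0, 2]]
```

A Map only holds the keys that were explicitly inserted, so no prototype lookup is involved, and lookups remain constant time on average.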

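Use the map and indexOf methods. indexOf returns -1 for a word that is not in the list, so adding 1 yields 0 for unknown words and a 1-based position otherwise: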

const arr = [
  ["hi", "how", "are", "you"],
  ["how", "are", "you", "doing"],
];

// The lookup list is a JavaScript array, not a dictionary (object)
const keys = ["how", "hi", "doing"];

// indexOf scans keys linearly for every word
const res = arr.map((phrase) => phrase.map((word) => keys.indexOf(word) + 1));

console.log(res);