JavaScript: split a string into 4 pieces and leave the rest as one big piece

Question

I'm building a JavaScript chat bot for something, but I've run into a problem:
I use string.split() to tokenize my input, like this:
tokens = message.split(" ");

My problem is that I need 4 tokens for the command and 1 token for the message. When I do this: !finbot msg testuser 12345 Hello sir, this is a test message

These are the tokens I get: ["!finbot", "msg", "testuser", "12345", "Hello", "sir,", "this", "is", "a", "test", "message"]

But how can I make it come out like this instead: ["!finbot", "msg", "testuser", "12345", "Hello sir, this is a test message"]

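The reason I want it this way is that the first token (token[0]) is the call, the second (token[1]) is the command, the third (token[2]) is the user, the fourth (token[3]) is the password (it's a password-protected message... just for fun), and the fifth (token[4]) is the actual message.
Right now it only sends Hello, because I only use the fifth token.
I can't just do something like message = token[4] + token[5]; because the message isn't always exactly 3 words, or exactly 4 words, and so on.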

I hope I've given you enough information to help me. If you know the answer (or a better way to do this), please let me know.

Thanks!

Answer 1

Use the limit parameter of String.split:

tokens = message.split(" ", 4);
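As a quick check of what the limit argument does (a minimal sketch using the example message from the question): the limit caps the number of entries returned, and the text beyond the fourth token is dropped rather than kept as a remainder, which is why a second step is needed to recover the message.

```javascript
// Example input taken from the question.
const message = "!finbot msg testuser 12345 Hello sir, this is a test message";

// The second argument limits how many entries split() returns;
// everything after the 4th token is discarded, not appended.
const tokens = message.split(" ", 4);

console.log(tokens); // [ '!finbot', 'msg', 'testuser', '12345' ]
```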

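From there, you just need to get the message out of the string. Reusing this answer for its nthIndex() function, you can get the index of the 4th occurrence of the space character and take whatever comes after it.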

var message = message.substring(nthIndex(message, ' ', 4) + 1); // + 1 skips the space itself

Or, if you need it in the tokens array:

tokens[4] = message.substring(nthIndex(message, ' ', 4) + 1); // + 1 skips the space itself
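The linked nthIndex() function is not reproduced here, so the following is only a sketch with a stand-in implementation. The helper body, the `input` and `cut` names, and the choice to return -1 when there are fewer than n occurrences are all assumptions made for illustration, not the code from the linked answer.

```javascript
// Hypothetical stand-in for nthIndex(): index of the n-th occurrence of
// `pattern` in `str`, or -1 if there are fewer than n occurrences.
function nthIndex(str, pattern, n) {
  let index = -1;
  for (let i = 0; i < n; i++) {
    index = str.indexOf(pattern, index + 1);
    if (index === -1) return -1;
  }
  return index;
}

const input = "!finbot msg testuser 12345 Hello sir, this is a test message";
const tokens = input.split(" ", 4);

const cut = nthIndex(input, " ", 4);
// + 1 skips the separating space; with no 4th space the message is empty.
tokens[4] = cut === -1 ? "" : input.substring(cut + 1);

console.log(tokens);
// [ '!finbot', 'msg', 'testuser', '12345', 'Hello sir, this is a test message' ]
```

Without the + 1, the message would start with the separating space, because the index points at the space rather than at the first character after it.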

Answer 2

I would probably take the string first, as you did, and tokenize it:

const myInput = string.split(" ");

If you are using JS ES6, you should be able to do the following:

const [call, command, userName, password, ...messageTokens] = myInput;
const message = messageTokens.join(" ");

However, if you don't have access to the spread (rest) operator, you can do the same thing like this (it's more verbose):

const call = myInput.shift();
const command = myInput.shift();
const userName = myInput.shift();
const password = myInput.shift();
const message = myInput.join(" ");

If you need them as an array again, you can now simply put the parts back together:

const output = [call, command, userName, password, message];
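To see the pieces working together, here is a minimal sketch that wraps the destructuring version in a function. The name `parseCommand` and the object-shaped return value are assumptions for illustration; the sketch also shows what happens when the input has fewer than four tokens.

```javascript
// Hypothetical helper: splits a chat line into four leading tokens
// plus the remaining text as a single message.
function parseCommand(line) {
  const [call, command, userName, password, ...messageTokens] = line.split(" ");
  return { call, command, userName, password, message: messageTokens.join(" ") };
}

const parsed = parseCommand("!finbot msg testuser 12345 Hello sir, this is a test message");
console.log(parsed.message); // Hello sir, this is a test message

// With fewer than four tokens, the missing fields are undefined
// and the message is an empty string.
const short = parseCommand("!finbot msg");
console.log(short.userName, JSON.stringify(short.message)); // undefined ""
```

Because missing fields come back as undefined rather than raising an error, a caller would likely want to check them before acting on the command.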

Answer 3

If you can use ES6, you can do this:

let [c1, c2, c3, c4, ...rest] = input.split(" ");
let msg = rest.join(" ");

Answer 4

Given that you define the format as "4 non-space tokens, separated by a space, followed by the message", you could resort to a regular expression:

function tokenize(msg) {
    return (/^(\S+) (\S+) (\S+) (\S+) (.*)$/.exec(msg) || []).slice(1, 6);
}
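As a usage sketch (the inputs are illustrative), calling it with the question's message and with inputs that have too few parts gives the following:

```javascript
// Same function as above, repeated so the snippet runs on its own.
function tokenize(msg) {
  return (/^(\S+) (\S+) (\S+) (\S+) (.*)$/.exec(msg) || []).slice(1, 6);
}

console.log(tokenize("!finbot msg testuser 12345 Hello sir, this is a test message"));
// [ '!finbot', 'msg', 'testuser', '12345', 'Hello sir, this is a test message' ]

// Too few tokens: the pattern does not match, so the result is empty.
console.log(tokenize("!finbot msg testuser"));
// []

// Four tokens but no space after the fourth: also no match.
console.log(tokenize("!finbot msg testuser 12345"));
// []
```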

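If your msg does not actually match the spec, this leads to the possibly undesired behavior of returning an empty array. If that's not acceptable, remove the ... || [] and handle the failure accordingly. The number of tokens is also fixed at 4 plus the required message. For a more general approach, you can: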

function tokenizer(msg, nTokens) {
    var token = /(\S+)\s*/g, tokens = [], match;

    while (nTokens && (match = token.exec(msg))) {
        tokens.push(match[1]);
        nTokens -= 1; // or nTokens--, whichever is your style
    }

    if (nTokens) {
        // exec() returned null, could not match enough tokens
        throw new Error('EOL when reading tokens');
    }

    tokens.push(msg.slice(token.lastIndex));
    return tokens;
}

This uses the global feature of regexp objects in JavaScript to test against the same string repeatedly, and uses the lastIndex property to slice off the rest after the last matched token.
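To make the role of lastIndex concrete, here is a small standalone sketch (the shortened `msg` value is just an example) that steps through two matches by hand:

```javascript
const token = /(\S+)\s*/g; // the g flag makes exec() resume from lastIndex
const msg = "!finbot msg testuser 12345 Hello sir";

let match = token.exec(msg);
console.log(match[1], token.lastIndex); // !finbot 8

match = token.exec(msg);
console.log(match[1], token.lastIndex); // msg 12

// Everything from lastIndex onward is still unconsumed.
console.log(msg.slice(token.lastIndex)); // testuser 12345 Hello sir
```

Since each match also swallows the whitespace that follows the token, lastIndex always lands on the first character of the next token, so the slice needs no extra trimming at its start.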

Given

var msg = '!finbot msg testuser 12345 Hello sir, this is a test message';

then

> tokenizer(msg, 4)
[ '!finbot',
  'msg',
  'testuser',
  '12345',
  'Hello sir, this is a test message' ]
> tokenizer(msg, 3)
[ '!finbot',
  'msg',
  'testuser',
  '12345 Hello sir, this is a test message' ]
> tokenizer(msg, 2)
[ '!finbot',
  'msg',
  'testuser 12345 Hello sir, this is a test message' ]

Note that an empty string is always appended to the returned array, even if the given message string contains only tokens:

> tokenizer('asdf', 1)
[ 'asdf', '' ]  // An empty "message" at the end
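Since tokenizer() throws when the input runs out of tokens, a caller in a chat bot might prefer to catch that. The following is only a sketch: the wrapper name `tryTokenize` and the choice to return null on failure are assumptions, not part of the answer above.

```javascript
// tokenizer() as defined above, repeated so the snippet runs on its own.
function tokenizer(msg, nTokens) {
  var token = /(\S+)\s*/g, tokens = [], match;
  while (nTokens && (match = token.exec(msg))) {
    tokens.push(match[1]);
    nTokens -= 1;
  }
  if (nTokens) {
    throw new Error('EOL when reading tokens');
  }
  tokens.push(msg.slice(token.lastIndex));
  return tokens;
}

// Hypothetical wrapper: returns null instead of throwing on short input.
function tryTokenize(msg, nTokens) {
  try {
    return tokenizer(msg, nTokens);
  } catch (err) {
    return null;
  }
}

console.log(tryTokenize('!finbot msg testuser 12345 Hi there', 4));
// [ '!finbot', 'msg', 'testuser', '12345', 'Hi there' ]
console.log(tryTokenize('!finbot msg', 4)); // null
```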

javascript

arrays

tokenize