Split string into tokens in C when there are 2 delimiters in a row

I am using the strtok() function to split a string into tokens. The problem arises when there are 2 delimiters in a row.

/* strtok example */
#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] ="Test= 0.28,0.0,1,,1.9,2.2,1.0,,8,4,,,42,,";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok (str,", ");
  while (pch != NULL)
  {
    printf ("Token = %s\n",pch);
    pch = strtok (NULL, ", ");
  }
  return 0;
}

The output is:

Splitting string "Test= 0.28,0.0,1,,1.9,2.2,1.0,,8,4,,,42,," into tokens:
Token = Test=
Token = 0.28
Token = 0.0
Token = 1
Token = 1.9
Token = 2.2
Token = 1.0
Token = 8
Token = 4
Token = 42

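Is there a simple way to get all the tokens, including the empty ones? I need to know whether there is anything between the delimiters, because sometimes I get ,, and sometimes ,xxx,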

Thanks.

strtok() does exactly the opposite of what you want.

From the online manual:

A sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter. Delimiter bytes at the start or end of the string are ignored. Put another way: the tokens returned by strtok() are always nonempty strings.

strtok(3) - Linux man page

I implemented strtoke(), a variant of strtok() that behaves similarly but also returns empty tokens, which is what you want:

/* strtoke example */
#include <stdio.h>
#include <string.h>

/* behaves like strtok() except that it returns empty tokens also
 */
char* strtoke(char *str, const char *delim)
{
  static char *start = NULL; /* stores string str for consecutive calls */
  char *token = NULL; /* found token */
  /* assign new start in case */
  if (str) start = str;
  /* check whether text to parse left */
  if (!start) return NULL;
  /* remember current start as found token */
  token = start;
  /* find next occurrence of delim */
  start = strpbrk(start, delim);
  /* replace delim with terminator and move start to follower */
  if (start) *start++ = '\0';
  /* done */
  return token;
}

int main ()
{
  char str[] ="Test= 0.28,0.0,1,,1.9,2.2,1.0,,8,4,,,42,,";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtoke(str,", ");
  while (pch != NULL)
  {
    printf ("Token = %s\n",pch);
    pch = strtoke(NULL, ", ");
  }
  return 0;
}

Compiled and tested with gcc on cygwin:

$ gcc -o test-strtok test-strtok.c

$ ./test-strtok.exe 
Splitting string "Test= 0.28,0.0,1,,1.9,2.2,1.0,,8,4,,,42,," into tokens:
Token = Test=
Token = 0.28
Token = 0.0
Token = 1
Token = 
Token = 1.9
Token = 2.2
Token = 1.0
Token = 
Token = 8
Token = 4
Token = 
Token = 
Token = 42
Token = 
Token = 

Another quote from the link above:

Be cautious when using these functions. If you do use them, note that:

  • These functions modify their first argument.
  • These functions cannot be used on constant strings.
  • The identity of the delimiting byte is lost.
  • The strtok() function uses a static buffer while parsing, so it's not thread safe. Use strtok_r() if this matters to you.

These issues also apply to my strtoke().
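
If the static buffer is a concern, one possible way around it is to let the caller hold the scan position, the same way strtok_r() does. The following is only a sketch under that assumption: the name strtoke_r and the saveptr parameter are made up here by analogy with strtok_r() and are not part of any library. It still modifies its first argument and still loses the identity of the delimiter.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reentrant variant of strtoke(): behaves the same, but the
 * scan position is kept in *saveptr (owned by the caller) instead of in a
 * static variable. Pass the string on the first call and NULL afterwards.
 */
char *strtoke_r(char *str, const char *delim, char **saveptr)
{
  char *token = NULL; /* found token */
  /* first call: start scanning at str */
  if (str) *saveptr = str;
  /* check whether text to parse is left */
  if (!*saveptr) return NULL;
  /* remember current position as found token */
  token = *saveptr;
  /* find next occurrence of any delimiter (NULL if this is the last token) */
  *saveptr = strpbrk(token, delim);
  /* replace delimiter with terminator and move position to follower */
  if (*saveptr) *(*saveptr)++ = '\0';
  /* done */
  return token;
}
```

It is used like strtoke(), with one extra char * variable per string being parsed: call strtoke_r(str, ", ", &save) once, then strtoke_r(NULL, ", ", &save) until it returns NULL. Because each caller has its own save pointer, two strings can be tokenized in an interleaved fashion or from different threads.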