Pointer moving inside a string
I am studying this code, but I don't understand how the pointer moves inside buffer:
...
while(fgets(buffer,buf_size,fp) != NULL){
    read_line_p = malloc((strlen(buffer)+1)*sizeof(char));
    strcpy(read_line_p,buffer);
    char *string_field_in_read_line_p = strtok(read_line_p,",");
    char *integer_field_in_read_line_p = strtok(NULL,",");
    char *string_field_1 = malloc((strlen(string_field_in_read_line_p)+1)*sizeof(char));
    char *string_field_2 = malloc((strlen(string_field_in_read_line_p)+1)*sizeof(char));
    strcpy(string_field_1,string_field_in_read_line_p);
    strcpy(string_field_2,string_field_in_read_line_p);
    int integer_field = atoi(integer_field_in_read_line_p);
    struct record *record_p = malloc(sizeof(struct record));
    record_p->string_field = string_field_1;
    record_p->integer_field = integer_field;
    ordered_array_add(array, (void*)record_p);
    free(read_line_p);
}
...
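This is what the source code does: it reads millions of records from a .csv file. Each record consists of a string and an integer separated by a comma, and each record is on its own line. Every record is added as a single element to a generic array that we have to sort. The generic array is represented by: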
typedef struct {
    void** array;
    unsigned long el_num; //index
    unsigned long array_capacity; //length
    int (*precedes)(void*,void*); //precedence relation (name of a function in main that denotes which field we're comparing)
}OrderedArray;
In this struct we have a precedes function, which tells us whether the array must be sorted by the string field or by the integer field.
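As an illustration only (this is not the code from the project), the two precedence relations could look like the sketch below. The struct record layout and both function names are assumptions; only the field names string_field and integer_field appear in the code above.

```c
#include <string.h>

/* Assumed layout: the loop above only shows these two fields being set. */
struct record {
    char *string_field;
    int integer_field;
};

/* Hypothetical relation: 1 if r1 comes before r2 by string field, else 0. */
int precedes_record_string_field(void *r1, void *r2)
{
    struct record *a = r1;
    struct record *b = r2;
    return strcmp(a->string_field, b->string_field) < 0;
}

/* Hypothetical relation: 1 if r1 comes before r2 by integer field, else 0. */
int precedes_record_int_field(void *r1, void *r2)
{
    struct record *a = r1;
    struct record *b = r2;
    return a->integer_field < b->integer_field;
}
```

Either function has the int (*)(void*,void*) signature that the precedes member expects, so the choice of sort key comes down to which one is stored in the struct.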
Example records in the csv file:
first word,10
second word,9
third word,8
etc.
So, on each execution of ordered_array_add, we insert a new element into the array.
Focusing on ordered_array_add:
void ordered_array_add(OrderedArray *ordered_array, void* element){
    if(element == NULL){
        fprintf(stderr,"add_ordered_array_element: element parameter cannot be NULL");
        exit(EXIT_FAILURE);
    }
    if(ordered_array->el_num >= ordered_array->array_capacity){
        ordered_array->array = realloc(ordered_array->array,2*(ordered_array->array_capacity)*sizeof(void*));
        if(ordered_array->array == NULL){
            fprintf(stderr,"ordered_array_add: unable to reallocate memory to host the new element");
            exit(EXIT_FAILURE);
        }
        ordered_array->array_capacity = 2*ordered_array->array_capacity;
    }
    unsigned long index = get_index_to_insert(ordered_array, element);
    insert_element(ordered_array,element,index);
    (ordered_array->el_num)++;
}
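I don't understand how the first loop scans the string buffer, since I don't see any index in that loop.
I wrote code similar to the first loop I posted. The problem is that it stops after reading the first word from buffer, whereas the code I'm studying successfully reads the whole string: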
while(fgets(buffer,buf_size,fp) != NULL) {
    char *word = strtok(buffer, " ,.:");
    add(words_to_correct, word);
    words_to_correct->el_num = words_to_correct->el_num+1;
    printf("%s\n", word);
}
Your entire first loop can be rewritten as:
while(fgets(buffer,buf_size,fp) != NULL){
    // note how sizeof() is used - that way if the type of
    // record_p changes, no changes to this code are needed
    struct record *record_p = malloc(sizeof(*record_p));
    // no need at all for temporary copies of the strings
    record_p->string_field = strdup(strtok(buffer,","));
    record_p->integer_field = atoi(strtok(NULL,","));
    ordered_array_add(array, (void*)record_p);
}
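There is no need for multiple calls to malloc() and strcpy() - the pair can be replaced with strdup(), which is both POSIX-standard and supported on Windows, so it is very widely available.
Of course, that code needs error checking, and it shouldn't be using atoi() at all, but as posted here it duplicates your original functionality, with the added benefit that you can actually tell what's going on.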
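A minimal sketch of what that error checking could look like, using strtol() in place of atoi(). The helper name parse_record and the struct record layout are assumptions made for illustration; only the two field names come from the question. The sketch also treats '\r' and '\n' as delimiters so the line ending left by fgets() does not end up in the number.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() on strict compilers */
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout, matching the two fields used in the question. */
struct record {
    char *string_field;
    int integer_field;
};

/*
 * Hypothetical helper: parses one "text,number" line in place.
 * Returns a newly allocated record, or NULL if the line is malformed
 * or memory runs out. The caller frees string_field and the record.
 */
struct record *parse_record(char *line)
{
    char *text = strtok(line, ",");
    char *number = strtok(NULL, ",\r\n");
    if (text == NULL || number == NULL) {
        return NULL;                    /* a field is missing */
    }

    char *end;
    errno = 0;
    long value = strtol(number, &end, 10);
    if (end == number || *end != '\0' || errno == ERANGE ||
        value < INT_MIN || value > INT_MAX) {
        return NULL;                    /* not a valid int */
    }

    struct record *record_p = malloc(sizeof(*record_p));
    if (record_p == NULL) {
        return NULL;
    }

    record_p->string_field = strdup(text);
    if (record_p->string_field == NULL) {
        free(record_p);
        return NULL;
    }
    record_p->integer_field = (int)value;
    return record_p;
}
```

Inside the fgets() loop this would be called as parse_record(buffer), and the record passed to ordered_array_add() only when the result is not NULL. Unlike atoi(), strtol() lets the caller tell "abc" apart from a real 0 and detect out-of-range values.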
Your code
while(fgets(buffer,buf_size,fp) != NULL) {
    char *word = strtok(buffer, " ,.:");
    add(words_to_correct, word);
    words_to_correct->el_num = words_to_correct->el_num+1;
    printf("%s\n", word);
}
only processes the first word in each line - you need to keep calling strtok() until it returns NULL, because each call to strtok() returns only one token:
while(fgets(buffer,buf_size,fp) != NULL) {
    // trick to keep loop simple - start by using
    // buffer on the first loop iteration, then
    // set tmp to NULL so later iterations work too
    char *tmp = buffer;
    // loop until strtok() returns null
    for ( ;; )
    {
        // note use of tmp here
        char *word = strtok(tmp, " ,.:");
        // line is fully parsed - break this loop
        // and get the next line to parse
        if (word == NULL)
        {
            break;
        }
        // now set tmp to NULL so next strtok()
        // gets a NULL first parameter
        tmp = NULL;
        add(words_to_correct, word);
        words_to_correct->el_num = words_to_correct->el_num+1;
        printf("%s\n", word);
    }
}
Also note that I'm spreading things out, rather than trying to cram as much code as possible into each line. That's usually easier to read.
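As for why no index is visible in the loop: strtok() keeps its own private pointer between calls. A call with a non-NULL first argument starts scanning at that string, overwrites the delimiter that ends the token with '\0', and remembers where to resume. A call with NULL continues from the remembered position. The sketch below is a simplified re-implementation for illustration only; my_strtok is a hypothetical name and this is not the actual library source.

```c
#include <stddef.h>
#include <string.h>

/* Simplified illustration of strtok(); NOT the real library source. */
char *my_strtok(char *str, const char *delims)
{
    /* This static pointer is the hidden "index" into the buffer. */
    static char *next = NULL;

    if (str != NULL) {
        next = str;                     /* new string: start at its beginning */
    }
    if (next == NULL) {
        return NULL;                    /* nothing left from earlier calls */
    }

    next += strspn(next, delims);       /* skip leading delimiters */
    if (*next == '\0') {
        next = NULL;
        return NULL;                    /* only delimiters remained */
    }

    char *token = next;
    next += strcspn(next, delims);      /* advance to the end of the token */
    if (*next != '\0') {
        *next = '\0';                   /* terminate the token in place */
        next++;                         /* resume after it next time */
    } else {
        next = NULL;                    /* reached the end of the string */
    }
    return token;
}
```

With the line "first word,10" and the delimiter ",", the first call returns a pointer to the start of the line and turns the comma into '\0', the second call (with NULL) returns a pointer to "10", and a third call returns NULL. Because the string being scanned is modified, the code in the question tokenizes a copy of buffer rather than buffer itself, although that copy is not strictly required there.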