Convert large block of binary into text in C

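I have a project in which I am supposed to read a file through the getchar() function and convert the binary digits in it into text.

Here is my code; it only produces the correct ASCII code for one byte at a time. I don't know how to read the binary values of a whole text file and convert them: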

#include <stdio.h>
#include <string.h>

typedef unsigned char byte;
typedef unsigned int uint;

int strbin_to_dec(const char *);

int main(void) {
  const char *wbin = "01001001";

  printf("%s to ascii %d.\n", wbin, strbin_to_dec(wbin));
  printf("The character is %c\n", strbin_to_dec(wbin));
  return 0;
}

int strbin_to_dec(const char * str) {
  uint result = 0;
  for (int i = strlen(str) - 1, j = 0; i >= 0; i--, j++) {
    byte k = str[i] - '0';
    k <<= j;
    result += k;
  }
  return result;
}

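The code above works when I put the binary value of a single character in the variable 'wbin', but I can't adapt it to accept input from getchar(), because getchar returns an int. The code above produces: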
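On the getchar() point: it returns int so that EOF (a negative value) can be told apart from every valid character, so the result is best kept in an int and compared with EOF before it is treated as a character. A minimal sketch of such a read loop, assuming the input holds only '0'/'1' digits plus whitespace. The name decode_stream and its FILE * parameters are made up for illustration (getchar() is equivalent to fgetc(stdin)):

```c
#include <stdio.h>

/* Hypothetical helper: reads '0'/'1' characters from `in` and writes one
 * decoded byte to `out` for every 8 bits. Returns the number of bytes
 * written. Any other character (newline, space, ...) is skipped. */
size_t decode_stream(FILE *in, FILE *out)
{
    int ch;                 /* int, not char, so EOF can be detected */
    unsigned value = 0;     /* bits collected so far, first bit highest */
    int nbits = 0;
    size_t written = 0;

    while ((ch = fgetc(in)) != EOF) {
        if (ch != '0' && ch != '1')
            continue;
        value = (value << 1) | (unsigned)(ch - '0');
        if (++nbits == 8) {
            fputc((int)value, out);
            written++;
            value = 0;
            nbits = 0;
        }
    }
    return written;         /* an incomplete trailing group is dropped */
}
```

Calling decode_stream(stdin, stdout) from main and running the program as ./a.out < data.txt would then decode the whole file.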

01001001 to ascii 73.
The character is I

The file I want to translate looks like this:

0010001001001000011011110111011100100000011011110110011001110100011001010110111000100000011010000110000101110110011001010010000001001001001000000111001101100001011010010110010000100000011101000110111100100000011110010110111101110101
011101000110100001100001011101000010000001110111011010000110010101101110001000000111100101101111011101010010000001101000011000010111011001100101001000000110010101101100011010010110110101101001011011100110000101110100011001010110010000100000011101000110100001100101001000000110100101101101011100000110111101110011011100110110100101100010011011000110010100101100
01110111011010000110000101110100011001010111011001100101011100100010000001110010011001010110110101100001011010010110111001110011001011000010000001101000011011110111011101100101011101100110010101110010001000000110100101101101011100000111001001101111011000100110000101100010011011000110010100101100
01101101011101010111001101110100001000000110001001100101001000000111010001101000011001010010000001110100011100100111010101110100011010000011111100100010
0010110101010011011010010111001000100000010000010111001001110100011010000111010101110010001000000100001101101111011011100110000101101110001000000100010001101111011110010110110001100101001011000010000001010100011010000110010100100000010100110110100101100111011011100010000001001111011001100010000001000110011011110111010101110010

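Read all the input from the file and pass 8 characters at a time to a converter function that returns one character. Each char is 8 bits, and each character in the file represents 1 bit.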

char string_to_character(char * in)
{
  char ret = 0;
  int i;
  for(i = 7; i >= 0; i--)
    if(in[i] == '1')
      ret += 1 << (7 - i);

  return ret;
}

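This function decodes each group of 8 characters from the file into one character. Just call it over the whole input string at offsets that are multiples of 8 characters and store the results somewhere.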

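Edit: make sure to include math.h and link the math library (only needed if pow is used; the version shown here uses a left shift). Edit 2: the loop condition should be >= rather than >.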
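As a worked example of the decoding step: the group 01001000 has its 1 bits at positions 6 and 3 (counting from 0 at the right), so its value is 64 + 8 = 72, which is 'H' in ASCII. An equivalent way to write the conversion, sketched here under the assumption that the group holds exactly eight '0'/'1' characters, is to shift an accumulator left once per character. The name bits8_to_byte is illustrative only:

```c
/* Hypothetical variant of string_to_character: builds the value
 * most-significant bit first. Assumes in[0..7] are '0' or '1'. */
unsigned char bits8_to_byte(const char *in)
{
    unsigned char value = 0;
    for (int i = 0; i < 8; i++)
        value = (unsigned char)((value << 1) | (in[i] == '1'));
    return value;
}
```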

The loop...

Assume you have the whole file as one long string.

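/* input: NUL-terminated string of '0'/'1' characters; needs <string.h> */
size_t num_chars = strlen(input) / 8;
size_t i;
char output[num_chars + 1];
for(i = 0; i < num_chars; i++)
   output[i] = string_to_character(input + (i * 8));
output[num_chars] = '\0';  /* terminate before printing with %s */

printf("%s", output);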
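The loop above assumes the input is a single string of digits. The sample file contains line breaks, which would shift every following 8-character group, so they have to be dropped while reading. One possible way to build that string, as a sketch: read_bits is a hypothetical helper, and it treats everything other than '0' and '1' as a separator to skip.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads all of `in`, keeps only '0' and '1', and
 * returns them as a NUL-terminated heap string (caller frees).
 * Returns NULL if memory runs out. */
char *read_bits(FILE *in)
{
    size_t cap = 256, len = 0;
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(in)) != EOF) {
        if (ch != '0' && ch != '1')
            continue;                    /* skip newlines and other noise */
        if (len + 1 == cap) {            /* keep one slot for the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    buf[len] = '\0';
    return buf;
}
```

The returned string can be passed as input to the loop above and released with free() afterwards.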

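This is the result I got from the program:

"How often have I said to youthat when you have eliminated the impossible,whatever remains, however improbable,must be the truth?"-Sir Arthur Conan Doyle, The Sign Of Four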

Edit: left shift instead of pow.

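Here is a function I wrote. I don't know whether you pass the file as a command-line argument, as in ./text.exe -f binary.txt, but I did not add argument handling to the program; the file name is hard-coded.

I also wrote a function that writes the result to a file, but if you want to use a command like ./text.exe -f binary.txt > translatedfile.txt, you can simply remove the write_to_file function. Don't forget to remove the printf calls you don't need, because the ">" redirection captures everything written to standard output.

Code: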

#define _POSIX_C_SOURCE 200809L /* getline() and ssize_t are POSIX, not ISO C */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void binary_to_char(char *str);
void write_to_file(char *text);

int main(void)
{

    printf("Getting line from file\n");

    FILE *file;
    char *line = NULL;
    size_t len = 0;
    ssize_t stringLength;

    file = fopen("binary.txt", "r");
    if (file == NULL)
    {
        perror("[ERROR]: cannot open file -- binary.txt");
        exit(1);
    }

    while ((stringLength = getline(&line, &len, file)) != -1)
    {
        printf("\n%s", line);
        binary_to_char(line);
    }
    free(line);
    fclose(file);

    return 0;
}

void binary_to_char(char *str)
{
    char binary[9];
    size_t count = strlen(str) / 8;   /* whole 8-digit groups; a trailing '\n' is ignored */
    char *text = malloc(count + 1);
    int pos = 0;

    if (text == NULL)
        return;
    printf("\nConverting into characters\n");
    for (size_t j = 0; j < count; j++)
    {
        for (int i = 0; i < 8; i++)
        {
            binary[i] = str[pos];
            pos++;
        }
        binary[8] = '\0';             /* strtol needs a terminated string */
        text[j] = (char)strtol(binary, NULL, 2);
    }
    text[count] = '\0';               /* terminate before printing with %s */
    printf("\n%s\n", text);
    write_to_file(text);
    free(text);
}

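void write_to_file(char *text)
{
    /* "w+" truncates the file, so each call replaces the previous content */
    FILE *fp = fopen("translatedfile.txt", "w+");
    if (fp == NULL)
    {
        perror("translatedfile.txt");
        return;
    }
    fprintf(fp, "%s", text);
    fclose(fp);
    printf("\nContent saved to translatedfile.txt\n");
}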
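If the file name is meant to come from the command line, as in the ./text.exe -f binary.txt example, the hard-coded name could be replaced with a small lookup in argv. A minimal sketch: the -f option is taken from the example command, and the helper name input_path is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns the argument that follows "-f",
 * or NULL when the option is missing or has no value. */
const char *input_path(int argc, char *argv[])
{
    for (int i = 1; i + 1 < argc; i++)
        if (strcmp(argv[i], "-f") == 0)
            return argv[i + 1];
    return NULL;
}
```

main would then be declared as int main(int argc, char *argv[]) and pass the result to fopen, falling back to binary.txt when it is NULL.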

Contents of binary.txt:

001000100100100001101111011101110010000001101111011001100111010001100101011011100010000001101000011000010111011001100101001000000100100100100000011100110110000101101001011001000010000001110100011011110010000001111001011011110111010101110100011010000110000101110100001000000111011101101000011001010110111000100000011110010110111101110101001000000110100001100001011101100110010100100000011001010110110001101001011011010110100101101110011000010111010001100101011001000010000001110100011010000110010100100000011010010110110101110000011011110111001101110011011010010110001001101100011001010010110001110111011010000110000101110100011001010111011001100101011100100010000001110010011001010110110101100001011010010110111001110011001011000010000001101000011011110111011101100101011101100110010101110010001000000110100101101101011100000111001001101111011000100110000101100010011011000110010100101100011011010111010101110011011101000010000001100010011001010010000001110100011010000110010100100000011101000111001001110101011101000110100000111111001000100010110101010011011010010111001000100000010000010111001001110100011010000111010101110010001000000100001101101111011011100110000101101110001000000100010001101111011110010110110001100101001011000010000001010100011010000110010100100000010100110110100101100111011011100010000001001111011001100010000001000110011011110111010101110010

This is a simple task that can be done with bit shifts alone. Also, for better performance, the code below uses fread instead of getchar.

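This implementation uses minimal RAM (no malloc), avoids slow string parsing and math functions such as strlen, strtol and pow, and can handle a stream of any size or length, including truncated streams whose digit count is not a multiple of 8.

Usage: ./a.out < data.txt > out.txt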

#include <stdio.h>

int main(void)
{
  unsigned char byte = 0;
  int bits = 0;

  for(;;)
  {
    char buffer[1024];
    size_t len = fread(buffer, 1, sizeof(buffer), stdin);

    // if there was a read error or EOF, stop
    if (len == 0)
      break;

    for(size_t i = 0; i < len; ++i)
    {
      switch(buffer[i])
      {
        // if a binary 1, turn on bit zero
        case '1':
          byte |= 1;
          break;

        // if a binary 0, do nothing
        case '0':
          break;

        // if anything else (e.g. a newline), skip it
        default:
          continue;
      }

      // increment the counter; if we don't yet have 8 bits,
      // shift all the bits left by one
      if (++bits < 8)
        byte <<= 1;
      else
      {
        // write out the complete byte
        fwrite(&byte, 1, 1, stdout);

        // reset for the next byte
        bits = 0;
        byte = 0;
      }
    }
  }

  // write out any remaining data if the input was not a multiple of 8 in length,
  // padding the missing low bits with zeros
  if (bits)
  {
    byte <<= 7 - bits;
    fwrite(&byte, 1, 1, stdout);
  }

  return 0;
}

Input:

0010001001001000011011110111011100100000011011110110011001110100011001010110111000100000011010000110000101110110011001010010000001001001001000000111001101100001011010010110010000100000011101000110111100100000011110010110111101110101
011101000110100001100001011101000010000001110111011010000110010101101110001000000111100101101111011101010010000001101000011000010111011001100101001000000110010101101100011010010110110101101001011011100110000101110100011001010110010000100000011101000110100001100101001000000110100101101101011100000110111101110011011100110110100101100010011011000110010100101100
01110111011010000110000101110100011001010111011001100101011100100010000001110010011001010110110101100001011010010110111001110011001011000010000001101000011011110111011101100101011101100110010101110010001000000110100101101101011100000111001001101111011000100110000101100010011011000110010100101100
01101101011101010111001101110100001000000110001001100101001000000111010001101000011001010010000001110100011100100111010101110100011010000011111100100010
0010110101010011011010010111001000100000010000010111001001110100011010000111010101110010001000000100001101101111011011100110000101101110001000000100010001101111011110010110110001100101001011000010000001010100011010000110010100100000010100110110100101100111011011100010000001001111011001100010000001000110011011110111010101110010

Output:

"How often have I said to youthat when you have eliminated the impossible,whatever remains, however improbable,must be the truth?"-Sir Arthur Conan Doyle, The Sign Of Four