Convert large block of binary into text in C
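I have a project where I am supposed to take in a file through the getchar() function and convert the binary characters in it to text.
This is my code, which only produces the correct ASCII code for one character at a time. I don't know how to read in the binary values of a whole text file and convert them: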
#include <stdio.h>
#include <string.h>

typedef unsigned char byte;
typedef unsigned int uint;

int strbin_to_dec(const char *);

int main(void) {
    char * wbin = "01001001";
    int c = 0;
    printf("%s to ascii %d.\n", wbin, strbin_to_dec(wbin));
    printf("The character is %c", strbin_to_dec(wbin));
    return 0;
}

int strbin_to_dec(const char * str) {
    uint result = 0;
    for (int i = strlen(str) - 1, j = 0; i >= 0; i--, j++) {
        byte k = str[i] - '0';
        k <<= j;
        result += k;
    }
    return result;
}
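The code above works when I put the binary value of a single character into the variable 'wbin', but I can't adapt it to accept input from getchar(), because getchar returns an int. The code above produces this result: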
01001001 to ascii 73.
The character is I
The file I want to translate looks like this:
0010001001001000011011110111011100100000011011110110011001110100011001010110111000100000011010000110000101110110011001010010000001001001001000000111001101100001011010010110010000100000011101000110111100100000011110010110111101110101
011101000110100001100001011101000010000001110111011010000110010101101110001000000111100101101111011101010010000001101000011000010111011001100101001000000110010101101100011010010110110101101001011011100110000101110100011001010110010000100000011101000110100001100101001000000110100101101101011100000110111101110011011100110110100101100010011011000110010100101100
01110111011010000110000101110100011001010111011001100101011100100010000001110010011001010110110101100001011010010110111001110011001011000010000001101000011011110111011101100101011101100110010101110010001000000110100101101101011100000111001001101111011000100110000101100010011011000110010100101100
01101101011101010111001101110100001000000110001001100101001000000111010001101000011001010010000001110100011100100111010101110100011010000011111100100010
0010110101010011011010010111001000100000010000010111001001110100011010000111010101110010001000000100001101101111011011100110000101101110001000000100010001101111011110010110110001100101001011000010000001010100011010000110010100100000010100110110100101100111011011100010000001001111011001100010000001000110011011110111010101110010
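Read all the input from the file and pass 8 characters at a time to a converter function that returns one character. Each char is 8 bits, and each character in the file represents 1 bit.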
char string_to_character(char * in)
{
    char ret = 0;
    int i;
    for(i = 7; i >= 0; i--)
        if(in[i] == '1')
            ret += 1 << (7 - i);
    return ret;
}
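This function decodes each group of 8 characters from the file into one character. Just call it on the whole input string at offsets of 8 characters and save the results somewhere.
EDIT: Make sure to include and link the math library (math.h).
EDIT 2: It should be >= rather than >.
The loop...
Assuming you have the whole file as one long string.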
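/* strlen (from <string.h>) rather than sizeof: input may be a pointer, not an array */
size_t num_chars = strlen(input) / 8;
size_t i;
char output[num_chars + 1];
for(i = 0; i < num_chars; i++)
    output[i] = string_to_character(input + (i * 8));
output[num_chars] = '\0'; /* terminate the string before printing it with %s */
printf("%s", output);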
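This is the result I got from the program:
"How often have I said to youthat when you have eliminated the impossible,whatever remains, however improbable,must be the truth?"-Sir Arthur Conan Doyle, The Sign Of Four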
EDIT: left shift instead of pow
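Here is the function I made! I don't know whether you pass the file as a command-line argument, like ./text.exe -f binary.txt, so I did not add argument handling to the program; the file name is hard-coded.
I also created a function that writes to a file, but if you want to use a command like ./text.exe -f binary.txt > translatedfile.txt, you can simply remove the write_to_file function. Don't forget to remove the unneeded prints, because the ">" redirection captures everything written to stdout.
Code: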
#define _POSIX_C_SOURCE 200809L /* for getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void binary_to_char(const char *str);
void write_to_file(const char *text);

int main(void)
{
    printf("Getting line from file\n");
    FILE *file;
    char *line = NULL;
    size_t len = 0;
    ssize_t stringLength;

    file = fopen("binary.txt", "r");
    if (file == NULL)
    {
        perror("[ERROR]: cannot open file -- binary.txt");
        exit(1);
    }
    while ((stringLength = getline(&line, &len, file)) != -1)
    {
        printf("\n%s", line);
        binary_to_char(line);
    }
    free(line);
    fclose(file);
    return 0;
}

void binary_to_char(const char *str)
{
    char binary[9];
    /* number of whole 8-digit groups; a trailing newline is dropped by the division */
    size_t count = strlen(str) / 8;
    char *text = malloc(count + 1);
    size_t pos = 0;

    if (text == NULL)
    {
        perror("[ERROR]: malloc");
        exit(1);
    }
    printf("\nConverting into characters\n");
    for (size_t j = 0; j < count; j++)
    {
        for (int i = 0; i < 8; i++)
        {
            binary[i] = str[pos];
            pos++;
        }
        binary[8] = '\0'; /* strtol needs a terminated string */
        text[j] = (char)strtol(binary, NULL, 2);
    }
    text[count] = '\0'; /* terminate before printing with %s */
    printf("\n%s\n", text);
    write_to_file(text);
    free(text);
}

/* Overwrites translatedfile.txt on every call, which is fine for a single-line binary.txt. */
void write_to_file(const char *text)
{
    FILE *fp = fopen("translatedfile.txt", "w");
    if (fp == NULL)
    {
        perror("[ERROR]: cannot open file -- translatedfile.txt");
        return;
    }
    fprintf(fp, "%s", text);
    fclose(fp);
    printf("\nContent saved to translatedfile.txt\n");
}
Contents of binary.txt:
001000100100100001101111011101110010000001101111011001100111010001100101011011100010000001101000011000010111011001100101001000000100100100100000011100110110000101101001011001000010000001110100011011110010000001111001011011110111010101110100011010000110000101110100001000000111011101101000011001010110111000100000011110010110111101110101001000000110100001100001011101100110010100100000011001010110110001101001011011010110100101101110011000010111010001100101011001000010000001110100011010000110010100100000011010010110110101110000011011110111001101110011011010010110001001101100011001010010110001110111011010000110000101110100011001010111011001100101011100100010000001110010011001010110110101100001011010010110111001110011001011000010000001101000011011110111011101100101011101100110010101110010001000000110100101101101011100000111001001101111011000100110000101100010011011000110010100101100011011010111010101110011011101000010000001100010011001010010000001110100011010000110010100100000011101000111001001110101011101000110100000111111001000100010110101010011011010010111001000100000010000010111001001110100011010000111010101110010001000000100001101101111011011100110000101101110001000000100010001101111011110010110110001100101001011000010000001010100011010000110010100100000010100110110100101100111011011100010000001001111011001100010000001000110011011110111010101110010
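This is a simple task that can be done using only bit shifts. Also, for performance, the code below uses fread instead of getchar.
This implementation uses minimal RAM (no malloc) and no slow string parsing or math functions such as strlen, strtol, or pow. It can handle a stream of any size or length, including truncated streams whose length is not a multiple of 8 characters.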
Usage:
./a.out < data.txt > out.txt
#include <stdio.h>

int main(void)
{
    unsigned char byte = 0;
    int bits = 0;

    for(;;)
    {
        char buffer[1024];
        size_t len = fread(buffer, 1, sizeof(buffer), stdin);

        // if there was a read error or EOF, stop
        if (len == 0)
            break;

        for(size_t i = 0; i < len; ++i)
        {
            switch(buffer[i])
            {
                // if a binary 1, turn on bit zero
                case '1':
                    byte |= 1;
                    break;

                // if a binary 0, do nothing
                case '0':
                    break;

                // if anything else, skip
                default:
                    continue;
            }

            // increment the counter; if we don't yet have 8 bits,
            // shift all the bits left by one
            if (++bits < 8)
                byte <<= 1;
            else
            {
                // write out the complete byte
                fwrite(&byte, 1, 1, stdout);

                // reset for the next byte
                bits = 0;
                byte = 0;
            }
        }
    }

    // write out any remaining data if the input was not a multiple of 8 in length.
    if (bits)
        fwrite(&byte, 1, 1, stdout);

    return 0;
}
Input:
0010001001001000011011110111011100100000011011110110011001110100011001010110111000100000011010000110000101110110011001010010000001001001001000000111001101100001011010010110010000100000011101000110111100100000011110010110111101110101
011101000110100001100001011101000010000001110111011010000110010101101110001000000111100101101111011101010010000001101000011000010111011001100101001000000110010101101100011010010110110101101001011011100110000101110100011001010110010000100000011101000110100001100101001000000110100101101101011100000110111101110011011100110110100101100010011011000110010100101100
01110111011010000110000101110100011001010111011001100101011100100010000001110010011001010110110101100001011010010110111001110011001011000010000001101000011011110111011101100101011101100110010101110010001000000110100101101101011100000111001001101111011000100110000101100010011011000110010100101100
01101101011101010111001101110100001000000110001001100101001000000111010001101000011001010010000001110100011100100111010101110100011010000011111100100010
0010110101010011011010010111001000100000010000010111001001110100011010000111010101110010001000000100001101101111011011100110000101101110001000000100010001101111011110010110110001100101001011000010000001010100011010000110010100100000010100110110100101100111011011100010000001001111011001100010000001000110011011110111010101110010
Output:
"How often have I said to youthat when you have eliminated the impossible,whatever remains, however improbable,must be the truth?"-Sir Arthur Conan Doyle, The Sign Of Four