How do I read integers from a big-endian binary file if Windows/Delphi/IDE implies little-endian order?

Question

I'm confused. I need to read a binary file (.fsa extension by Applied Biotechnology, aka ABIF, FASTA files), and I ran into a problem reading signed integers. I am doing everything according to this manual: https://drive.google.com/file/d/1zL-r6eoTzFIeYDwH5L8nux2lIlsRx3CK/view?usp=sharing So, for example, let's look at the fDataSize field in the header of the file https://drive.google.com/file/d/1rrL01B_gzgBw28knvFit6hUIA5jcCDry/view?usp=sharing

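I know it should be 2688 (a signed 32-bit integer according to the manual), which in binary is 00000000 00000000 00001010 10000000. Indeed, when I read these 32 bits as an array of 4 bytes, I get [0, 0, 10, -128], which is exactly the same as the binary form.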

However, if I read it as an Integer, the result is 16809994, which is the bits 00000001 00000000 10000000 00001010.

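As I learned from several forums, people use the Swap and htonl functions to convert integers from little-endian to big-endian order. They also recommend the BSWAP EAX instruction for 32-bit integers. But in this case they work in the wrong way, specifically: Swap, applied to 16809994, returns 16779904, or 00000001 00000000 00001010 10000000, and the BSWAP instruction converts 16809994 to 176160769, i.e. 00001010 10000000 00000000 00000001.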

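As we can see, the built-in functions do something different from what I need. Swap would probably return the correct result, but for some reason reading these bits as an Integer changes the left-most byte. So what is wrong, and what should I do?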

Update 1. To store the header data I use the following record:

type
  TFasMainHeader = record
    fFrmt        : array[1..4]  of ansiChar;
    fVersion     : Word;
    fDir         : array[1..4] of ansiChar;
    fNumber      : array[1..4]  of Byte; //
    fElType      : Word;
    fElSize      : Word;
    fNumEls      : array[1..4]  of Byte; //
    fDataSize    : Integer;
    fDataOffset  : Integer;
    fDO : word;
    fDataHandle  : array[1..98]  of Byte;
  end;

Then, on a button click, I do the following:

aFileStream.Read(fas_main_header, SizeOf(TFasMainHeader));
with fas_main_header do begin
    if fFrmt <> 'ABIF' then raise Exception.Create('Not an ABIF file!');
    fVersion := Swap(fVersion);
    fElType := Swap(fElType);
    fElSize := Swap(fElSize);
...

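Next I need to swap the Int32 variables properly, but at this point fDataSize, for example, is 16809994. See the record state in detail while debugging:

This makes no sense to me, because there should be no one-bit there in the binary representation of the fDataSize value (it also breaks the BSWAP result).

Look at the binary structure at the beginning of the file (fDataSize bytes highlighted):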

Answer 1

Here is an example implementation using pure Pascal:

program FasDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Classes;

type
  TFasInt16 = packed record
    B0, B1 : Byte;
    function ToUInt32 : UInt32;
    function ToInt32  : Int32;
    class operator Implicit(A: TFasInt16): Integer;      // Implicit conversion of TFasInt16 to Integer
    class operator Implicit(A: Integer)  : TFasInt16;    // Implicit conversion of Integer   to TFasInt16
  end;
  TFasInt32 = packed record
    W0, W1 : TFasInt16;
    function ToUInt32 : UInt32;
    function ToInt32  : Int32;
    class operator Implicit(A: TFasInt32): Integer;      // Implicit conversion of TFasInt32 to Integer
    class operator Implicit(A: Integer)  : TFasInt32;    // Implicit conversion of Integer to TFasInt32
  end;


function TFasInt16.ToUInt32: UInt32;
begin
  Result := (B0 shl 8) + B1;
end;

function TFasInt16.ToInt32: Int32;
begin
  Result := Int16(B0 shl 8) + B1;
end;

class operator TFasInt16.Implicit(A: Integer): TFasInt16;
begin
  Result.B1 := Byte(A);
  Result.B0 := Byte(A shr 8);
end;

class operator TFasInt16.Implicit(A: TFasInt16): Integer;
begin
  Result := A.ToInt32;
end;

function TFasInt32.ToUInt32: UInt32;
begin
  Result := (W0.ToUInt32 shl 16) + W1.ToUInt32;
end;

function TFasInt32.ToInt32: Int32;
begin
  Result := Int32(ToUInt32);  // reinterpret the unsigned bits as signed (safe even with range checking on)
end;

class operator TFasInt32.Implicit(A: TFasInt32): Integer;
begin
  Result := A.ToInt32;
end;

class operator TFasInt32.Implicit(A: Integer): TFasInt32;
begin
  Result.W1 := Int16(A);
  Result.W0 := Int16(A shr 16);
end;

var
  Stream   : TFileStream;
  FasInt32 : TFasInt32;
  FasInt16 : TFasInt16;
  AInteger : Integer;
begin
  Stream := TFileStream.Create('C:\Users\fpiette\Downloads\A02-RD12-0002-35-0.5PP16-001.5sec.fsa', fmOpenRead);
  try
    Stream.Position := ;
    Stream.Read(FasInt32, SizeOf(FasInt32));
    WriteLn(FasInt32.W1.ToUInt32, ' 0x', IntToHex(FasInt32.W1.ToUInt32, 8));
    WriteLn(FasInt32.W1.ToInt32,  ' 0x', IntToHex(FasInt32.W1.ToInt32,  8));
    WriteLn(FasInt32.ToUInt32,    ' 0x', IntToHex(FasInt32.ToUInt32,    8));
    WriteLn(FasInt32.ToInt32,     ' 0x', IntToHex(FasInt32.ToInt32,     8));

    WriteLn;
    WriteLn('Test implicit conversion 16 bits to integer ');
    AInteger := FasInt32.W1;
    WriteLn(AInteger,             ' 0x', IntToHex(AInteger,     8));

    WriteLn;
    WriteLn('Test implicit conversion 32 bits to integer ');
    AInteger := FasInt32;
    WriteLn(AInteger,             ' 0x', IntToHex(AInteger,     8));

    WriteLn;
    WriteLn('Test implicit conversion 16 bits from integer');
    FasInt16 := 1234;
    WriteLn(FasInt16.ToInt32,     ' 0x', IntToHex(FasInt16.ToInt32,  8));
    FasInt16 := -1234;
    WriteLn(FasInt16.ToInt32,     ' 0x', IntToHex(FasInt16.ToInt32,  8));

    WriteLn;
    WriteLn('Test implicit conversion 32 bits from integer');
    FasInt32 := 12345678;
    WriteLn(FasInt32.ToInt32,     ' 0x', IntToHex(FasInt32.ToInt32,     8));
    FasInt32 := -12345678;
    WriteLn(FasInt32.ToInt32,     ' 0x', IntToHex(FasInt32.ToInt32,     8));

    ReadLn;
  finally
    FreeAndNil(Stream);
  end;
end.

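If your Delphi version supports it, you can add the inline directive.

I used operator overloading to implement implicit conversion to/from Integer. With it, the types can be used without calling conversion routines: the compiler does the work for us!

Of course, other operator overloads can be added; you get the idea.

To access the FAS header and other structures, you can use the types TFasInt32 and TFasInt16 instead of Word and Integer. The rest of your code can simply ignore that the data is big-endian! The compiler automatically converts back and forth to native (little-endian) integers.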

Answer 2

The problem has nothing to do with endianness, but with Delphi records.

You have

type
  TFasMainHeader = record
    fFrmt        : array[1..4]  of ansiChar;
    fVersion     : Word;
    fDir         : array[1..4] of ansiChar;
    fNumber      : array[1..4]  of Byte; //
    fElType      : Word;
    fElSize      : Word;
    fNumEls      : array[1..4]  of Byte; //
    fDataSize    : Integer;
    fDataOffset  : Integer;
    fDO : word;
    fDataHandle  : array[1..98]  of Byte;
  end;

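and you want this record to overlay the bytes in the file, with fDataSize sitting "on top of" 00 00 0A 80.

But the Delphi compiler adds padding between the fields of a record to make them properly aligned. Hence, your fDataSize will not be at the correct offset.

To fix this, use the packed keyword: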

type
  TFasMainHeader = packed record
    fFrmt        : array[1..4]  of ansiChar;
    fVersion     : Word;
    fDir         : array[1..4] of ansiChar;
    fNumber      : array[1..4]  of Byte; //
    fElType      : Word;
    fElSize      : Word;
    fNumEls      : array[1..4]  of Byte; //
    fDataSize    : Integer;
    fDataOffset  : Integer;
    fDO : word;
    fDataHandle  : array[1..98]  of Byte;
  end;

Then the fields will be at the expected positions.
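To see the padding directly, a small console sketch along the following lines prints the offset of fDataSize in both layouts. The type and variable names (TPlainHeader, TPackedHeader, PlainHdr, PackedHdr) are hypothetical and only the leading fields of the header are reproduced. The packed offset is 4+2+4+4+2+2+4 = 22; the aligned offset depends on the compiler's alignment setting.

```pascal
program HeaderOffsets;
{$APPTYPE CONSOLE}

type
  // Leading fields of the header from the question, default (aligned) layout.
  TPlainHeader = record
    fFrmt     : array[1..4] of AnsiChar;
    fVersion  : Word;
    fDir      : array[1..4] of AnsiChar;
    fNumber   : array[1..4] of Byte;
    fElType   : Word;
    fElSize   : Word;
    fNumEls   : array[1..4] of Byte;
    fDataSize : Integer;
  end;

  // Same fields, but packed: no padding is inserted between fields.
  TPackedHeader = packed record
    fFrmt     : array[1..4] of AnsiChar;
    fVersion  : Word;
    fDir      : array[1..4] of AnsiChar;
    fNumber   : array[1..4] of Byte;
    fElType   : Word;
    fElSize   : Word;
    fNumEls   : array[1..4] of Byte;
    fDataSize : Integer;
  end;

var
  PlainHdr  : TPlainHeader;
  PackedHdr : TPackedHeader;
begin
  // Offset of a field = address of the field minus address of the record.
  WriteLn('aligned fDataSize offset: ',
    PAnsiChar(@PlainHdr.fDataSize) - PAnsiChar(@PlainHdr));
  WriteLn('packed  fDataSize offset: ',
    PAnsiChar(@PackedHdr.fDataSize) - PAnsiChar(@PackedHdr));
end.
```

With Delphi's default alignment the aligned record should report 24 and the packed one 22. That two-byte shift would explain the value in the question: 16809994 is $0100800A, i.e. the bytes 0A 80 of fDataSize followed by what are presumably the first two bytes of fDataOffset.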

Then, of course, you can use any method you like to swap the byte order.

Preferably the BSWAP instruction.
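If inline assembly is not wanted, the same byte reversal can be sketched in plain Pascal. Swap32 below is a hypothetical helper name, not a built-in routine, and the input value assumes the packed record, so that the four bytes 00 00 0A 80 were loaded as one little-endian 32-bit value.

```pascal
program Swap32Demo;
{$APPTYPE CONSOLE}

// Hypothetical helper: reverses the byte order of a 32-bit value,
// which is what BSWAP does on a 32-bit register.
function Swap32(Value: Cardinal): Cardinal;
begin
  Result := ((Value and $000000FF) shl 24)
         or ((Value and $0000FF00) shl 8)
         or ((Value and $00FF0000) shr 8)
         or ((Value and $FF000000) shr 24);
end;

var
  Raw: Cardinal;
begin
  // File bytes 00 00 0A 80 read into a 32-bit variable on a
  // little-endian CPU give $800A0000.
  Raw := $800A0000;
  // Cast back to Integer because the manual describes the field as signed.
  WriteLn(Integer(Swap32(Raw)));  // prints 2688
end.
```

The helper works on Cardinal so the shifts never involve a sign bit; casting the result to Integer afterwards recovers negative values for signed fields.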

delphi

delphi-7

endianness