Convert Pentium II timing code into inline assembly?
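I am trying to use the following code with GCC. It throws errors (I guess because of the __asm block). Why does this simple, easy format not work in GCC? The syntax of extended assembly is provided here. I get confused when more variables are used in inline assembly. Could someone convert the following program into the appropriate form and explain what is needed wherever variables are used?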
int time, subtime;
float x = 5.0f;
__asm {
    cpuid
    rdtsc
    mov subtime, eax
    cpuid
    rdtsc
    sub eax, subtime
    mov subtime, eax    // Only the last value of subtime is kept
                        // subtime should now represent the overhead cost of the
                        // MOV and CPUID instructions
    fld x
    fld x
    cpuid               // Serialize execution
    rdtsc               // Read time stamp to EAX
    mov time, eax
    fdiv                // Perform division
    cpuid               // Serialize again for time-stamp read
    rdtsc
    sub eax, time       // Find the difference
    mov time, eax
}
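gcc, icc, and Visual C all have very different inline assembly syntax (inline assembly is not part of the C standard). GCC's is a bit more complicated, but also more efficient, because you tell the compiler which registers are used for what and which registers are clobbered (overwritten).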
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
https://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
http://asm.sourceforge.net/articles/rmiyagi-inline-asm.txt
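To make the "tell the compiler which registers are used and clobbered" part concrete, here is a minimal sketch, assuming an x86 target and GCC or Clang. The helper name serialized_tsc_low is made up for illustration and is not taken from the question.

```c
#include <stdint.h>

/* Hypothetical helper (x86 only): serialize with CPUID, then return the
 * low 32 bits of the time-stamp counter. */
static inline uint32_t serialized_tsc_low(void)
{
    uint32_t lo;
    __asm__ __volatile__ (
        "xorl %%eax, %%eax\n\t"   /* give CPUID a known leaf (0) */
        "cpuid\n\t"               /* serializing instruction */
        "rdtsc"                   /* EDX:EAX = time-stamp counter */
        : "=a" (lo)               /* output: lo is delivered in EAX */
        :                         /* no inputs */
        : "ebx", "ecx", "edx");   /* overwritten by CPUID/RDTSC, not operands */
    return lo;
}
```

The three colon-separated sections are outputs, inputs, and clobbers. Anything the instructions overwrite that is not listed as an operand has to go in the clobber list, otherwise the compiler may keep a live value in that register across the asm statement.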
My gcc assembler is a bit rusty (I played with it a few years ago), so there may be some errors in there.
int main(int argc, char *argv[])
{
    int time = 0, subtime = 100, temptime = 0;
    const float x = 5.0f;
    asm volatile (
        "xorl %%eax, %%eax \n" /* make sure eax is a known value before cpuid */
        "cpuid \n"
        "rdtsc \n"
        "movl %%eax, %[aSubtime] \n"
        "cpuid \n"
        "rdtsc \n"
        "subl %[aSubtime], %%eax \n"
        "movl %%eax, %[aSubtime] \n" /* only the last value of subtime is kept */
        // subtime should now represent the overhead cost of the
        // MOV and CPUID instructions
        "flds %[ax] \n" /* explicit 's' suffix: x is a 32-bit float in memory */
        "flds %[ax] \n"
        "cpuid \n" // Serialize execution
        "rdtsc \n" // Read time stamp to EAX
        "movl %%eax, %[atime] \n"
        "fdivp \n" // Perform division
        "cpuid \n" // Serialize again for time-stamp read
        "rdtsc \n"
        "subl %[atime], %%eax \n"
        // A final "movl %%eax, ..." is not needed, since we tell the compiler that the asm exits with time in eax
        "fstp %%st(0) \n" /* pop the quotient so the FPU stack is left balanced */
        : "=&a" (time),               /* time is output in eax (written early, hence &) */
          [aSubtime] "=m" (subtime),  /* written by the asm, so these are outputs */
          [atime] "=m" (temptime)     /* scratch slot for the start time stamp */
        : [ax] "m" (x)
        : "ebx", "ecx", "edx"
    );
    return 0;
}
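Your question is really a code-conversion question, which is usually off-topic for Stack Overflow. However, an answer may be beneficial to other readers.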
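This code is a conversion of the original source material and is not meant as an enhancement. The two FLD instructions and the FDIV/FDIVP could be reduced to a single FLD plus FDIV/FDIVP, because you are dividing a floating-point value by itself. As Peter Cordes points out, you could use FLD1 to load the value 1.0 onto the top of the stack. This works because dividing any number by itself (other than 0.0) should take the same time as dividing 5.0 by itself. It would also eliminate the need to pass the variable x into the assembler template.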
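A rough sketch of the FLD1 idea, assuming an x86 target and GCC or Clang. The function name time_fdiv_one is hypothetical, and the result still includes the CPUID/RDTSC overhead, which this sketch does not subtract.

```c
#include <stdint.h>

/* Hypothetical helper (x86 only): low 32 bits of the TSC delta around a
 * single FDIVP of 1.0 by 1.0. No memory input is needed thanks to FLD1. */
static inline uint32_t time_fdiv_one(void)
{
    uint32_t start, elapsed;
    __asm__ __volatile__ (
        "fld1\n\t"                    /* st(0) = 1.0 */
        "fld1\n\t"                    /* st(0) = 1.0, st(1) = 1.0 */
        "xorl %%eax, %%eax\n\t"
        "cpuid\n\t"                   /* serialize execution */
        "rdtsc\n\t"                   /* read time stamp to EAX */
        "movl %%eax, %[start]\n\t"
        "fdivp\n\t"                   /* divide and pop: one value left */
        "xorl %%eax, %%eax\n\t"
        "cpuid\n\t"                   /* serialize again */
        "rdtsc\n\t"
        "subl %[start], %%eax\n\t"    /* EAX = elapsed cycles (mod 2^32) */
        "fstp %%st(0)"                /* pop the quotient, stack balanced */
        : "=&a" (elapsed),            /* written before the asm finishes */
          [start] "=&r" (start)       /* scratch register chosen by GCC */
        :
        : "ebx", "ecx", "edx", "cc", "st", "st(1)");
    return elapsed;
}
```

The "st" and "st(1)" clobbers tell the compiler that the asm temporarily needs two x87 stack slots. The early-clobber (&) modifiers are there because both outputs are written partway through the instruction sequence.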
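The code you are using is a variation of something documented by Intel 20 years ago for the Pentium II. That document discusses what is going on for that processor. The difference is that the code you are using doesn't do the warm-up described in that document. I don't think this mechanism will work overly well on modern processors and operating systems (be warned).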
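I am not quoting the warm-up routine from the Intel document here. As a sketch of the general idea only, one common approach is to run the measurement a few times so caches are populated, and keep just the last value. All names below (measure_fn, measure_warm, fake_measure) are made up for illustration, and fake_measure is a stand-in for a real timing routine.

```c
#include <stdint.h>

/* Hypothetical signature for any routine that returns a cycle count. */
typedef uint32_t (*measure_fn)(void);

/* Run `measure` warmup_runs times and discard those results, then run it
 * once more and return that final value. */
static uint32_t measure_warm(measure_fn measure, int warmup_runs)
{
    uint32_t result = 0;
    for (int i = 0; i <= warmup_runs; i++)
        result = measure();   /* earlier results are overwritten */
    return result;
}

/* Stand-in measurement used only to show the calling convention:
 * it returns 1, 2, 3, ... on successive calls. */
static uint32_t fake_measure(void)
{
    static uint32_t calls = 0;
    return ++calls;
}
```

With a real timing function in place of fake_measure, the call would look like measure_warm(my_timer, 2), which performs three runs in total and reports the third.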
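The code in question is meant to measure the time it takes for a single FDIV instruction to complete. Assuming you actually want to convert this specific code, you will have to use GCC extended assembler templates. Extended assembler templates are not easy to use for first-time GCC developers. For assembler code you might even consider putting the code into a separate assembly file, assembling it separately, and then calling it from C.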
Assembler templates use input constraints and output constraints to pass data into and out of the template (unlike MSVC). They also use a clobber list to specify registers that may have been altered but don't appear as an input or output. By default GCC inline assembly uses AT&T syntax rather than Intel syntax.
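As a small illustration of constraints and AT&T operand order, here is a sketch assuming x86 with GCC or Clang. The function add_asm is a made-up example and has nothing to do with the timing code.

```c
/* Hypothetical example (x86 only): add b to a using named operands. */
static inline int add_asm(int a, int b)
{
    __asm__ (
        "addl %[src], %[dst]"   /* AT&T order: source first, destination last */
        : [dst] "+r" (a)        /* "+r": register that is both read and written */
        : [src] "r" (b)         /* "r": read-only input in any register */
        : "cc");                /* ADD modifies the flags */
    return a;
}
```

In Intel syntax the same instruction would be written "add dst, src". The %[name] placeholders are replaced with whatever registers the compiler picks, which is why register names written literally in the template need the doubled %% prefix.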
The equivalent code using extended assembler with AT&T syntax would look like this:
#include <stdio.h>
int main()
{
    int time, subtime;
    float x = 5.0f;
    int temptime;
    __asm__ __volatile__ (
        "rdtsc\n\t"
        "mov %%eax, %[subtime]\n\t"
        "cpuid\n\t"
        "rdtsc\n\t"
        "sub %[subtime], %%eax\n\t"
        "mov %%eax, %[subtime]\n\t"
        /* Only the last value of subtime is kept
         * subtime should now represent the overhead cost of the
         * MOV and CPUID instructions */
        "flds %[x]\n\t"
        "flds %[x]\n\t"               /* Alternatively use fst to make copy */
        "cpuid\n\t"                   /* Serialize execution */
        "rdtsc\n\t"                   /* Read time stamp to EAX */
        "mov %%eax, %[temptime]\n\t"
        "fdivp\n\t"                   /* Perform division */
        "cpuid\n\t"                   /* Serialize again for time-stamp read */
        "rdtsc\n\t"
        "sub %[temptime], %%eax\n\t"
        "fstp %%st(0)\n\t"            /* Need to clear FPU stack before returning */
        : [time]"=&a"(time),          /* 'time' is returned via the EAX register */
          [subtime]"=&r"(subtime),    /* return reg for subtime */
          [temptime]"=&r"(temptime)   /* Temporary reg for computation.
                                         This allows the compiler to choose
                                         a register for temporary use. Both are
                                         register-only so the subtime and temptime
                                         calculations use mov reg, reg. The '&'
                                         (early clobber) is needed because these
                                         are written before x is read. */
        : [x]"m"(x)                   /* X is a MEMORY reference (required by FLD) */
        : "ebx", "ecx", "edx");       /* Registers clobbered by CPUID
                                         but not listed as input/output
                                         operands */
    time = time - subtime;            /* Subtract the overhead */
    printf ("%d\n", time);            /* Print total time of divide to screen */
    return 0;
}