Keep target address of load in register until instruction is retired
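I would like to use Precise Event-Based Sampling (PEBS) to record all the addresses of specific events (cache misses, for instance) on a Xeon E5 Sandy Bridge.
However, the Performance Analysis Guide for Intel® Core™ i7 Processor and Intel® Xeon™ 5500 processors, p. 24, contains the following warning: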
As the PEBS mechanism captures the values of the register at completion of the instruction, the dereferenced address for the following type of load instruction (Intel asm convention) cannot be reconstructed.
MOV RAX, [RAX+const]
This kind of instruction is mostly associated with pointer chasing
mystruc = mystruc->next;
This is a significant shortcoming of this approach to capturing memory instruction addresses.
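According to objdump, my program contains a number of load instructions of this form. Is there any way to avoid them?
Since this is an Intel-specific problem, the solution does not have to be portable in any way; it just has to work. My code is written in C, and ideally I am looking for a compiler-level solution (gcc or icc), but any suggestions are welcome.
Some examples: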
mov 0x18(%rdi),%rdi
mov (%rcx,%rax,8),%rax
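In both of these cases, after the instruction retires (and hence when I look at the register values to figure out where I loaded to/from), the value of the address (%rdi + 0x18 and %rcx + 8 * %rax, respectively, in these examples) has been overwritten by the result of the mov.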
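The only way I can currently think of is to use the & (ampersand) assembler constraint. This means I would have to go through my code wherever such an instruction occurs and replace every pointer dereference of the form mystruc = mystruc->next; with: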
asm volatile("mov (%1),%0" : "=&r" (mystruc) : "r" (&(mystruc->next)))
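However, this is a very cumbersome approach, and in some cases the access may be more complicated than a pointer in a struct. I understand that what I am doing is basically increasing register pressure, which the compiler actively tries to avoid. Is there any other way?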
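For reference, the inline-asm workaround described above could be wrapped in a small helper so that call sites stay readable. This is only a sketch under stated assumptions: struct node, load_next and list_sum are made-up names for illustration, the extra "m" input is an addition so the compiler knows the asm reads memory, and builds that are not x86-64 GNU C simply fall back to a plain load.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, for illustration only. */
struct node {
    struct node *next;
    int value;
};

/* Load p->next through an early-clobber ("=&r") output, so the register
 * holding the address cannot also be the destination of the mov. */
static inline struct node *load_next(struct node *p)
{
    struct node *next;

    assert(p != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ volatile("mov (%1),%0"
                     : "=&r"(next)
                     : "r"(&p->next), "m"(p->next));
#else
    next = p->next; /* portable fallback: no guarantee about registers */
#endif
    return next;
}

/* Pointer-chasing loop that uses the helper instead of p = p->next. */
static int list_sum(struct node *head)
{
    int sum = 0;

    for (struct node *p = head; p != NULL; p = load_next(p))
        sum += p->value;
    return sum;
}
```

The constraint only keeps the two registers distinct for this one instruction, which is the moment PEBS captures the register state; it still has to be applied by hand at every load of interest.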
What you want to do is convert all instructions of the form:
mov (%rcx,%rax,8),%rax
into:
mov (%rcx,%rax,8),%r11
mov %r11,%rax
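This is most easily done by modifying the compiler-generated assembler source. Below is a perl script that does all the necessary conversions by reading and modifying the .s file.
Just change the build to produce .s files instead of .o files, apply the script, and then produce the .o with either as or your compiler (e.g. cc -c foo.s, as in the script's usage notes).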
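Here is the actual script. I have tested it on a few of my own sources, following the build process described in the comment block below.
The script has the following features:
- scans for and locates all function definitions
- identifies all registers used in a given function
- locates all return points of the function
- selects a temporary register based on the function's register usage (i.e. it uses a temporary register that is not already used by the function)
- replaces all "troublesome" instructions with a two-instruction sequence
- tries unused temporary registers (e.g. %r11 or unused argument registers) before trying callee-saved registers
- if the selected register is callee-saved, adds a push to the function prolog and a pop before each of the function's [possibly multiple] ret statements
- maintains a log of all analysis and transformations and appends it as comments to the output .s file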
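The register-selection policy in the list above can be sketched roughly as follows. This is an illustration in C rather than the script's own perl; the candidate order is copied from the script's regtmpall(), while pick_temp, struct candidate and the has_stack_args flag are hypothetical names. It assumes the caller has already reduced every register mentioned in the function to its 64-bit base name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum reg_class { REG_SCRATCH, REG_ARG, REG_CALLEE_SAVED };

struct candidate {
    const char *name;
    enum reg_class cls;
};

/* Preference order: scratch registers, then argument registers,
 * then callee-saved registers (which need a push/pop). */
static const struct candidate candidates[] = {
    {"%r11", REG_SCRATCH}, {"%r10", REG_SCRATCH},
    {"%r9", REG_ARG}, {"%r8", REG_ARG}, {"%rcx", REG_ARG},
    {"%rdx", REG_ARG}, {"%rsi", REG_ARG}, {"%rdi", REG_ARG},
    {"%r15", REG_CALLEE_SAVED}, {"%r14", REG_CALLEE_SAVED},
    {"%r13", REG_CALLEE_SAVED}, {"%r12", REG_CALLEE_SAVED},
};

static int is_used(const char *reg, const char *const *used, size_t nused)
{
    for (size_t i = 0; i < nused; i++)
        if (strcmp(reg, used[i]) == 0)
            return 1;
    return 0;
}

/* Return the first candidate the function does not use, or NULL.
 * Callee-saved registers are rejected when the function reads stack
 * arguments, because the injected push would shift their offsets. */
static const struct candidate *pick_temp(const char *const *used,
                                         size_t nused, int has_stack_args)
{
    assert(used != NULL || nused == 0);
    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        const struct candidate *c = &candidates[i];

        if (is_used(c->name, used, nused))
            continue;
        if (c->cls == REG_CALLEE_SAVED && has_stack_args)
            continue;
        return c;
    }
    return NULL;
}
```

A NULL result corresponds to the case where the script reports that no fixup register is available for the function.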
#!/usr/bin/perl
# pebsfix/pebsfixup -- fix assembler source for PEBS usage
#
# command line options:
# "-a" -- use only full 64 bit targets
# "-l" -- do _not_ use lea
# "-D[diff-file]" -- show differences (default output: "./DIFF")
# "-n10" -- do _not_ use register %r10 for temporary (default is use it)
# "-o" -- overwrite input files (can be multiple)
# "-O<outfile>" -- output file (only one .s input allowed)
# "-q" -- suppress warnings
# "-T[lvl]" -- debug trace
#
# "-o" and "-O" are mutually exclusive
#
# command line script test options:
# "-N[TPA]" -- disable temp register types [for testing]
# "-P" -- force push/pop on all functions
#
# command line arguments:
# 1-- list of .s files to process [or directory to search]
# for a given file "foo.s", output is to "foo.TMP"
# if (-o is given, "foo.TMP" is renamed to "foo.s")
#
# suggested usage:
# change build to produce .s files
# FROM:
# cc [options] -c foo.c
# TO:
# cc [options] -c -S foo.c
# pebsfixup -o foo.s
# cc -c foo.s
#
# suggested compiler options:
# [probably only really needed if push/pop required. use -NP to verify]
# (1) use either of
# -O2 -fno-optimize-sibling-calls
# -O1
# (2) use -mno-omit-leaf-frame-pointer
# (3) use -mno-red-zone [probably not required in any case]
#
# NOTES:
# (1) red zones are only really useful for leaf functions (i.e. if fncA calls
# fncB, fncA's red zone would be clobbered)
# (2) pushing onto the stack isn't a problem if there is a formal stack frame
# (3) the push is okay if the function has no more than six arguments (i.e.
# does _not_ use positive offsets from %rsp to access them)
#pragma pgmlns
use strict qw(vars subs);
our $pgmtail;
our $opt_a;
our $opt_T;
our $opt_D;
our $opt_l;
our $opt_n10;
our $opt_N;
our $opt_P;
our $opt_q;
our $opt_o;
our $opt_O;
our $opt_s;
our @reguse;
our %reguse_tobase;
our %reguse_isbase;
our $regusergx;
our @regtmplist;
our %regtmp_type;
our $diff;
our $sepflg;
our $fatal;
our @cmtprt;
master(@ARGV);
exit(0);
# master -- master control
sub master
{
my(@argv) = @_;
my($xfsrc);
my($file,@files);
my($bf);
$pgmtail = "pebsfixup";
optget(\@argv);
# define all known/usable registers
regusejoin();
# define all registers that we may use as a temporary
regtmpall();
if (defined($opt_D)) {
unlink($opt_D);
}
# show usage
if (@argv <= 0) {
$file = $0;
open($xfsrc,"<$file") ||
sysfault("$pgmtail: unable to open '%s' -- $!\n",$file);
while ($bf = <$xfsrc>) {
chomp($bf);
next if ($bf =~ /^#!/);
last unless ($bf =~ s/^#//);
$bf =~ s/^# ?//;
print($bf,"\n");
}
close($xfsrc);
exit(1);
}
foreach $file (@argv) {
if (-d $file) {
dodir(\@files,$file);
}
else {
push(@files,$file);
}
}
if (defined($opt_O)) {
sysfault("$pgmtail: -O may have only one input file\n")
if (@files != 1);
sysfault("$pgmtail: -O and -o are mutually exclusive\n")
if ($opt_o);
}
foreach $file (@files) {
dofile($file);
}
if (defined($opt_D)) {
exec("less",$opt_D);
}
}
# dodir -- process directory
sub dodir
{
my($files,$dir) = @_;
my($file,@files);
@files = (`find $dir -type f -name '*.s'`);
foreach $file (@files) {
chomp($file);
push(@$files,$file);
}
}
# dofile -- process file
sub dofile
{
my($file) = @_;
my($ofile);
my($xfsrc);
my($xfdst);
my($bf,$lno,$outoff);
my($fixoff);
my($lhs,$rhs);
my($xop,$arg);
my($ix);
my($sym,$val,$typ);
my(%sym_type);
my($fnc,$fnx,%fnx_lookup,@fnxlist);
my($retlist);
my($uselook,@uselist,%avail);
my($fixreg,$fixrtyp);
my($sixlist);
my($fix,$fixlist);
my($fixtot);
my(@fix);
my(@outlist);
my($relaxflg);
my($cmtchr);
undef($fatal);
undef(@cmtprt);
msgprt("\n")
if ($sepflg);
$sepflg = 1;
msgprt("$pgmtail: processing %s ...\n",$file);
$cmtchr = "#";
cmtprt("%s\n","-" x 78);
cmtprt("FILE: %s\n",$file);
# get the output file
$ofile = $file;
sysfault("$pgmtail: bad suffix -- file='%s'\n",$file)
unless ($ofile =~ s/[.]s$//);
$ofile .= ".TMP";
# use explicit output file
if (defined($opt_O)) {
$ofile = $opt_O;
sysfault("$pgmtail: output file may not be input file -- use -o instead\n")
if ($ofile eq $file);
}
open($xfsrc,"<$file") ||
sysfault("$pgmtail: unable to open '%s' -- $!\n",$file);
$lno = 0;
while ($bf = <$xfsrc>) {
chomp($bf);
$bf =~ s/\s+$//;
$outoff = $lno;
++$lno;
push(@outlist,$bf);
# clang adds comments
$ix = index($bf,"#");
if ($ix >= 0) {
$bf = substr($bf,0,$ix);
$bf =~ s/\s+$//;
}
# look for ".type blah, @function"
# NOTE: this always comes before the actual label line [we hope ;-)]
if ($bf =~ /^\s+[.]type\s+([^,]+),\s*(\S+)/) {
($sym,$val) = ($1,$2);
$val =~ s/^\@//;
$sym_type{$sym} = $val;
cmtprt("\n");
cmtprt("TYPE: %s --> %s\n",$sym,$val);
next;
}
# look for "label:"
if ($bf =~ /^([a-z_A-Z][a-z_A-Z0-9]*):$/) {
$sym = $1;
next if ($sym_type{$sym} ne "function");
$fnc = $sym;
cmtprt("FUNCTION: %s\n",$fnc);
$fnx = {};
$fnx_lookup{$sym} = $fnx;
push(@fnxlist,$fnx);
$fnx->{fnx_fnc} = $fnc;
$fnx->{fnx_outoff} = $outoff;
$uselook = {};
$fnx->{fnx_used} = $uselook;
$retlist = [];
$fnx->{fnx_retlist} = $retlist;
$fixlist = [];
$fnx->{fnx_fixlist} = $fixlist;
$sixlist = [];
$fnx->{fnx_sixlist} = $sixlist;
next;
}
# remember all registers used by function:
while ($bf =~ /($regusergx)/gpo) {
$sym = ${^MATCH};
$val = $reguse_tobase{$sym};
dbgprt(3,"dofile: REGUSE sym='%s' val='%s'\n",$sym,$val);
$uselook->{$sym} += 1;
$uselook->{$val} += 1
if ($val ne $sym);
}
# handle returns
if ($bf =~ /^\s+ret/) {
push(@$retlist,$outoff);
next;
}
if ($bf =~ /^\s+rep[a-z]*\s+ret/) {
push(@$retlist,$outoff);
next;
}
# split up "movq 16(%rax),%rax" ...
$ix = rindex($bf,",");
next if ($ix < 0);
# ... into "movq 16(%rax)"
$lhs = substr($bf,0,$ix);
$lhs =~ s/\s+$//;
# check for "movq 16(%rsp)" -- this means that the function has/uses
# more than six arguments (i.e. we may _not_ push/pop because it
# wreaks havoc with positive offsets)
# FIXME/CAE -- we'd have to adjust them by 8 which we don't do
(undef,$rhs) = split(" ",$lhs);
if ($rhs =~ /^(\d+)[(]%rsp[)]$/) {
push(@$sixlist,$outoff);
cmtprt("SIXARG: %s (line %d)\n",$rhs,$lno);
}
# ... and "%rax"
$rhs = substr($bf,$ix + 1);
$rhs =~ s/^\s+//;
# target must be a [simple] register [or source scan will blow up]
# (e.g. we actually had "cmp %ebp,(%rax,%r14)")
next if ($rhs =~ /[)]/);
# ensure we have the "%" prefix
next unless ($rhs =~ /^%/);
# we only want the full 64 bit reg as target
# (e.g. "mov (%rbx),%al" doesn't count)
$val = $reguse_tobase{$rhs};
if ($opt_a) {
next if ($val ne $rhs);
}
else {
next unless (defined($val));
}
# source operand must contain target [base] register
next unless ($lhs =~ /$val/);
###cmtprt("1: %s,%s\n",$lhs,$rhs);
# source operand must be of the "right" type
# FIXME/CAE -- we may need to revise this
next unless ($lhs =~ /[(]/);
cmtprt("NEEDFIX: %s,%s (line %d)\n",$lhs,$rhs,$lno);
# remember the place we need to fix for later
$fix = {};
push(@$fixlist,$fix);
$fix->{fix_outoff} = $outoff;
$fix->{fix_lhs} = $lhs;
$fix->{fix_rhs} = $rhs;
}
close($xfsrc);
# get total number of fixups
foreach $fnx (@fnxlist) {
$fixlist = $fnx->{fnx_fixlist};
$fixtot += @$fixlist;
}
msgprt("$pgmtail: needs %d fixups\n",$fixtot)
if ($fixtot > 0);
# fix each function
foreach $fnx (@fnxlist) {
cmtprt("\n");
cmtprt("FNC: %s\n",$fnx->{fnx_fnc});
$fixlist = $fnx->{fnx_fixlist};
# get the fixup register
($fixreg,$fixrtyp) = regtmploc($fnx,$fixlist);
# show number of return points
{
$retlist = $fnx->{fnx_retlist};
cmtprt(" RET: %d\n",scalar(@$retlist));
last if (@$retlist >= 1);
# NOTE: we display this warning because we may not be able to
# handle all situations
$relaxflg = (@$fixlist <= 0) || ($fixrtyp ne "P");
last if ($relaxflg && $opt_q);
errprt("$pgmtail: in file '%s'\n",$file);
errprt("$pgmtail: function '%s' has no return points\n",
$fnx->{fnx_fnc});
errprt("$pgmtail: suggest recompile with correct options\n");
if (@$fixlist <= 0) {
errprt("$pgmtail: working around because function needs no fixups\n");
last;
}
if ($fixrtyp ne "P") {
errprt("$pgmtail: working around because fixup reg does not need to be saved\n");
last;
}
}
# show stats on register usage in function
$uselook = $fnx->{fnx_used};
@uselist = sort(keys(%$uselook));
cmtprt(" USED:\n");
%avail = %reguse_isbase;
foreach $sym (@uselist) {
$val = $uselook->{$sym};
$typ = $regtmp_type{$sym};
$typ = sprintf(" (TYPE: %s)",$typ)
if (defined($typ));
cmtprt(" %s used %d%s\n",$sym,$val,$typ);
$val = $reguse_tobase{$sym};
delete($avail{$val});
}
# show function's available [unused] registers
@uselist = keys(%avail);
@uselist = sort(regusesort @uselist);
if (@uselist > 0) {
cmtprt(" AVAIL:\n");
foreach $sym (@uselist) {
$typ = $regtmp_type{$sym};
$typ = sprintf(" (TYPE: %s)",$typ)
if (defined($typ));
cmtprt(" %s%s\n",$sym,$typ);
}
}
# skip over any functions that don't need fixing _and_ have a temp
# register
if (@$fixlist <= 0 && (! $opt_P)) {
next if (defined($fixreg));
}
msgprt("$pgmtail: function %s\n",$fnx->{fnx_fnc});
# skip function because we don't have a fixup register but report it
# here
unless (defined($fixreg)) {
$bf = (@$fixlist > 0) ? "FATAL" : "can be ignored -- no fixups needed";
msgprt("$pgmtail: FIXNOREG (%s)\n",$bf);
cmtprt(" FIXNOREG (%s)\n",$bf);
next;
}
msgprt("$pgmtail: FIXREG --> %s (TYPE: %s)\n",$fixreg,$fixrtyp);
cmtprt(" FIXREG --> %s (TYPE: %s)\n",$fixreg,$fixrtyp);
foreach $fix (@$fixlist) {
$outoff = $fix->{fix_outoff};
undef(@fix);
cmtprt(" FIXOLD %s\n",$outlist[$outoff]);
# original
if ($opt_l) {
$bf = sprintf("%s,%s",$fix->{fix_lhs},$fixreg);
push(@fix,$bf);
$bf = sprintf("\tmov\t%s,%s",$fixreg,$fix->{fix_rhs});
push(@fix,$bf);
}
# use lea
else {
($xop,$arg) = split(" ",$fix->{fix_lhs});
$bf = sprintf("\tlea\t\t%s,%s",$arg,$fixreg);
push(@fix,$bf);
$bf = sprintf("\t%s\t(%s),%s",$xop,$fixreg,$fix->{fix_rhs});
push(@fix,$bf);
}
foreach $bf (@fix) {
cmtprt(" FIXNEW %s\n",$bf);
}
$outlist[$outoff] = [@fix];
}
unless ($opt_P) {
next if ($fixrtyp ne "P");
}
# fix the function prolog
$outoff = $fnx->{fnx_outoff};
$lhs = $outlist[$outoff];
$rhs = sprintf("\tpush\t%s",$fixreg);
$bf = [$lhs,$rhs,""];
$outlist[$outoff] = $bf;
# fix the function return points
$retlist = $fnx->{fnx_retlist};
foreach $outoff (@$retlist) {
$rhs = $outlist[$outoff];
$lhs = sprintf("\tpop\t%s",$fixreg);
$bf = ["",$lhs,$rhs];
$outlist[$outoff] = $bf;
}
}
open($xfdst,">$ofile") ||
sysfault("$pgmtail: unable to open '%s' -- $!\n",$ofile);
# output all the assembler text
foreach $bf (@outlist) {
# ordinary line
unless (ref($bf)) {
print($xfdst $bf,"\n");
next;
}
# apply a fixup
foreach $rhs (@$bf) {
print($xfdst $rhs,"\n");
}
}
# output all our reasoning as comments at the bottom
foreach $bf (@cmtprt) {
if ($bf eq "") {
print($xfdst $cmtchr,$bf,"\n");
}
else {
print($xfdst $cmtchr," ",$bf,"\n");
}
}
close($xfdst);
# get difference
if (defined($opt_D)) {
system("diff -u $file $ofile >> $opt_D");
}
# install fixed/modified file
{
last unless ($opt_o || defined($opt_O));
last if ($fatal);
msgprt("$pgmtail: installing ...\n");
rename($ofile,$file);
}
}
# regtmpall -- define all temporary register candidates
sub regtmpall
{
dbgprt(1,"regtmpall: ENTER\n");
regtmpdef("%r11","T");
# NOTES:
# (1) see notes on %r10 in ABI at bottom -- should we use it?
# (2) a web search on "shared chain" and "x86" only produces 28 results
# (3) some gcc code uses it as an ordinary register
# (4) so, use it unless told not to
regtmpdef("%r10","T")
unless ($opt_n10);
# argument registers (a6-a1)
regtmpdef("%r9","A6");
regtmpdef("%r8","A5");
regtmpdef("%rcx","A4");
regtmpdef("%rdx","A3");
regtmpdef("%rsi","A2");
regtmpdef("%rdi","A1");
# callee preserved registers
regtmpdef("%r15","P");
regtmpdef("%r14","P");
regtmpdef("%r13","P");
regtmpdef("%r12","P");
dbgprt(1,"regtmpall: EXIT\n");
}
# regtmpdef -- define usable temp registers
sub regtmpdef
{
my($sym,$typ) = @_;
dbgprt(1,"regtmpdef: SYM sym='%s' typ='%s'\n",$sym,$typ);
push(@regtmplist,$sym);
$regtmp_type{$sym} = $typ;
}
# regtmploc -- locate temp register to fix problem
sub regtmploc
{
my($fnx,$fixlist) = @_;
my($sixlist);
my($uselook);
my($regrhs);
my($fixcnt);
my($coretyp);
my($reglhs,$regtyp);
dbgprt(2,"regtmploc: ENTER fnx_fnc='%s'\n",$fnx->{fnx_fnc});
$sixlist = $fnx->{fnx_sixlist};
$fixcnt = @$fixlist;
$fixcnt = 1
if ($opt_P);
$uselook = $fnx->{fnx_used};
foreach $regrhs (@regtmplist) {
dbgprt(2,"regtmploc: TRYREG regrhs='%s' uselook=%d\n",
$regrhs,$uselook->{$regrhs});
unless ($uselook->{$regrhs}) {
$regtyp = $regtmp_type{$regrhs};
$coretyp = $regtyp;
$coretyp =~ s/\d+$//;
# function uses stack arguments -- we can't push/pop
if (($coretyp eq "P") && (@$sixlist > 0)) {
dbgprt(2,"regtmploc: SIXREJ\n");
next;
}
if (defined($opt_N)) {
dbgprt(2,"regtmploc: TRYREJ opt_N='%s' regtyp='%s'\n",
$opt_N,$regtyp);
next if ($opt_N =~ /$coretyp/);
}
$reglhs = $regrhs;
last;
}
}
{
last if (defined($reglhs));
errprt("regtmploc: unable to locate usable fixup register for function '%s'\n",
$fnx->{fnx_fnc});
last if ($fixcnt <= 0);
$fatal = 1;
}
dbgprt(2,"regtmploc: EXIT reglhs='%s' regtyp='%s'\n",$reglhs,$regtyp);
($reglhs,$regtyp);
}
# regusejoin -- get regex for all registers
sub regusejoin
{
my($reg);
dbgprt(1,"regusejoin: ENTER\n");
# rax
foreach $reg (qw(a b c d)) {
regusedef($reg,"r_x","e_x","_l","_h");
}
# rdi/rsi
foreach $reg (qw(d s)) {
regusedef($reg,"r_i","e_i","_i","_il");
}
# rsp/rbp
foreach $reg (qw(b s)) {
regusedef($reg,"r_p","e_p");
}
foreach $reg (8,9,10,11,12,13,14,15) {
regusedef($reg,"r_","r_d","r_w","r_b");
}
$regusergx = join("|",reverse(sort(@reguse)));
dbgprt(1,"regusejoin: EXIT regusergx='%s'\n",$regusergx);
}
# regusedef -- define all registers
sub regusedef
{
my(@argv) = @_;
my($mid);
my($pat);
my($base);
$mid = shift(@argv);
dbgprt(1,"regusedef: ENTER mid='%s'\n",$mid);
foreach $pat (@argv) {
$pat = "%" . $pat;
$pat =~ s/_/$mid/;
$base //= $pat;
dbgprt(1,"regusedef: PAT pat='%s' base='%s'\n",$pat,$base);
push(@reguse,$pat);
$reguse_tobase{$pat} = $base;
}
$reguse_isbase{$base} = 1;
dbgprt(1,"regusedef: EXIT\n");
}
# regusesort -- sort base register names
sub regusesort
{
my($symlhs,$numlhs);
my($symrhs,$numrhs);
my($cmpflg);
{
($symlhs,$numlhs) = _regusesort($a);
($symrhs,$numrhs) = _regusesort($b);
$cmpflg = $symlhs cmp $symrhs;
last if ($cmpflg);
$cmpflg = $numlhs <=> $numrhs;
}
$cmpflg;
}
# _regusesort -- split up base register name
sub _regusesort
{
my($sym) = @_;
my($num);
if ($sym =~ s/(\d+)$//) {
$num = $1;
$num += 0;
$sym =~ s/[^%]/z/g;
}
($sym,$num);
}
# optget -- get options
sub optget
{
my($argv) = @_;
my($bf);
my($sym,$val);
my($dft,%dft);
foreach $sym (qw(a l n10 P q o s T)) {
$dft{$sym} = 1;
}
$dft{"N"} = "T";
$dft{"D"} = "DIFF";
while (1) {
$bf = $argv->[0];
$sym = $bf;
last unless ($sym =~ s/^-//);
last if ($sym eq "-");
shift(@$argv);
{
if ($sym =~ /([^=]+)=(.+)$/) {
($sym,$val) = ($1,$2);
last;
}
if ($sym =~ /^(.)(.+)$/) {
($sym,$val) = ($1,$2);
last;
}
undef($val);
}
$dft = $dft{$sym};
sysfault("$pgmtail: unknown option -- '%s'\n",$bf)
unless (defined($dft));
$val //= $dft;
${"opt_" . $sym} = $val;
}
}
# cmtprt -- transformation comments
sub cmtprt
{
$_ = shift(@_);
$_ = sprintf($_,@_);
chomp($_);
push(@cmtprt,$_);
}
# msgprt -- progress output
sub msgprt
{
printf(STDERR @_);
}
# errprt -- show errors
sub errprt
{
cmtprt(@_);
printf(STDERR @_);
}
# sysfault -- abort on error
sub sysfault
{
printf(STDERR @_);
exit(1);
}
# dbgprt -- debug print
sub dbgprt
{
$_ = shift(@_);
goto &_dbgprt
if ($opt_T >= $_);
}
# _dbgprt -- debug print
sub _dbgprt
{
printf(STDERR @_);
}
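UPDATE:
I have updated the script to fix bugs, add more checks, and add more options. Note: I had to remove the ABI at the bottom to fit within the 30,000 limit.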
Otherwise weird results appear on other commands with parentheses, for example cmpl %ebp, (%rax,%r14) splits into lhs='cmpl %ebp, (%rax' and rhs='%r14)', which in turn causes /$rhs/ to fail.
Yes, that was a bug. Fixed.
Your $rhs =~ /%[er](.x|\d+)/ doesn't match byte or word loads to di, or ax. That's unlikely, though. Oh, also, I think it fails to match rdi / rsi. so you don't need the trailing d in r10d
Fixed. It now looks for all variants.
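The register matching discussed in this exchange can be sketched roughly as follows, in C rather than the script's perl. It is an illustration under assumptions: reg_base and load_overwrites_address are hypothetical names, and the alias table is written out by hand here (it also lists 16-bit names such as %ax). The idea is that a load needs a fixup when its destination, reduced to its 64-bit base register, also appears in the memory operand.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* names[0] is the 64-bit base register; the rest are its sub-registers. */
struct reg_family { const char *names[5]; };

static const struct reg_family families[] = {
    {{"%rax", "%eax", "%ax", "%al", "%ah"}},
    {{"%rbx", "%ebx", "%bx", "%bl", "%bh"}},
    {{"%rcx", "%ecx", "%cx", "%cl", "%ch"}},
    {{"%rdx", "%edx", "%dx", "%dl", "%dh"}},
    {{"%rdi", "%edi", "%di", "%dil", NULL}},
    {{"%rsi", "%esi", "%si", "%sil", NULL}},
    {{"%rbp", "%ebp", "%bp", "%bpl", NULL}},
    {{"%rsp", "%esp", "%sp", "%spl", NULL}},
    {{"%r8", "%r8d", "%r8w", "%r8b", NULL}},
    {{"%r9", "%r9d", "%r9w", "%r9b", NULL}},
    {{"%r10", "%r10d", "%r10w", "%r10b", NULL}},
    {{"%r11", "%r11d", "%r11w", "%r11b", NULL}},
    {{"%r12", "%r12d", "%r12w", "%r12b", NULL}},
    {{"%r13", "%r13d", "%r13w", "%r13b", NULL}},
    {{"%r14", "%r14d", "%r14w", "%r14b", NULL}},
    {{"%r15", "%r15d", "%r15w", "%r15b", NULL}},
};

/* Map any register name (e.g. "%eax", "%r10d") to its 64-bit base. */
static const char *reg_base(const char *reg)
{
    assert(reg != NULL);
    for (size_t f = 0; f < sizeof(families) / sizeof(families[0]); f++)
        for (size_t i = 0; i < 5 && families[f].names[i] != NULL; i++)
            if (strcmp(reg, families[f].names[i]) == 0)
                return families[f].names[0];
    return NULL;
}

/* Return 1 if `src` is a memory operand (AT&T syntax) that mentions the
 * base register of `dst`, i.e. the load destroys its own address. */
static int load_overwrites_address(const char *src, const char *dst)
{
    const char *dst_base = reg_base(dst);

    if (dst_base == NULL || strchr(src, '(') == NULL)
        return 0;
    for (const char *p = strchr(src, '%'); p != NULL; p = strchr(p + 1, '%')) {
        char name[8];
        size_t n = 1;
        const char *base;

        while (n < sizeof(name) - 1 && isalnum((unsigned char)p[n]))
            n++;
        memcpy(name, p, n);
        name[n] = '\0';
        base = reg_base(name);
        if (base != NULL && strcmp(base, dst_base) == 0)
            return 1;
    }
    return 0;
}
```

Comparing base registers rather than raw text means that a load such as mov (%rax),%eax is caught as well, while mov (%rbx),%al is not.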
Wow, I assumed something like this would have to happen at compile time, and that doing it after the fact would be too messy.
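Shameless plug: Thanks for the "Wow!". perl is great for messy jobs like this. I have written assembler "injection" scripts like this before, e.g. back in the day [before compiler support existed] to add profiling calls.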
You could mark %r10 as another call-preserved register.
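After a few web searches, I could only find about 84 matches for "static chain" x86. The only relevant one was the x86 ABI, and it offers no explanation beyond mentioning it in a footnote. In addition, some gcc code uses r10 without saving it as a callee-saved register. So, I have now made the program use r10 by default (this can be disabled with a command-line option if desired).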
What happens if a function already uses all the registers?
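If it truly uses all of them, then we are out of luck. The script detects and reports this case and suppresses the fixup if it cannot find a spare register.
And, it will use a "callee must preserve" register by injecting a push as the first inst of the function and a corresponding pop just before the ret inst [there can be more than one]. This can be disabled with an option.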
You can't just push/pop, because that steps on the red-zone
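No, it doesn't. For a couple of reasons:
(1) Almost as a side note: the red zone is only useful in leaf functions. Otherwise, if fncA calls fncB, the mere act of making the call would step on fncA's own red zone. See the compile options in the comment block at the top of the script.
(2) More importantly, because of the way the push/pop are injected. The push occurs before any other insts. The pop occurs after any other insts [just before the ret].
The red zone is still there, intact. It is merely offset by -8 from where it would otherwise be. All red zone activity is preserved because these insts use negative offsets from %rsp.
This is different from a push/pop done inside an inline asm block. The usual case is that the red zone code does (e.g.) mov $23,-4(%rsp). An inline asm block that comes after it and does a push/pop will step on that.
Some functions to show this: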
# function_original -- original function before pebsfixup
# RETURNS: 23
function_original:
mov $23,-4(%rsp) # red zone code generated by compiler
...
mov -4(%rsp),%rax # will still have $23
ret
# function_pebsfixup -- pebsfixup modified
# RETURNS: 23
function_pebsfixup:
push %r12 # pebsfixup injected
mov $23,-4(%rsp) # red zone code generated by compiler
...
mov -4(%rsp),%rax # will still have $23
pop %r12 # pebsfixup injected
ret
# function_inline -- function with inline asm block and red zone
# RETURNS: unknown value
function_inline:
mov $23,-4(%rsp) # red zone code generated by compiler
# inline asm block -- steps on red zone
push %rdx
push %rcx
...
pop %rcx
pop %rdx
...
mov -4(%rsp),%rax # now -4(%rsp) no longer has $23
ret
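Where the push/pop does get us into trouble is when the function uses more than six arguments (i.e. args 7+ are on the stack). Accessing these arguments uses positive offsets from %rsp: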
mov 32(%rsp),%rax
With our "trick" push, that offset would be incorrect. The correct offset would now be higher by 8:
mov 40(%rsp),%rax
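The script will detect this and complain. But it does not [yet] adjust the positive offsets, because this is probably a low-probability case. It would probably take about five more lines of code to fix it. Punting for now ...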
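For what it is worth, the missing offset adjustment might look something like the sketch below, written in C rather than perl. It is not part of the script: adjust_rsp_operand is a hypothetical name, and it accepts the same operand shape as the script's SIXARG check (decimal digits followed by (%rsp)). It also assumes the operand really refers to an incoming stack argument; locals in a frame the function allocates after the injected push sit at unchanged offsets and must not be shifted.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Rewrite "DISP(%rsp)" as "(DISP+8)(%rsp)" to account for one injected
 * 8-byte push. Any other operand is copied to `out` unchanged.
 * Returns 1 if the operand was rewritten, 0 otherwise. */
static int adjust_rsp_operand(const char *operand, char *out, size_t outsz)
{
    char *end;
    long disp;

    assert(operand != NULL && out != NULL && outsz > 0);

    /* Only plain decimal displacements, as in /^(\d+)[(]%rsp[)]$/. */
    if (!isdigit((unsigned char)operand[0])) {
        snprintf(out, outsz, "%s", operand);
        return 0;
    }
    disp = strtol(operand, &end, 10);
    if (strcmp(end, "(%rsp)") != 0) {
        snprintf(out, outsz, "%s", operand);
        return 0;
    }
    snprintf(out, outsz, "%ld(%%rsp)", disp + 8);
    return 1;
}
```

Negative (red zone) offsets are deliberately left alone, which matches the reasoning above about why the red zone survives the injected push.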
我想使用基于事件的精确采样 (PEBS) 在 XeonE5 Sandy Bridge 上记录特定事件的所有地址(例如缓存未命中)。
但是,CoreTM i7 处理器和 Intel® XeonTM 5500 处理器 的性能分析指南第 24 页包含以下内容警告:
As the PEBS mechanism captures the values of the register at completion of the instruction, the dereferenced address for the following type of load instruction (Intel asm convention) cannot be reconstructed.
MOV RAX, [RAX+const]
This kind of instruction is mostly associated with pointer chasing
mystruc = mystruc->next;
This is a significant shortcoming of this approach to capturing memory instruction addresses.
根据 objdump,我的程序中有许多这种形式的加载指令。 有什么办法可以避免这些吗?
由于这是一个特定于英特尔的问题,解决方案不必以任何方式移植,它只需要工作。我的代码是用 C 编写的,理想情况下我正在寻找编译器级别的解决方案(gcc 或 icc),但欢迎提出任何建议。
一些例子:
mov 0x18(%rdi),%rdi
mov (%rcx,%rax,8),%rax
在这两种情况下,在指令退出后(因此当我查看寄存器值以确定我加载的位置时 to/from)地址的值(分别为 %rdi + 18
和 %rcx + 8 * %rax
在这些示例中)被 mov
.
我现在能想到的唯一方法是使用 &(& 符号)汇编器约束。这意味着我将不得不在出现此类指令的任何地方检查我的代码,并将每个取消引用 mystruc = mystruc->next;
的指针替换为:
asm volatile("mov (%1),%0" : "=&r" (mystruc) : "r" (&(mystruc->next)))
然而,这是一种非常繁琐的方法,并且在某些情况下可能比结构中的指针更复杂。我知道这基本上是在增加寄存器压力,因此编译器正在积极尝试避免这种情况。还有其他方法吗?
您想做的是转换所有形式的指令:
mov (%rcx,%rax,8),%rax
进入:
mov (%rcx,%rax,8),%r11
mov %r11,%rax
这可以通过修改编译器生成的汇编源代码更容易地完成。下面是一个 perl
脚本,它将通过读取和修改 .s
文件来完成所有必要的转换。
只需更改构建以生成 .s
文件而不是 .o
文件,应用脚本,然后使用 as
或 [=22] 生成 .o
=]
这是实际的脚本。我已经按照下面评论中的构建过程在我自己的一些来源上对其进行了测试。
该脚本具有以下特点:
- 扫描并定位所有函数定义
- 识别给定函数中使用的所有寄存器
- 找到函数的所有return点
- 根据函数的寄存器使用情况选择要使用的临时寄存器(即它将使用未已被函数使用的临时寄存器)
- 用两个指令序列替换所有“麻烦的”指令
- 在尝试使用被调用者保存的寄存器 之前尝试使用未使用的临时寄存器(例如
- 如果选择的寄存器被callee保存,将添加
push
到函数序言和pop
到函数[多个]ret
语句 - 维护所有分析和转换的日志并将其作为注释附加到输出
.s
文件
%r11
或未使用的参数寄存器)
#!/usr/bin/perl
# pebsfix/pebsfixup -- fix assembler source for PEBS usage
#
# command line options:
# "-a" -- use only full 64 bit targets
# "-l" -- do _not_ use lea
# "-D[diff-file]" -- show differences (default output: "./DIFF")
# "-n10" -- do _not_ use register %r10 for temporary (default is use it)
# "-o" -- overwrite input files (can be multiple)
# "-O<outfile>" -- output file (only one .s input allowed)
# "-q" -- suppress warnings
# "-T[lvl]" -- debug trace
#
# "-o" and "-O" are mutually exclusive
#
# command line script test options:
# "-N[TPA]" -- disable temp register types [for testing]
# "-P" -- force push/pop on all functions
#
# command line arguments:
# 1-- list of .s files to process [or directory to search]
# for a given file "foo.s", output is to "foo.TMP"
# if (-o is given, "foo.TMP" is renamed to "foo.s")
#
# suggested usage:
# change build to produce .s files
# FROM:
# cc [options] -c foo.c
# TO:
# cc [options] -c -S foo.c
# pebsfixup -o foo.s
# cc -c foo.s
#
# suggested compiler options:
# [probably only really needed if push/pop required. use -NP to verify]
# (1) use either of
# -O2 -fno-optimize-sibling-calls
# -O1
# (2) use -mno-omit-leaf-frame-pointer
# (3) use -mno-red-zone [probably not required in any case]
#
# NOTES:
# (1) red zones are only really useful for leaf functions (i.e. if fncA calls
# fncB, fncA's red zone would be clobbered)
# (2) pushing onto the stack isn't a problem if there is a formal stack frame
# (3) the push is okay if the function has no more than six arguments (i.e.
# does _not_ use positive offsets from %rsp to access them)
#pragma pgmlns
use strict qw(vars subs);
our $pgmtail;
our $opt_a;
our $opt_T;
our $opt_D;
our $opt_l;
our $opt_n10;
our $opt_N;
our $opt_P;
our $opt_q;
our $opt_o;
our $opt_O;
our $opt_s;
our @reguse;
our %reguse_tobase;
our %reguse_isbase;
our $regusergx;
our @regtmplist;
our %regtmp_type;
our $diff;
our $sepflg;
our $fatal;
our @cmtprt;
master(@ARGV);
exit(0);
# master -- master control
sub master
{
my(@argv) = @_;
my($xfsrc);
my($file,@files);
my($bf);
$pgmtail = "pebsfixup";
optget(\@argv);
# define all known/usable registers
regusejoin();
# define all registers that we may use as a temporary
regtmpall();
if (defined($opt_D)) {
unlink($opt_D);
}
# show usage
if (@argv <= 0) {
$file = [=12=];
open($xfsrc,"<$file") ||
sysfault("$pgmtail: unable to open '%s' -- $!\n",$file);
while ($bf = <$xfsrc>) {
chomp($bf);
next if ($bf =~ /^#!/);
last unless ($bf =~ s/^#//);
$bf =~ s/^# ?//;
print($bf,"\n");
}
close($xfsrc);
exit(1);
}
foreach $file (@argv) {
if (-d $file) {
dodir(\@files,$file);
}
else {
push(@files,$file);
}
}
if (defined($opt_O)) {
sysfault("$pgmtail: -O may have only one input file\n")
if (@files != 1);
sysfault("$pgmtail: -O and -o are mutually exclusive\n")
if ($opt_o);
}
foreach $file (@files) {
dofile($file);
}
if (defined($opt_D)) {
exec("less",$opt_D);
}
}
# dodir -- process directory
sub dodir
{
my($files,$dir) = @_;
my($file,@files);
@files = (`find $dir -type f -name '*.s'`);
foreach $file (@files) {
chomp($file);
push(@$files,$file);
}
}
# dofile -- process file
sub dofile
{
my($file) = @_;
my($ofile);
my($xfsrc);
my($xfdst);
my($bf,$lno,$outoff);
my($fixoff);
my($lhs,$rhs);
my($xop,$arg);
my($ix);
my($sym,$val,$typ);
my(%sym_type);
my($fnc,$fnx,%fnx_lookup,@fnxlist);
my($retlist);
my($uselook,@uselist,%avail);
my($fixreg,$fixrtyp);
my($sixlist);
my($fix,$fixlist);
my($fixtot);
my(@fix);
my(@outlist);
my($relaxflg);
my($cmtchr);
undef($fatal);
undef(@cmtprt);
msgprt("\n")
if ($sepflg);
$sepflg = 1;
msgprt("$pgmtail: processing %s ...\n",$file);
$cmtchr = "#";
cmtprt("%s\n","-" x 78);
cmtprt("FILE: %s\n",$file);
# get the output file
$ofile = $file;
sysfault("$pgmtail: bad suffix -- file='%s'\n",$file)
unless ($ofile =~ s/[.]s$//);
$ofile .= ".TMP";
# use explicit output file
if (defined($opt_O)) {
$ofile = $opt_O;
sysfault("$pgmtail: output file may not be input file -- use -o instead\n")
if ($ofile eq $file);
}
open($xfsrc,"<$file") ||
sysfault("$pgmtail: unable to open '%s' -- $!\n",$file);
$lno = 0;
while ($bf = <$xfsrc>) {
chomp($bf);
$bf =~ s/\s+$//;
$outoff = $lno;
++$lno;
push(@outlist,$bf);
# clang adds comments
$ix = index($bf,"#");
if ($ix >= 0) {
$bf = substr($bf,0,$ix);
$bf =~ s/\s+$//;
}
# look for ".type blah, @function"
# NOTE: this always comes before the actual label line [we hope ;-)]
if ($bf =~ /^\s+[.]type\s+([^,]+),\s*(\S+)/) {
($sym,$val) = (,);
$val =~ s/^\@//;
$sym_type{$sym} = $val;
cmtprt("\n");
cmtprt("TYPE: %s --> %s\n",$sym,$val);
next;
}
# look for "label:"
if ($bf =~ /^([a-z_A-Z][a-z_A-Z0-9]*):$/) {
$sym = ;
next if ($sym_type{$sym} ne "function");
$fnc = $sym;
cmtprt("FUNCTION: %s\n",$fnc);
$fnx = {};
$fnx_lookup{$sym} = $fnx;
push(@fnxlist,$fnx);
$fnx->{fnx_fnc} = $fnc;
$fnx->{fnx_outoff} = $outoff;
$uselook = {};
$fnx->{fnx_used} = $uselook;
$retlist = [];
$fnx->{fnx_retlist} = $retlist;
$fixlist = [];
$fnx->{fnx_fixlist} = $fixlist;
$sixlist = [];
$fnx->{fnx_sixlist} = $sixlist;
next;
}
# remember all registers used by function:
while ($bf =~ /($regusergx)/gpo) {
$sym = ${^MATCH};
$val = $reguse_tobase{$sym};
dbgprt(3,"dofile: REGUSE sym='%s' val='%s'\n",$sym,$val);
$uselook->{$sym} += 1;
$uselook->{$val} += 1
if ($val ne $sym);
}
# handle returns
if ($bf =~ /^\s+ret/) {
push(@$retlist,$outoff);
next;
}
if ($bf =~ /^\s+rep[a-z]*\s+ret/) {
push(@$retlist,$outoff);
next;
}
# split up "movq 16(%rax),%rax" ...
$ix = rindex($bf,",");
next if ($ix < 0);
# ... into "movq 16(%rax)"
$lhs = substr($bf,0,$ix);
$lhs =~ s/\s+$//;
# check for "movq 16(%rsp)" -- this means that the function has/uses
# more than six arguments (i.e. we may _not_ push/pop because it
# wreaks havoc with positive offsets)
# FIXME/CAE -- we'd have to adjust them by 8 which we don't do
(undef,$rhs) = split(" ",$lhs);
if ($rhs =~ /^(\d+)[(]%rsp[)]$/) {
push(@$sixlist,$outoff);
cmtprt("SIXARG: %s (line %d)\n",$rhs,$lno);
}
# ... and "%rax"
$rhs = substr($bf,$ix + 1);
$rhs =~ s/^\s+//;
# target must be a [simple] register [or source scan will blow up]
# (e.g. we actually had "cmp %ebp,(%rax,%r14)")
next if ($rhs =~ /[)]/);
# ensure we have the "%" prefix
next unless ($rhs =~ /^%/);
# we only want the full 64 bit reg as target
# (e.g. "mov (%rbx),%al" doesn't count)
$val = $reguse_tobase{$rhs};
if ($opt_a) {
next if ($val ne $rhs);
}
else {
next unless (defined($val));
}
# source operand must contain target [base] register
next unless ($lhs =~ /$val/);
###cmtprt("1: %s,%s\n",$lhs,$rhs);
# source operand must be of the "right" type
# FIXME/CAE -- we may need to revise this
next unless ($lhs =~ /[(]/);
cmtprt("NEEDFIX: %s,%s (line %d)\n",$lhs,$rhs,$lno);
# remember the place we need to fix for later
$fix = {};
push(@$fixlist,$fix);
$fix->{fix_outoff} = $outoff;
$fix->{fix_lhs} = $lhs;
$fix->{fix_rhs} = $rhs;
}
close($xfsrc);
# get total number of fixups
foreach $fnx (@fnxlist) {
$fixlist = $fnx->{fnx_fixlist};
$fixtot += @$fixlist;
}
msgprt("$pgmtail: needs %d fixups\n",$fixtot)
if ($fixtot > 0);
# fix each function
foreach $fnx (@fnxlist) {
cmtprt("\n");
cmtprt("FNC: %s\n",$fnx->{fnx_fnc});
$fixlist = $fnx->{fnx_fixlist};
# get the fixup register
($fixreg,$fixrtyp) = regtmploc($fnx,$fixlist);
# show number of return points
{
$retlist = $fnx->{fnx_retlist};
cmtprt(" RET: %d\n",scalar(@$retlist));
last if (@$retlist >= 1);
# NOTE: we display this warning because we may not be able to
# handle all situations
$relaxflg = (@$fixlist <= 0) || ($fixrtyp ne "P");
last if ($relaxflg && $opt_q);
errprt("$pgmtail: in file '%s'\n",$file);
errprt("$pgmtail: function '%s' has no return points\n",
$fnx->{fnx_fnc});
errprt("$pgmtail: suggest recompile with correct options\n");
if (@$fixlist <= 0) {
errprt("$pgmtail: working around because function needs no fixups\n");
last;
}
if ($fixrtyp ne "P") {
errprt("$pgmtail: working around because fixup reg does not need to be saved\n");
last;
}
}
# show stats on register usage in function
$uselook = $fnx->{fnx_used};
@uselist = sort(keys(%$uselook));
cmtprt(" USED:\n");
%avail = %reguse_isbase;
foreach $sym (@uselist) {
$val = $uselook->{$sym};
$typ = $regtmp_type{$sym};
$typ = sprintf(" (TYPE: %s)",$typ)
if (defined($typ));
cmtprt(" %s used %d%s\n",$sym,$val,$typ);
$val = $reguse_tobase{$sym};
delete($avail{$val});
}
# show function's available [unused] registers
@uselist = keys(%avail);
@uselist = sort(regusesort @uselist);
if (@uselist > 0) {
cmtprt(" AVAIL:\n");
foreach $sym (@uselist) {
$typ = $regtmp_type{$sym};
$typ = sprintf(" (TYPE: %s)",$typ)
if (defined($typ));
cmtprt(" %s%s\n",$sym,$typ);
}
}
# skip over any functions that don't need fixing _and_ have a temp
# register
if (@$fixlist <= 0 && (! $opt_P)) {
next if (defined($fixreg));
}
msgprt("$pgmtail: function %s\n",$fnx->{fnx_fnc});
# skip function because we don't have a fixup register but report it
# here
unless (defined($fixreg)) {
$bf = (@$fixlist > 0) ? "FATAL" : "can be ignored -- no fixups needed";
msgprt("$pgmtail: FIXNOREG (%s)\n",$bf);
cmtprt(" FIXNOREG (%s)\n",$bf);
next;
}
msgprt("$pgmtail: FIXREG --> %s (TYPE: %s)\n",$fixreg,$fixrtyp);
cmtprt(" FIXREG --> %s (TYPE: %s)\n",$fixreg,$fixrtyp);
foreach $fix (@$fixlist) {
$outoff = $fix->{fix_outoff};
undef(@fix);
cmtprt(" FIXOLD %s\n",$outlist[$outoff]);
# original
if ($opt_l) {
$bf = sprintf("%s,%s",$fix->{fix_lhs},$fixreg);
push(@fix,$bf);
$bf = sprintf("\tmov\t%s,%s",$fixreg,$fix->{fix_rhs});
push(@fix,$bf);
}
# use lea
else {
($xop,$arg) = split(" ",$fix->{fix_lhs});
$bf = sprintf("\tlea\t\t%s,%s",$arg,$fixreg);
push(@fix,$bf);
$bf = sprintf("\t%s\t(%s),%s",$xop,$fixreg,$fix->{fix_rhs});
push(@fix,$bf);
}
foreach $bf (@fix) {
cmtprt(" FIXNEW %s\n",$bf);
}
$outlist[$outoff] = [@fix];
}
unless ($opt_P) {
next if ($fixrtyp ne "P");
}
# fix the function prolog
$outoff = $fnx->{fnx_outoff};
$lhs = $outlist[$outoff];
$rhs = sprintf("\tpush\t%s",$fixreg);
$bf = [$lhs,$rhs,""];
$outlist[$outoff] = $bf;
# fix the function return points
$retlist = $fnx->{fnx_retlist};
foreach $outoff (@$retlist) {
$rhs = $outlist[$outoff];
$lhs = sprintf("\tpop\t%s",$fixreg);
$bf = ["",$lhs,$rhs];
$outlist[$outoff] = $bf;
}
}
open($xfdst,">$ofile") ||
sysfault("$pgmtail: unable to open '%s' -- $!\n",$ofile);
# output all the assembler text
foreach $bf (@outlist) {
# ordinary line
unless (ref($bf)) {
print($xfdst $bf,"\n");
next;
}
# apply a fixup
foreach $rhs (@$bf) {
print($xfdst $rhs,"\n");
}
}
# output all our reasoning as comments at the bottom
foreach $bf (@cmtprt) {
if ($bf eq "") {
print($xfdst $cmtchr,$bf,"\n");
}
else {
print($xfdst $cmtchr," ",$bf,"\n");
}
}
close($xfdst);
# get difference
if (defined($opt_D)) {
system("diff -u $file $ofile >> $opt_D");
}
# install fixed/modified file
{
last unless ($opt_o || defined($opt_O));
last if ($fatal);
msgprt("$pgmtail: installing ...\n");
rename($ofile,$file);
}
}
# regtmpall -- define all temporary register candidates
sub regtmpall
{
dbgprt(1,"regtmpall: ENTER\n");
regtmpdef("%r11","T");
# NOTES:
# (1) see notes on %r10 in ABI at bottom -- should we use it?
# (2) a web search on "static chain" and "x86" only produces 28 results
# (3) some gcc code uses it as an ordinary register
# (4) so, use it unless told not to
regtmpdef("%r10","T")
unless ($opt_n10);
# argument registers (a6-a1)
regtmpdef("%r9","A6");
regtmpdef("%r8","A5");
regtmpdef("%rcx","A4");
regtmpdef("%rdx","A3");
regtmpdef("%rsi","A2");
regtmpdef("%rdi","A1");
# callee preserved registers
regtmpdef("%r15","P");
regtmpdef("%r14","P");
regtmpdef("%r13","P");
regtmpdef("%r12","P");
dbgprt(1,"regtmpall: EXIT\n");
}
# regtmpdef -- define usable temp registers
sub regtmpdef
{
my($sym,$typ) = @_;
dbgprt(1,"regtmpdef: SYM sym='%s' typ='%s'\n",$sym,$typ);
push(@regtmplist,$sym);
$regtmp_type{$sym} = $typ;
}
# regtmploc -- locate temp register to fix problem
sub regtmploc
{
my($fnx,$fixlist) = @_;
my($sixlist);
my($uselook);
my($regrhs);
my($fixcnt);
my($coretyp);
my($reglhs,$regtyp);
dbgprt(2,"regtmploc: ENTER fnx_fnc='%s'\n",$fnx->{fnx_fnc});
$sixlist = $fnx->{fnx_sixlist};
$fixcnt = @$fixlist;
$fixcnt = 1
if ($opt_P);
$uselook = $fnx->{fnx_used};
foreach $regrhs (@regtmplist) {
dbgprt(2,"regtmploc: TRYREG regrhs='%s' uselook=%d\n",
$regrhs,$uselook->{$regrhs});
unless ($uselook->{$regrhs}) {
$regtyp = $regtmp_type{$regrhs};
$coretyp = $regtyp;
$coretyp =~ s/\d+$//;
# function uses stack arguments -- we can't push/pop
if (($coretyp eq "P") && (@$sixlist > 0)) {
dbgprt(2,"regtmploc: SIXREJ\n");
next;
}
if (defined($opt_N)) {
dbgprt(2,"regtmploc: TRYREJ opt_N='%s' regtyp='%s'\n",
$opt_N,$regtyp);
next if ($opt_N =~ /$coretyp/);
}
$reglhs = $regrhs;
last;
}
}
{
last if (defined($reglhs));
errprt("regtmploc: unable to locate usable fixup register for function '%s'\n",
$fnx->{fnx_fnc});
last if ($fixcnt <= 0);
$fatal = 1;
}
dbgprt(2,"regtmploc: EXIT reglhs='%s' regtyp='%s'\n",$reglhs,$regtyp);
($reglhs,$regtyp);
}
# regusejoin -- get regex for all registers
sub regusejoin
{
my($reg);
dbgprt(1,"regusejoin: ENTER\n");
# rax
foreach $reg (qw(a b c d)) {
regusedef($reg,"r_x","e_x","_x","_l","_h");
}
# rdi/rsi
foreach $reg (qw(d s)) {
regusedef($reg,"r_i","e_i","_i","_il");
}
# rsp/rbp
foreach $reg (qw(b s)) {
regusedef($reg,"r_p","e_p");
}
foreach $reg (8,9,10,11,12,13,14,15) {
regusedef($reg,"r_","r_d","r_w","r_b");
}
$regusergx = join("|",reverse(sort(@reguse)));
dbgprt(1,"regusejoin: EXIT regusergx='%s'\n",$regusergx);
}
# regusedef -- define all registers
sub regusedef
{
my(@argv) = @_;
my($mid);
my($pat);
my($base);
$mid = shift(@argv);
dbgprt(1,"regusedef: ENTER mid='%s'\n",$mid);
foreach $pat (@argv) {
$pat = "%" . $pat;
$pat =~ s/_/$mid/;
$base //= $pat;
dbgprt(1,"regusedef: PAT pat='%s' base='%s'\n",$pat,$base);
push(@reguse,$pat);
$reguse_tobase{$pat} = $base;
}
$reguse_isbase{$base} = 1;
dbgprt(1,"regusedef: EXIT\n");
}
# regusesort -- sort base register names
sub regusesort
{
my($symlhs,$numlhs);
my($symrhs,$numrhs);
my($cmpflg);
{
($symlhs,$numlhs) = _regusesort($a);
($symrhs,$numrhs) = _regusesort($b);
$cmpflg = $symlhs cmp $symrhs;
last if ($cmpflg);
$cmpflg = $numlhs <=> $numrhs;
}
$cmpflg;
}
# _regusesort -- split up base register name
sub _regusesort
{
my($sym) = @_;
my($num);
if ($sym =~ s/(\d+)$//) {
$num = $1;
$num += 0;
$sym =~ s/[^%]/z/g;
}
($sym,$num);
}
# optget -- get options
sub optget
{
my($argv) = @_;
my($bf);
my($sym,$val);
my($dft,%dft);
foreach $sym (qw(a l n10 P q o s T)) {
$dft{$sym} = 1;
}
$dft{"N"} = "T";
$dft{"D"} = "DIFF";
while (1) {
$bf = $argv->[0];
$sym = $bf;
last unless ($sym =~ s/^-//);
last if ($sym eq "-");
shift(@$argv);
{
if ($sym =~ /([^=]+)=(.+)$/) {
($sym,$val) = ($1,$2);
last;
}
if ($sym =~ /^(.)(.+)$/) {
($sym,$val) = ($1,$2);
last;
}
undef($val);
}
$dft = $dft{$sym};
sysfault("$pgmtail: unknown option -- '%s'\n",$bf)
unless (defined($dft));
$val //= $dft;
${"opt_" . $sym} = $val;
}
}
# cmtprt -- transformation comments
sub cmtprt
{
$_ = shift(@_);
$_ = sprintf($_,@_);
chomp($_);
push(@cmtprt,$_);
}
# msgprt -- progress output
sub msgprt
{
printf(STDERR @_);
}
# errprt -- show errors
sub errprt
{
cmtprt(@_);
printf(STDERR @_);
}
# sysfault -- abort on error
sub sysfault
{
printf(STDERR @_);
exit(1);
}
# dbgprt -- debug print
sub dbgprt
{
$_ = shift(@_);
goto &_dbgprt
if ($opt_T >= $_);
}
# _dbgprt -- debug print
sub _dbgprt
{
printf(STDERR @_);
}
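UPDATE:
I've updated the script to fix bugs, add more checks, and add more options. Note: I had to remove the ABI notes at the bottom to fit within the 30,000 character limit.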
Otherwise weird results appear on other commands with parentheses, for example cmpl %ebp, (%rax,%r14) splits into lhs='cmpl %ebp, (%rax' and rhs='%r14)', which in turn causes /$rhs/ to fail.
Yes, that was a bug. Fixed.
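For illustration, one way to avoid that kind of mis-split is to cut the operand string only at commas that sit outside parentheses. The helper below is a minimal sketch with a made-up name (split_operands); it is not the code used in pebsfixup, and it assumes AT&T syntax with the mnemonic already removed.

```perl
use strict;
use warnings;

# split_operands -- split an AT&T operand string at top-level commas
# (hypothetical helper; commas inside parentheses belong to a memory
# operand and are kept with it)
sub split_operands
{
    my($text) = @_;
    my(@args);
    my($cur,$depth) = ("",0);

    foreach my $chr (split(//,$text)) {
        $depth++ if ($chr eq "(");
        $depth-- if ($chr eq ")");

        # only a comma outside all parentheses separates two operands
        if (($chr eq ",") && ($depth == 0)) {
            push(@args,$cur);
            $cur = "";
            next;
        }

        $cur .= $chr;
    }
    push(@args,$cur);

    # trim surrounding whitespace from each operand
    foreach my $arg (@args) {
        $arg =~ s/^\s+//;
        $arg =~ s/\s+$//;
    }

    @args;
}
```

With this, the operands of the cmpl example above come out as %ebp and (%rax,%r14) rather than being cut in the middle of the memory operand.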
Your $rhs =~ /%[er](.x|\d+)/ doesn't match byte or word loads to di, or ax. That's unlikely, though. Oh, also, I think it fails to match rdi / rsi. so you don't need the trailing d in r10d
Fixed. It now looks for all variants.
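As a rough sketch of what "all variants" means here: the hypothetical helper reg_base below (my own illustration, not lifted from the script) maps any 8, 16, 32 or 64 bit general-purpose register name onto its 64 bit base name, so that a use of %dil or %r10d counts as a use of %rdi or %r10.

```perl
use strict;
use warnings;

# reg_base -- map a general-purpose register name to its 64 bit base
# (hypothetical helper; returns undef for anything it does not recognize)
sub reg_base
{
    my($reg) = @_;

    # a/b/c/d family: %rax %eax %ax, then %al %ah
    return "%r" . $1 . "x" if ($reg =~ /^%[re]?([abcd])x$/);
    return "%r" . $1 . "x" if ($reg =~ /^%([abcd])[lh]$/);

    # index/pointer family: %rdi %edi %di, then %dil (same for si, bp, sp)
    return "%r" . $1 if ($reg =~ /^%[re]?([ds]i|[bs]p)$/);
    return "%r" . $1 if ($reg =~ /^%([ds]i|[bs]p)l$/);

    # numbered registers: %r8 %r8d %r8w %r8b ... %r15
    return "%r" . $1 if ($reg =~ /^%r(8|9|1[0-5])[dwb]?$/);

    return;
}
```

The point of normalizing is that the "is this register already used by the function?" check can then be a single hash lookup on the base name.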
Wow, I assumed something like this would have to happen at compile time, and that doing it after the fact would be too messy.
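Shameless plug: thanks for the "Wow!". perl is great for messy jobs like this. I've written assembler "injection" scripts like this before, e.g. back in the day [before compilers supported it] to add profiling calls.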
You could mark %r10 as another call-preserved register.
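After a few web searches, I could only find about 84 matches on "static chain" x86. The only relevant one was the x86 ABI, and it offers no explanation beyond mentioning it in a footnote. Also, some gcc code uses r10 without saving it as a callee-saved register. So the program now uses r10 by default (this can be disabled with a command line option if needed).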
What happens if a function already uses all the registers?
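If it truly uses all of them, then we're out of luck. The script will detect and report this and, if it can't find a spare register, it suppresses the fixup.
And, it will use a "callee must preserve" register by injecting a push as the first inst of the function and a corresponding pop just before the ret inst [there can be more than one]. This can be disabled with an option.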
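To make the injection concrete, here is a stripped-down sketch of the idea. The helper name wrap_with_save and its calling convention are invented for this example; the real script works on its own internal line list, and this sketch also ignores things like .cfi unwind directives.

```perl
use strict;
use warnings;

# wrap_with_save -- add push/pop of a callee-saved register to one function
# (hypothetical helper; $lines is an array ref of assembler source lines
# for a single function, with the function label as the first line)
sub wrap_with_save
{
    my($lines,$reg) = @_;
    my(@out);

    for my $idx (0 .. $#$lines) {
        my $line = $lines->[$idx];

        # restore just before every return point (ret, retq, rep ret, ...)
        push(@out,"\tpop\t$reg")
            if ($line =~ /^\s*(?:repz?\s+)?retq?\b/);

        push(@out,$line);

        # save right after the function label, before any other inst
        push(@out,"\tpush\t$reg")
            if ($idx == 0);
    }

    \@out;
}
```

Calling it as wrap_with_save(\@function_lines,"%r12") returns a new array ref; the input lines are left untouched.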
You can't just push/pop, because that steps on the red-zone
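No, it doesn't. There are a few reasons:
(1) Almost as a side note: the red zone is only useful in leaf functions. Otherwise, if fncA calls fncB, the mere act of doing so by fncA would step on its own red zone. See the compile options in the script's top comment block.
(2) More importantly, it is because of the way the push/pop are injected. The push occurs before any other insts. The pop occurs after any other insts [just before the ret].
The red zone is still there, intact. It is just offset by -8 from where it would otherwise be. All red zone activity is preserved because these insts use negative offsets from %rsp.
This is different from a push/pop done inside an inline asm block. The usual case is that the red zone code is doing (e.g.) mov $23,-4(%rsp). An inline asm block that comes after that and does a push/pop will step on it.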
Some functions to show this:
# function_original -- original function before pebsfixup
# RETURNS: 23
function_original:
    mov     $23,-4(%rsp)            # red zone code generated by compiler
    ...
    mov     -4(%rsp),%rax           # will still have $23
    ret

# function_pebsfixup -- pebsfixup modified
# RETURNS: 23
function_pebsfixup:
    push    %r12                    # pebsfixup injected
    mov     $23,-4(%rsp)            # red zone code generated by compiler
    ...
    mov     -4(%rsp),%rax           # will still have $23
    pop     %r12                    # pebsfixup injected
    ret

# function_inline -- function with inline asm block and red zone
# RETURNS: unknown value
function_inline:
    mov     $23,-4(%rsp)            # red zone code generated by compiler
    # inline asm block -- steps on red zone
    push    %rdx
    push    %rcx
    ...
    pop     %rcx
    pop     %rdx
    ...
    mov     -4(%rsp),%rax           # now -4(%rsp) no longer has $23
    ret
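Where the push/pop does get us into trouble is when the function takes more than six arguments (i.e. args 7+ are on the stack). Access to these args uses positive offsets from %rsp:
    mov     32(%rsp),%rax
With our "trick" push, the offset would be incorrect. The correct offset is now higher by 8:
    mov     40(%rsp),%rax
The script will detect this and complain. But it doesn't [yet] adjust the positive offsets, as this is a low probability case. It would probably take about five more lines of code to fix this. Punting for now ...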
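For what it's worth, here is roughly what such an adjustment could look like. This is an untested-in-the-script sketch with invented names (adjust_rsp_offset, $frame, $bias), and it leans on some assumptions: displacements are plain decimal numbers (as in gcc -S output), and the function keeps %rsp fixed after its own prolog, so $frame (the bytes the function itself subtracts from %rsp) is constant. Only displacements at or above $frame reach the return address or stack arguments, so only those are moved; locals and red zone accesses are left alone.

```perl
use strict;
use warnings;

# _bump -- compute the new displacement for one %rsp reference
sub _bump
{
    my($disp,$frame,$bias) = @_;
    my($num) = ($disp eq "") ? 0 : $disp;

    # below the frame boundary: a local variable, leave it untouched
    return $disp if ($num < $frame);

    $num + $bias;
}

# adjust_rsp_offset -- shift %rsp displacements that reach past the frame
# (hypothetical helper; $bias is the number of bytes pushed in the prolog)
sub adjust_rsp_offset
{
    my($line,$frame,$bias) = @_;

    # match "disp(%rsp" with an optional decimal displacement; the
    # lookbehind skips negative offsets, hex digits and symbol tails
    $line =~ s{(?<![\w.\$+-])(\d*)\(%rsp}{_bump($1,$frame,$bias) . "(%rsp"}ge;

    $line;
}
```

For a function with no frame of its own, adjust_rsp_offset($line,0,8) turns the 32(%rsp) access above into 40(%rsp) and leaves -4(%rsp) red zone accesses as they are.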