计算 PDB 文件中的距离
Calculating distances in PDB file
参考问题Calculating the distance between atomic coordinates,其中输入为
ATOM 920 CA GLN A 203 39.292 -13.354 17.416 1.00 55.76 C
ATOM 929 CA HIS A 204 38.546 -15.963 14.792 1.00 29.53 C
ATOM 939 CA ASN A 205 39.443 -17.018 11.206 1.00 54.49 C
ATOM 947 CA GLU A 206 41.454 -13.901 10.155 1.00 26.32 C
ATOM 956 CA VAL A 207 43.664 -14.041 13.279 1.00 40.65 C
.
.
.
ATOM 963 CA GLU A 208 45.403 -17.443 13.188 1.00 40.25 C
有an answer报告为
use strict;
use warnings;
my @line;
while (<>) {
push @line, $_; # add line to buffer
next if @line < 2; # skip unless buffer is full
print proc(@line), "\n"; # process and print
shift @line; # remove used line
}
sub proc {
my @a = split ' ', shift; # line 1
my @b = split ' ', shift; # line 2
my $x = ($a[6]-$b[6]); # calculate the diffs
my $y = ($a[7]-$b[7]);
my $z = ($a[8]-$b[8]);
my $dist = sprintf "%.1f", # format the number
sqrt($x**2+$y**2+$z**2); # do the calculation
return "$a[3]-$b[3]\t$dist"; # return the string for printing
}
以上代码的输出是第一个 CA 到第二个 CA 和第二个到第三个等等之间的距离...
如何修改此代码以查找第一个 CA 与其余 CA(2、3、..)以及第二个 CA 与其余 CA(3、4、..)之间的距离等等只打印小于 5 埃的那些?
我发现应该更改 push @line, $_;
语句以增加数组大小,但不清楚该怎么做。
要获取对,请将文件读入数组 @data_array
。然后循环条目。
更新:添加文件打开和加载@data_array。
open my $fh, '<', 'atom_file.pdb' or die $!;
my @data_array = <$fh>;
close $fh or die $!;
for my $i (0 .. $#data_array) {
for my $j ($i+1 .. $#data_array) {
process(@data_array[$i,$j]);
}
}
可以试试这个:
use strict;
use warnings;
my @alllines = ();
while(<DATA>) { push(@alllines, $_); }
#Each Current line
for(my $i=0; $i<=$#alllines+1; $i++)
{
#Each Next line
for(my $j=$i+1; $j<=$#alllines; $j++)
{
if($alllines[$i])
{
#Split the line into tab delimits
my ($line1_tb_1,$line1_tb_2,$line1_tb_3) = split /\t/, $alllines[$i];
print "Main_Line: $line1_tb_1\t$line1_tb_2\t$line1_tb_3";
if($alllines[$j])
{
#Split the line into tab delimits
my ($line_nxt_tb1,$line_nxt_tb2,$line_nxt_tb3) = split /\t/, $alllines[$j];
print "Next_Line: $line_nxt_tb1\t$line_nxt_tb2\t$line_nxt_tb3";
#Do it your coding/regex here
}
}
#system 'pause'; Testing Purpose!!!
}
}
__DATA__
tab1 123 456
tab2 789 012
tab3 345 678
tab4 901 234
tab5 567 890
希望对您有所帮助。
参考问题Calculating the distance between atomic coordinates,其中输入为
ATOM 920 CA GLN A 203 39.292 -13.354 17.416 1.00 55.76 C
ATOM 929 CA HIS A 204 38.546 -15.963 14.792 1.00 29.53 C
ATOM 939 CA ASN A 205 39.443 -17.018 11.206 1.00 54.49 C
ATOM 947 CA GLU A 206 41.454 -13.901 10.155 1.00 26.32 C
ATOM 956 CA VAL A 207 43.664 -14.041 13.279 1.00 40.65 C
.
.
.
ATOM 963 CA GLU A 208 45.403 -17.443 13.188 1.00 40.25 C
有an answer报告为
use strict; use warnings; my @line; while (<>) { push @line, $_; # add line to buffer next if @line < 2; # skip unless buffer is full print proc(@line), "\n"; # process and print shift @line; # remove used line } sub proc { my @a = split ' ', shift; # line 1 my @b = split ' ', shift; # line 2 my $x = ($a[6]-$b[6]); # calculate the diffs my $y = ($a[7]-$b[7]); my $z = ($a[8]-$b[8]); my $dist = sprintf "%.1f", # format the number sqrt($x**2+$y**2+$z**2); # do the calculation return "$a[3]-$b[3]\t$dist"; # return the string for printing }
以上代码的输出是第一个 CA 到第二个 CA 和第二个到第三个等等之间的距离...
如何修改此代码以查找第一个 CA 与其余 CA(2、3、..)以及第二个 CA 与其余 CA(3、4、..)之间的距离等等只打印小于 5 埃的那些?
我发现应该更改 push @line, $_;
语句以增加数组大小,但不清楚该怎么做。
要获取对,请将文件读入数组 @data_array
。然后循环条目。
更新:添加文件打开和加载@data_array。
open my $fh, '<', 'atom_file.pdb' or die $!;
my @data_array = <$fh>;
close $fh or die $!;
for my $i (0 .. $#data_array) {
for my $j ($i+1 .. $#data_array) {
process(@data_array[$i,$j]);
}
}
可以试试这个:
use strict;
use warnings;
my @alllines = ();
while(<DATA>) { push(@alllines, $_); }
#Each Current line
for(my $i=0; $i<=$#alllines+1; $i++)
{
#Each Next line
for(my $j=$i+1; $j<=$#alllines; $j++)
{
if($alllines[$i])
{
#Split the line into tab delimits
my ($line1_tb_1,$line1_tb_2,$line1_tb_3) = split /\t/, $alllines[$i];
print "Main_Line: $line1_tb_1\t$line1_tb_2\t$line1_tb_3";
if($alllines[$j])
{
#Split the line into tab delimits
my ($line_nxt_tb1,$line_nxt_tb2,$line_nxt_tb3) = split /\t/, $alllines[$j];
print "Next_Line: $line_nxt_tb1\t$line_nxt_tb2\t$line_nxt_tb3";
#Do it your coding/regex here
}
}
#system 'pause'; Testing Purpose!!!
}
}
__DATA__
tab1 123 456
tab2 789 012
tab3 345 678
tab4 901 234
tab5 567 890
希望对您有所帮助。