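Script correction for counting positive values in a text file
Some time ago I asked for help producing a Perl script that counts values in a text file that is divided into several parts.
The script tells me how many positive values occur in the lines of one part and then, when the next part of the text starts, reports the number of positive values again.
For example, this is my text file: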
;YP_003858584.1_BtCoVBM48_gp2 25 NKSP 0.1462 (9/9) ---
;YP_003858584.1_BtCoVBM48_gp2 66 NLTW 0.7837 (9/9) +++
;YP_003858584.1_BtCoVBM48_gp2 116 NTTQ 0.7013 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 126 NGTH 0.7112 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 163 NCTY 0.7620 (9/9) +++
;YP_003858584.1_BtCoVBM48_gp2 173 NIST 0.6556 (8/9) +
;YP_003858584.1_BtCoVBM48_gp2 231 NITY 0.7442 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 273 NGTI 0.7109 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 322 NITQ 0.6116 (8/9) +
;YP_003858584.1_BtCoVBM48_gp2 334 NITS 0.7296 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 361 NSSA 0.5388 (6/9) +
;YP_003858584.1_BtCoVBM48_gp2 462 NPSG 0.4656 (5/9) -
;YP_003858584.1_BtCoVBM48_gp2 541 NSTK 0.5883 (8/9) +
;YP_003858584.1_BtCoVBM48_gp2 590 NASS 0.5643 (6/9) +
;YP_003858584.1_BtCoVBM48_gp2 603 NCTD 0.7117 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 646 NSSY 0.5467 (4/9) +
;YP_003858584.1_BtCoVBM48_gp2 665 NVSS 0.7980 (9/9) +++
;YP_003858584.1_BtCoVBM48_gp2 695 NNTI 0.4537 (5/9) -
;YP_003858584.1_BtCoVBM48_gp2 703 NFSI 0.5613 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 787 NFSQ 0.6209 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 1060 NFTT 0.4540 (6/9) -
;YP_003858584.1_BtCoVBM48_gp2 1084 NGTH 0.5408 (6/9) +
;YP_003858584.1_BtCoVBM48_gp2 1120 NNTV 0.5803 (6/9) +
;YP_003858584.1_BtCoVBM48_gp2 1144 NHTS 0.3828 (8/9) -
;YP_003858584.1_BtCoVBM48_gp2 1149 NVSL 0.4879 (5/9) -
;YP_003858584.1_BtCoVBM48_gp2 1159 NASV 0.5021 (3/9) +
;YP_003858584.1_BtCoVBM48_gp2 1180 NESL 0.5770 (7/9) +
;ADK66841.1_NA 25 NKSP 0.1462 (9/9) ---
;ADK66841.1_NA 66 NLTW 0.7837 (9/9) +++
;ADK66841.1_NA 116 NTTQ 0.7013 (9/9) ++
;ADK66841.1_NA 126 NGTH 0.7112 (9/9) ++
;ADK66841.1_NA 163 NCTY 0.7620 (9/9) +++
;ADK66841.1_NA 173 NIST 0.6556 (8/9) +
;ADK66841.1_NA 231 NITY 0.7442 (9/9) ++
;ADK66841.1_NA 273 NGTI 0.7109 (9/9) ++
;ADK66841.1_NA 322 NITQ 0.6116 (8/9) +
;ADK66841.1_NA 334 NITS 0.7296 (9/9) ++
;ADK66841.1_NA 361 NSSA 0.5388 (6/9) +
;ADK66841.1_NA 462 NPSG 0.4656 (5/9) -
;ADK66841.1_NA 541 NSTK 0.5883 (8/9) +
;ADK66841.1_NA 590 NASS 0.5643 (6/9) +
;ADK66841.1_NA 603 NCTD 0.7117 (9/9) ++
;ADK66841.1_NA 646 NSSY 0.5467 (4/9) +
;ADK66841.1_NA 665 NVSS 0.7980 (9/9) +++
;ADK66841.1_NA 695 NNTI 0.4537 (5/9) -
;ADK66841.1_NA 703 NFSI 0.5613 (9/9) ++
;ADK66841.1_NA 787 NFSQ 0.6209 (9/9) ++
;ADK66841.1_NA 1060 NFTT 0.4540 (6/9) -
;ADK66841.1_NA 1084 NGTH 0.5408 (6/9) +
;ADK66841.1_NA 1120 NNTV 0.5803 (6/9) +
;ADK66841.1_NA 1144 NHTS 0.3828 (8/9) -
;ADK66841.1_NA 1149 NVSL 0.4879 (5/9) -
;ADK66841.1_NA 1159 NASV 0.5021 (3/9) +
;ADK66841.1_NA 1180 NESL 0.5770 (7/9) +
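A row counts as positive only when its value is 0.7 or greater (>= 0.7). The text file has two parts: one for YP_003858584.1_BtCoVBM48_gp2 and another for ADK66841.1_NA. Counting the positive values (>= 0.7) in each part gives 9 per part.
I have many files like this, with hundreds of parts, so I would like a Perl script that counts these values.
This is the script: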
use strict;
use warnings;
my $cnt = {};
while(my $line = <STDIN>) {
if($. == 1) {
next;
}else {
my @cols = split(m{\s+},$line);
if(@cols == 6) {
my $potential = $cols[3];
my $id = $cols[0];
$id =~ s{^\;}{};
if(0.7 >= $potential) {
$cnt->{$id}++;
};
};
};
};
my @ids_found = sort { $a cmp $b } (keys %$cnt);
for my $id (@ids_found) {
print "PART $id:\n";
print "$cnt->{$id} (values 0.7 >=)\n";
};
This runs, but I noticed an error in the output.
Output:
$ cat Test00.txt | perl File_for_count_values.pl
PART ADK66841.1_NA:
18 (values 0.7 >=)
PART YP_003858584.1_BtCoVBM48_gp2:
18 (values 0.7 >=)
This is not the output I want: the script appears to add up the positive values of both parts (9 + 9 = 18).
The output must be:
$ cat Test00.txt | perl File_for_count_values.pl
PART ADK66841.1_NA:
9 (values 0.7 >=)
PART YP_003858584.1_BtCoVBM48_gp2:
9 (values 0.7 >=)
Any idea what has to change in the script to get this?
Any comments are welcome.
Your code counts the values that are less than or equal to 0.7.
If I change:
if(0.7 >= $potential) {
to:
if(0.7 <= $potential) {
then I get 9 for each part. Output:
PART ADK66841.1_NA:
9 (values 0.7 >=)
PART YP_003858584.1_BtCoVBM48_gp2:
9 (values 0.7 >=)
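The effect of the operand order can be checked in isolation. The following is only a minimal sketch, assuming the potentials of the first six sample rows from the question; the variable names are illustrative and not part of the original script.

```perl
use strict;
use warnings;

# Potentials of the first six sample rows from the question.
my @potentials = (0.1462, 0.7837, 0.7013, 0.7112, 0.7620, 0.6556);

# Original test: true when the value is at most 0.7.
my $at_most = grep { 0.7 >= $_ } @potentials;

# Corrected test: true when the value is at least 0.7.
my $at_least = grep { 0.7 <= $_ } @potentials;

print "0.7 >= value: $at_most\n";
print "0.7 <= value: $at_least\n";
```

For these six values the first test matches 2 rows and the second matches 4, so the two conditions select complementary sets of rows.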
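Please check whether the following reworked Perl script is useful.
Note: the original code assumes the input has a header line, based on the statement if($. == 1) (see $. in perlvar).
Some changes were made to improve the readability of the script:
- the variable $threshold is defined at the top of the script
- next unless $. > 1 skips the header/first line (next, unless the line counter is greater than one)
- lines are split not only on spaces but also on ; which avoids the substitution
- $id and $potential are filled from the @cols array in a single statement
- field numbers are adjusted, because the first field, the one before ;, is empty
- write with format is used to format the output
Note: see $~, which defines the current format used by write; here it is switched to close the table.
The script uses a __DATA__ block with the originally posted data to demonstrate the output.
Replace while( <DATA> ) with while( <> ) so that the script accepts input from STDIN or from a file named as an argument to the script (run as ./script.pl file.dat).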
#!/usr/bin/env perl
#
# vim: ai ts=4 sw=4
use strict;
use warnings;
my($id,$counter);
my $threshold = 0.7;
while( <DATA> ) {
    chomp;
    next unless $. > 1;
    my @cols = split("[; ]+", $_);
    next unless @cols == 7;
    my($id,$potential) = @cols[1,4];
    $counter->{$id}++ if $potential >= $threshold;
}
my @sorted_ids = sort { $a cmp $b } keys %$counter;
for $id (@sorted_ids) {
    write;
}
$~ = "STDOUT_BOTTOM";
write;
exit 0;
format STDOUT_TOP =
Criteria: potential >= @#.##
$threshold
+-----------------------------+-------+
| Part                        | Count |
+-----------------------------+-------+
.
format STDOUT =
| @<<<<<<<<<<<<<<<<<<<<<<<<<< | @>>>> |
$id,$counter->{$id}
.
format STDOUT_BOTTOM =
+-----------------------------+-------+
.
__DATA__
;YP_003858584.1_BtCoVBM48_gp2 25 NKSP 0.1462 (9/9) ---
;YP_003858584.1_BtCoVBM48_gp2 66 NLTW 0.7837 (9/9) +++
;YP_003858584.1_BtCoVBM48_gp2 116 NTTQ 0.7013 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 126 NGTH 0.7112 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 163 NCTY 0.7620 (9/9) +++
;YP_003858584.1_BtCoVBM48_gp2 173 NIST 0.6556 (8/9) +
;YP_003858584.1_BtCoVBM48_gp2 231 NITY 0.7442 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 273 NGTI 0.7109 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 322 NITQ 0.6116 (8/9) +
;YP_003858584.1_BtCoVBM48_gp2 334 NITS 0.7296 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 361 NSSA 0.5388 (6/9) +
;YP_003858584.1_BtCoVBM48_gp2 462 NPSG 0.4656 (5/9) -
;YP_003858584.1_BtCoVBM48_gp2 541 NSTK 0.5883 (8/9) +
;YP_003858584.1_BtCoVBM48_gp2 590 NASS 0.5643 (6/9) +
;YP_003858584.1_BtCoVBM48_gp2 603 NCTD 0.7117 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 646 NSSY 0.5467 (4/9) +
;YP_003858584.1_BtCoVBM48_gp2 665 NVSS 0.7980 (9/9) +++
;YP_003858584.1_BtCoVBM48_gp2 695 NNTI 0.4537 (5/9) -
;YP_003858584.1_BtCoVBM48_gp2 703 NFSI 0.5613 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 787 NFSQ 0.6209 (9/9) ++
;YP_003858584.1_BtCoVBM48_gp2 1060 NFTT 0.4540 (6/9) -
;YP_003858584.1_BtCoVBM48_gp2 1084 NGTH 0.5408 (6/9) +
;YP_003858584.1_BtCoVBM48_gp2 1120 NNTV 0.5803 (6/9) +
;YP_003858584.1_BtCoVBM48_gp2 1144 NHTS 0.3828 (8/9) -
;YP_003858584.1_BtCoVBM48_gp2 1149 NVSL 0.4879 (5/9) -
;YP_003858584.1_BtCoVBM48_gp2 1159 NASV 0.5021 (3/9) +
;YP_003858584.1_BtCoVBM48_gp2 1180 NESL 0.5770 (7/9) +
;ADK66841.1_NA 25 NKSP 0.1462 (9/9) ---
;ADK66841.1_NA 66 NLTW 0.7837 (9/9) +++
;ADK66841.1_NA 116 NTTQ 0.7013 (9/9) ++
;ADK66841.1_NA 126 NGTH 0.7112 (9/9) ++
;ADK66841.1_NA 163 NCTY 0.7620 (9/9) +++
;ADK66841.1_NA 173 NIST 0.6556 (8/9) +
;ADK66841.1_NA 231 NITY 0.7442 (9/9) ++
;ADK66841.1_NA 273 NGTI 0.7109 (9/9) ++
;ADK66841.1_NA 322 NITQ 0.6116 (8/9) +
;ADK66841.1_NA 334 NITS 0.7296 (9/9) ++
;ADK66841.1_NA 361 NSSA 0.5388 (6/9) +
;ADK66841.1_NA 462 NPSG 0.4656 (5/9) -
;ADK66841.1_NA 541 NSTK 0.5883 (8/9) +
;ADK66841.1_NA 590 NASS 0.5643 (6/9) +
;ADK66841.1_NA 603 NCTD 0.7117 (9/9) ++
;ADK66841.1_NA 646 NSSY 0.5467 (4/9) +
;ADK66841.1_NA 665 NVSS 0.7980 (9/9) +++
;ADK66841.1_NA 695 NNTI 0.4537 (5/9) -
;ADK66841.1_NA 703 NFSI 0.5613 (9/9) ++
;ADK66841.1_NA 787 NFSQ 0.6209 (9/9) ++
;ADK66841.1_NA 1060 NFTT 0.4540 (6/9) -
;ADK66841.1_NA 1084 NGTH 0.5408 (6/9) +
;ADK66841.1_NA 1120 NNTV 0.5803 (6/9) +
;ADK66841.1_NA 1144 NHTS 0.3828 (8/9) -
;ADK66841.1_NA 1149 NVSL 0.4879 (5/9) -
;ADK66841.1_NA 1159 NASV 0.5021 (3/9) +
;ADK66841.1_NA 1180 NESL 0.5770 (7/9) +
Output:
Criteria: potential >= 0.70
+-----------------------------+-------+
| Part                        | Count |
+-----------------------------+-------+
| ADK66841.1_NA               |     9 |
| YP_003858584.1_BtCoVBM48_gp |     9 |
+-----------------------------+-------+
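The field-numbering point from the list of changes can be verified on a single row. This is a small sketch, assuming one sample line from the posted data; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $line = ';ADK66841.1_NA 66 NLTW 0.7837 (9/9) +++';

# Splitting on whitespace only keeps the ';' attached to the id.
my @by_space = split /\s+/, $line;

# Splitting on ';' as well yields an empty leading field,
# so the id moves to index 1 and the potential to index 4.
my @by_both = split /[; ]+/, $line;

printf "whitespace only: %d fields, id = '%s'\n", scalar @by_space, $by_space[0];
printf "semicolon too:   %d fields, id = '%s'\n", scalar @by_both,  $by_both[1];
```

The first split gives 6 fields with the id still carrying its ;, while the second gives 7 fields because the text before the leading ; is an empty string.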
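Note:
The file you pointed me to on GitHub does not contain the leading ; that the posted data has.
Because of this the number of fields is one less, which is why no results were produced.
Please make the following changes in the Perl script:
next unless @cols == 7;
my($id,$potential) = @cols[1,4];
to
next unless @cols == 6;
my($id,$potential) = @cols[0,3];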
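If both kinds of file have to be processed, one option is to strip an optional leading ; before splitting, so the field numbers stay the same either way. The following is only a sketch under that assumption: count_at_least is a hypothetical helper, not part of the script above, and it does not skip a header line by line number; it only ignores rows that do not have exactly six fields.

```perl
use strict;
use warnings;

# Hypothetical helper: count rows per id whose potential is >= $threshold.
sub count_at_least {
    my ($threshold, @lines) = @_;
    my %count;
    for my $line (@lines) {
        # Drop an optional leading ';' so both file variants look alike.
        (my $row = $line) =~ s/^\s*;?\s*//;
        my @cols = split ' ', $row;
        next unless @cols == 6;    # ignore rows without exactly six fields
        my ($id, $potential) = @cols[0, 3];
        $count{$id}++ if $potential >= $threshold;
    }
    return %count;
}

my %count = count_at_least(0.7,
    ';YP_003858584.1_BtCoVBM48_gp2 66 NLTW 0.7837 (9/9) +++',
    'ADK66841.1_NA 116 NTTQ 0.7013 (9/9) ++',
    'ADK66841.1_NA 173 NIST 0.6556 (8/9) +',
);
print "$_: $count{$_}\n" for sort keys %count;
```

With these three rows each id is counted once, whether or not its line starts with ;. A six-field header whose fourth column is not numeric would still reach the comparison and trigger a warning, so a real header line would need separate handling.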