Concatenate columns and add digits with awk
I have a CSV file:
number1;number2;min_length;max_length
"40";"1801";8;8
"40";"182";8;8
"42";"32";6;8
"42";"4";6;6
"43";"691";9;9
I want the output to be:
4018010000;4018019999
4018200000;4018299999
42320000;42329999
423200000;423299999
4232000000;4232999999
42400000;42499999
43691000000;43691999999
So the new file will contain:
column_1 = old_column_1 + old_column_2 concatenated, followed by a number of "0" digits equal to (old_column_3 - length of old_column_2)
column_2 = old_column_1 + old_column_2 concatenated, followed by a number of "9" digits equal to (old_column_3 - length of old_column_2)
This holds when min_length = max_length. When min_length differs from max_length, I need to take all the possible lengths into account. For the line "42";"32";6;8 the lengths are 6, 7 and 8, so that line produces three output lines.
I also need to remove all the quotes.
I tried paste and cut like this:
paste -d ";" <(cut -f1,2 -d ";" < file1) > file2
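to concatenate the first two columns, but I think it would be easier with awk. However, I can't figure out how to do it. Any help is appreciated. Thanks!
EDIT: Actually, a 4th column was added to the input.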
You can use this awk:
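awk 'function padstr(ch, len, s) {
   # build a string of len copies of ch: pad with spaces, then swap them for ch
   s = sprintf("%*s", len, "")
   gsub(/ /, ch, s)
   return s
}
BEGIN {
   FS=OFS=";"
}
NR > 1 {   # skip the header line
   gsub(/"/, "");                    # remove all double quotes
   for (i=0; i<=($4-$3); i++) {      # one output line per length from min to max
      d = $3 - length($2) + i        # number of digits to append
      print $1 $2 padstr("0", d), $1 $2 padstr("9", d)
   }
}' file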
4018010000;4018019999
4018200000;4018299999
42320000;42329999
423200000;423299999
4232000000;4232999999
42400000;42499999
43691000000;43691999999
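In case the awk at hand does not support the `%*s` dynamic width used in `padstr`, the same padding can be built with a plain loop. The following is only a sketch under stated assumptions: the wrapper name `expand_ranges` and the helper `repeat` are made up for illustration, and the first input line is assumed to be the header.

```shell
# Hypothetical wrapper: reads the semicolon-separated file given as argument
# (or stdin) and prints one "low;high" pair per allowed length.
expand_ranges() {
    awk '
    # Return n copies of ch, built with a loop instead of sprintf("%*s").
    function repeat(ch, n,    s) {
        s = ""
        while (n-- > 0) s = s ch
        return s
    }
    BEGIN { FS = OFS = ";" }
    NR > 1 {                          # assumption: line 1 is the header
        gsub(/"/, "")                 # strip the double quotes
        prefix = $1 $2                # number1 followed by number2
        for (len = $3 + 0; len <= $4 + 0; len++) {
            d = len - length($2)      # digits still to fill
            print prefix repeat("0", d), prefix repeat("9", d)
        }
    }' "$@"
}
```

It would be called as `expand_ranges file`. Because the digits are appended as strings, leading zeros in the first two columns are preserved.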
Using awk:
awk '
BEGIN{FS = OFS = ";"}  # set field separator and output field separator to be ";"
NR > 1 {               # skip the header line
    $0 = gensub("\"", "", "g");  # Drop double quotes (gensub requires GNU awk)
    s = $1$2;                    # The range header number
    l = $3-length($2);           # Number of zeros or 9s to be appended
    l = 10^l;                    # Get 10 raised to that number
    print s*l, (s+1)*l-1;        # Adding n zeros is multiplication by 10^n
                                 # Adding n nines is multiplication by 10^n, plus (10^n - 1)
}' input.txt
The explanation is in the inline comments.
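The arithmetic version reads only min_length, so a row such as "42";"32";6;8 yields a single range. A possible extension that loops from min_length to max_length is sketched below. The wrapper name `expand_ranges_arith` is hypothetical, the first line is assumed to be the header, and `printf "%.0f"` is used instead of `print` so that large values are not shown in exponent notation by some awk implementations.

```shell
# Hypothetical wrapper around the arithmetic approach, one range per length.
expand_ranges_arith() {
    awk '
    BEGIN { FS = ";" }
    NR > 1 {                              # assumption: line 1 is the header
        gsub(/"/, "")                     # strip the double quotes
        s = ($1 $2) + 0                   # prefix as a number
        for (len = $3 + 0; len <= $4 + 0; len++) {
            p = 10 ^ (len - length($2))   # 10 raised to the digits to append
            # low end: prefix followed by zeros; high end: prefix followed by nines
            printf "%.0f;%.0f\n", s * p, (s + 1) * p - 1
        }
    }' "$@"
}
```

Two caveats of treating the prefix as a number: leading zeros in number1 would be lost, and results are only exact while they fit in a double (roughly 15 digits). The string-based approach avoids both.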