Table creation from CSV file with headers using awk
I have a comma-separated CSV file with headers, and I want to include them in the table.
Input:
header,word1,word2,word3
supercalifragi,black,white,red
adc,bad,cat,love
Output:
| header         | word1 | word2 | word3 |
| -------------- | ----- | ----- | ----- |
| supercalifragi | black | white | red   |
| adc            | bad   | cat   | love  |
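I need to include the headers, and I need to take the length of the words in the input file into account so that the finished table is formatted correctly.
Here is the updated code: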
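# run with a comma separator and the input named twice: awk -F, -f script.awk file file
function pr() {
    # print the current record, padding field i to len[i] plus one space
    for (i = 1; i <= NF; i++)
        printf "| %-" len[i]+1 "s", $i
    printf "|\n"
}
# first pass: remember the longest word in each column
NR == FNR {
    for (i = 1; i <= NF; i++)
        if (len[i] < length($i)) {
            len[i] = length($i)
            word[i] = $i
        }
    next
}
# second pass: print every row
{ pr() }
# right after the header row, print a rule made from the longest words
FNR == 1 {
    for (i = 1; i <= NF; i++) {
        gsub(/./, "-", word[i])
        $i = word[i]
    }
    pr()
}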
I took the liberty of rewriting the entire code from scratch. This should work:
BEGIN {
    FS = ","
    OFS = " | "
}
{
    if (NF > columns) columns = NF
    if (NR == 1) {
        # read headers
        for (i = 1; i <= NF; i++) {
            headers[i] = $i
            if (length($i) > transientLength[i]) transientLength[i] = length($i)
        }
    } else {
        # read data rows (arrays of arrays need gawk 4.0 or later)
        for (i = 1; i <= NF; i++) {
            fields[NR][i] = $i
            if (length($i) > transientLength[i]) transientLength[i] = length($i)
        }
    }
}
END {
    # print header; loop by index because "for (j in array)" has no guaranteed order
    for (j = 1; j <= columns; j++) {
        spaces = ""
        for (s = length(headers[j]); s < transientLength[j]; s++) {
            spaces = spaces " "
        }
        printable = (j == 1 ? "" : printable OFS) headers[j] spaces
    }
    print "| " printable " |"
    # print alignments
    for (j = 1; j <= columns; j++) {
        sep = ""
        for (i = 1; i <= transientLength[j]; i++) {
            sep = sep "-"
        }
        printable = (j == 1 ? "" : printable OFS) sep
    }
    print "| " printable " |"
    # print all rows in input order
    for (f = 2; f <= NR; f++) {
        for (j = 1; j <= columns; j++) {
            spaces = ""
            for (s = length(fields[f][j]); s < transientLength[j]; s++) {
                spaces = spaces " "
            }
            printable = (j == 1 ? "" : printable OFS) fields[f][j] spaces
        }
        print "| " printable " |"
    }
}
But note: you need to remove unnecessary whitespace from the input file. It should be:
header,word1,word2,word3
supercalifragi,black,white,red
adc,bad,cat,love
Alternatively, you can use FS=", ", but that really only works for your example.
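If cleaning the input by hand is not practical, the blanks can also be stripped field by field before building the table. The following is only a sketch: the function name trim_csv is hypothetical, and it assumes simple CSV with no quoted fields that contain commas.

```shell
# Hypothetical helper: print comma-separated input with blanks around each field removed.
trim_csv() {
    awk -F',' -v OFS=',' '{
        for (i = 1; i <= NF; i++) {
            gsub(/^[ \t]+|[ \t]+$/, "", $i)   # strip spaces and tabs at both ends of field i
        }
        print
    }' "$@"
}

# Demo on inline data with stray blanks around some fields.
printf 'header, word1 ,word2\nadc,  bad,cat \n' | trim_csv
```

The cleaned output can then be piped into any of the table scripts in this thread, for example `trim_csv file | awk -f table.awk`, where table.awk stands for whichever script you saved.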
A shorter alternative that scans the file twice:
$ awk -F' *, *' 'function pr()
{for(i=1;i<=NF;i++) printf "| %-"len[i]+1"s",$i; printf "|\n"}
NR==FNR{for(i=1;i<=NF;i++)
if(len[i]<length($i)) {len[i]=length($i); word[i]=$i} next}
{pr()}
FNR==1{for(i=1;i<=NF;i++) {gsub(/./,"-",word[i]); $i=word[i]}; pr()}' file{,}
| header         | word1 | word2 | word3 |
| -------------- | ----- | ----- | ----- |
| supercalifragi | black | white | red   |
| adc            | bad   | cat   | love  |
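When the input comes from a pipe and cannot be read twice, and gawk's arrays of arrays are not available, the same idea can be sketched in a single pass with plain POSIX awk. This is a sketch under assumptions, not a drop-in replacement: csv_table, cell and width are names made up for illustration, and quoted fields containing commas are not handled.

```shell
# Hypothetical helper: render comma-separated input as a padded, pipe-delimited table.
csv_table() {
    awk -F' *, *' '
    {
        if (NF > ncols) ncols = NF
        for (i = 1; i <= NF; i++) {
            cell[NR, i] = $i                          # keep every field for the END block
            if (length($i) > width[i]) width[i] = length($i)
        }
    }
    END {
        for (i = 1; i <= ncols; i++)
            if (width[i] < 1) width[i] = 1            # keep the printf width valid for empty columns
        for (r = 1; r <= NR; r++) {
            line = "|"
            for (i = 1; i <= ncols; i++)
                line = line sprintf(" %-" width[i] "s |", cell[r, i])
            print line
            if (r == 1) {                             # the rule goes right under the header row
                line = "|"
                for (i = 1; i <= ncols; i++) {
                    dashes = sprintf("%" width[i] "s", "")
                    gsub(/ /, "-", dashes)
                    line = line " " dashes " |"
                }
                print line
            }
        }
    }' "$@"
}

# Demo on the sample data from the question.
printf 'header,word1,word2,word3\nsupercalifragi,black,white,red\nadc,bad,cat,love\n' | csv_table
```

The trade-off is memory: every field is held until the END block, whereas the two-pass version only keeps the column widths.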
This isn't exactly the output you asked for, but maybe it's what you really need:
$ column -t -s, -o' | ' < file | awk '1; NR==1{gsub(/[^|]/,"-"); print}'
header         | word1 | word2 | word3
---------------|-------|-------|------
supercalifragi | black | white | red
adc            | bad   | cat   | love
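The awk stage in that pipeline does two things: the pattern `1` prints every line unchanged, and for the first line only it then replaces every character that is not a pipe with a dash and prints the result as the rule. A small sketch of just that stage, run on already aligned text so it does not depend on column being installed (the name underline_header is hypothetical):

```shell
# Hypothetical helper: echo the input and add a dashed rule under the first line.
underline_header() {
    awk '1; NR == 1 { gsub(/[^|]/, "-"); print }' "$@"
}

# Demo on two pre-aligned lines.
printf 'a  | bb\nccc | d\n' | underline_header
```

Because the rule is derived from the header line itself, it is exactly as wide as that line, so it lines up only if the earlier stage already padded the columns.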