Months in gnuplot

I have a csv file containing 12 numbers, one for each of the 12 months. A sample of the file:

$ cat data.csv
"3","5","6","5","4","6","7","6","4","4","3","3",

I want to plot the months on the x axis as "January, February, March, etc.".

I found this script, but I don't know how to supply the months:

for FILE in data.csv; do
 gnuplot -p << EOF
 set datafile separator ","
 set xlabel "xlabel"
 set ylabel "ylabel"
 set title "graphTitle"
 plot "$FILE" using $xcolumn:$ycolumn
EOF
done

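The expected output is a chart with the months on the x axis and the data from the csv file on the y axis. Note that the CSV file contains no months, only numbers. That is why I am asking what the best way to do this is, without typing the months into the CSV by hand or looping over an array. Is there a gnuplot function that can add dates and format them?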

Thanks

Update: after reviewing the OP's post and writing some more code, I'm guessing the desired format looks like this:

January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",

If that is the case, we can use the same solution (below) and pipe the final result through tr to transpose the data back into a single-line/multi-column dataset, e.g.:

$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv) | grep -v '^ $' | tr ' \n' ':,'
January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",

And updating the OP's code:

datfile=$(mktemp)
for FILE in data.csv
do
    paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < "${FILE}") | grep -v '^ $' | tr ' \n' ':,' > "${datfile}"

    # the terminator sits in column 0 because <<- only strips leading tabs, not spaces
    gnuplot -p <<EOF
    set datafile separator ","
    set xlabel "xlabel"
    set ylabel "ylabel"
    set title "graphTitle"
    plot "${datfile}" using $xcolumn:$ycolumn
EOF
done
'rm' -rf "${datfile}" > /dev/null 2>&1

It looks like gnuplot can accept data in various formats, including the following:

January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"

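NOTE: if the OP determines this is not an acceptable file format, I'm sure we can come up with something else ... the question just needs an update showing a sample of a valid file format with months and numbers.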

So if we can generate this dataset on the fly, we can feed it to gnuplot ...

First we let locale generate the months for us:

$ locale mon
January;February;March;April;May;June;July;August;September;October;November;December

Next we can transpose the single-line/multi-column datasets into multi-line/single-column datasets:

$ locale mon | tr ';' '\n'
January
February
March
April
May
June
July
August
September
October
November
December

$ tr ',' '\n' < data.csv
"3"
"5"
"6"
"5"
"4"
"6"
"7"
"6"
"4"
"4"
"3"
"3"

From here we can paste the two datasets together, using a space as the column delimiter:

$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv)
January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"

The last step is to write this to a (tmp) file, e.g.:

$ datfile=$(mktemp)
$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv) | grep -v '^ $' > "${datfile}"
$ cat "${datfile}"
January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"

NOTE: the grep -v '^ $' removes the extra line at the end, which comes from the last comma (,) in data.csv.
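To see where that extra line comes from, here is a small check. It is only a sketch: it recreates the sample data.csv in a scratch directory, and it uses the numbers 1 to 12 (via seq) as a stand-in for the month names so that it does not depend on the locale command.

```shell
#!/usr/bin/env bash
# Recreate the sample file from the question in a scratch directory (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' '"3","5","6","5","4","6","7","6","4","4","3","3",' > "${workdir}/data.csv"

# Splitting on commas leaves an empty 13th line after the trailing comma.
line_count=$(tr ',' '\n' < "${workdir}/data.csv" | wc -l | tr -d ' ')
echo "lines after tr: ${line_count}"

# Stand-in for the 12 month names (assumption: any 12-line file behaves the same here).
seq 1 12 > "${workdir}/left.txt"

# paste pairs the empty 13th line with nothing on the left, leaving a lone space.
last_line=$(tr ',' '\n' < "${workdir}/data.csv" | paste -d' ' "${workdir}/left.txt" - | tail -n 1)
echo "last pasted line: [${last_line}]"

rm -rf "${workdir}"
```

The last pasted line consists of a single space, which is exactly what the pattern '^ $' matches.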

From here "${datfile}" can be fed to gnuplot as needed and deleted once it is no longer needed, e.g.:

$ gnuplot ... "${datfile}" ...
$ 'rm' -rf "${datfile}" > /dev/null 2>&1
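Putting the steps together, a minimal sketch could look like the following. Several things here are my own assumptions rather than part of the answer above: the scratch directory and the file name months.dat are made up for illustration, LC_ALL=C is forced so that locale mon prints English names, the quotes are stripped from the values, and a trap handles the cleanup instead of a trailing rm.

```shell
#!/usr/bin/env bash
# Sketch: build a two-column "Month value" file for gnuplot and clean up automatically.
workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT          # remove scratch files even if a step fails

# Sample input (stands in for the real data.csv from the question).
printf '%s\n' '"3","5","6","5","4","6","7","6","4","4","3","3",' > "${workdir}/data.csv"

datfile="${workdir}/months.dat"          # hypothetical file name
LC_ALL=C locale mon | tr ';' '\n' > "${workdir}/months.txt"

# Strip the quotes, split on commas, drop the empty trailing line, then pair with the months.
tr -d '"' < "${workdir}/data.csv" | tr ',' '\n' | grep -v '^$' \
    | paste -d' ' "${workdir}/months.txt" - > "${datfile}"

result=$(cat "${datfile}")
echo "${result}"
```

With gnuplot's default whitespace separator, a file in this shape can be plotted with something like plot "months.dat" using 2:xtic(1) with boxes, which takes the y value from column 2 and the tick label from column 1. Because the trap deletes the scratch directory on exit, gnuplot has to run before the script ends.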

An awk solution built around the same logic as the paste answer, but eliminating some subprocesses (e.g., grep, multiple tr's) ...

awk -F'[;,]' '                             # input field delimiters are ";" and ","
BEGIN   { OFS=":" ; ORS="," }              # set output field delimiter as ":" and output record delimiter as ","
FNR==NR { for (i=1 ; i<=NF ; i++)          # loop through fields from first file ...
          month[i]=$(i)                    # store in our month[] array
          next                             # skip to next input line
        }
        { for (i=1 ; i< NF ; i++)          # loop through fields from second file ...
          print month[i],$(i)              # print month and current field
        }
' <(locale mon) data.csv

This generates:

January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",
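If a one-pair-per-line layout is preferred over the single-line output, the same two-file awk idea can be adapted. This is only a sketch under my own assumptions: the scratch directory and file names are illustrative, LC_ALL=C is forced to get English month names, and the quotes around the values are removed.

```shell
#!/usr/bin/env bash
# Sketch: two-file awk, emitting one "Month value" pair per line.
workdir=$(mktemp -d)
LC_ALL=C locale mon > "${workdir}/months.txt"
printf '%s\n' '"3","5","6","5","4","6","7","6","4","4","3","3",' > "${workdir}/data.csv"

pairs=$(awk -F'[;,]' '
FNR==NR { for (i=1; i<=NF; i++) month[i]=$i     # first file: remember the month names
          next }
        { for (i=1; i<=NF; i++) {               # second file: walk the values
              if ($i == "") continue            # skip the empty field after the trailing comma
              gsub(/"/, "", $i)                 # drop the surrounding quotes
              print month[i], $i
          }
        }
' "${workdir}/months.txt" "${workdir}/data.csv")

echo "${pairs}"
rm -rf "${workdir}"
```

Skipping empty fields replaces the i< NF loop bound used above, so the script should also cope with a file that has no trailing comma.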

Rolling this into the OP's code:

datfile=$(mktemp)
for FILE in data.csv
do
    awk -F'[;,]' 'BEGIN{OFS=":";ORS=","} FNR==NR {for (i=1;i<=NF;i++) mon[i]=$(i); next} {for (i=1;i<NF;i++) print mon[i],$(i)}' <(locale mon) "${FILE}" > "${datfile}"

    # the terminator sits in column 0 because <<- only strips leading tabs, not spaces
    gnuplot -p <<EOF
    set datafile separator ","
    set xlabel "xlabel"
    set ylabel "ylabel"
    set title "graphTitle"
    plot "${datfile}" using $xcolumn:$ycolumn
EOF
done
'rm' -rf "${datfile}" > /dev/null 2>&1

If you don't mind typing the month names, I think this is the simplest approach. For clarity, the data is shown inline rather than read from a file.

$DATA << EOD
"3","5","6","5","4","6","7","6","4","4","3","3",
EOD

set datafile sep comma
set xrange [0:13]
unset key

array Month[12] = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

plot for [N=1:12] $DATA using (N):(column(N)):xticlabel(Month[N]) with impulse lw 5

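If you would rather not type the month names, the following should generate the equivalent. "%b" generates abbreviated month names as shown above; "%B" generates full month names.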

Month(i) = strftime("%b", i * 3600.*24.*28.)
plot for [N=1:12] $DATA using (N):(column(N)):xticlabel(Month(N)) with impulse lw 5
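The Month(i) function works because strftime treats its argument as seconds since the epoch (1970-01-01 in current gnuplot versions, as far as I know), and N times 28 days happens to land inside month N for every N from 1 to 12. A quick way to convince yourself outside gnuplot is the shell sketch below. It assumes a date command that accepts -d @SECONDS, which GNU coreutils provides.

```shell
#!/usr/bin/env bash
# Sketch: check that N * 28 days after the Unix epoch falls in month N for N = 1..12.
names=""
for n in $(seq 1 12); do
    secs=$(( n * 3600 * 24 * 28 ))
    names="${names}$(LC_ALL=C date -u -d "@${secs}" +%b) "
done
names=${names% }       # trim the trailing space
echo "${names}"
```

Note that the trick depends on starting at N=1: with N=0 and N=1 both offsets would still fall in January.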

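If you don't want to use the loop syntax, you can read the CSV file as a 1x12 matrix. For long month names, you can also use gnuplot's strftime function with the format specifier "%B".

Here is the gnuplot script.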

set key noautotitle
set datafile separator comma
set yrange [0:10]
set xrange [-1:12]
set xtics rotate by -45
set grid xtics

# This function generates the names "January", "February", ... 
#                from the integer value 0, 1, ...
#
monthname(i) = strftime("%B",strptime("%m",sprintf("%i",i+1)))

# `matrix every ...` specifier tells to read the data as a 1x12 matrix.
#
plot "data.csv" matrix every :::0:11:0 using 1:3:xtic(monthname($1)) with linespoints pt 7 
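The monthname(i) function takes a different route from the seconds-based trick: it formats the index i+1 as a month number, parses it with "%m", and prints the result with "%B". A rough shell analogue of that mapping is sketched below. It assumes a date command that understands -d with an ISO date (GNU coreutils does), and the year and day are arbitrary fillers that gnuplot's strptime does not need.

```shell
#!/usr/bin/env bash
# Sketch: map the integers 0..11 to full month names by parsing a month number.
full=$(for i in $(seq 0 11); do
           mm=$(printf '%02d' $(( i + 1 )))            # 0 -> "01", 11 -> "12"
           LC_ALL=C date -u -d "1970-${mm}-01" +%B     # parse the month, print its full name
       done | paste -s -d' ' -)
echo "${full}"
```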

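Another solution. Because you have a trailing comma and gnuplot expects a number after it, you will get the warning "warning: matrix contains missing or undefined values", which you can ignore. For the same reason, you should limit the x maximum to something smaller than 12. In your case, replace $Data with your filename 'data.csv'. You may want to set another locale (check help locale) to get the month names in other languages.

Code: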

### plot monthly data
reset session

$Data <<EOD
"3","5","6","5","4","6","7","6","4","4","3","3",
EOD

set datafile separator comma
set boxwidth 0.8
set style fill solid 0.5
set yrange[0:10]
set xrange[-0.9:11.9]
myMonth(i) = strftime("%b",i*3600*24*31)   # get month name as abbreviation, use %B for full name

plot $Data matrix u 1:3:xtic(myMonth($1)) w boxes title "my data"
### end of code
