Labeling extrema with stat_peaks/stat_valleys produces duplicate labels

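I extracted some longitudinal temperature data from a .nc weather dataset (using the ncdf4 package) and want to plot it with ggplot2 and its extension ggpmisc, including stat_peaks/stat_valleys. Strangely, every label reads the same: "Dec 1969".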

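My first suspicion was that the Date column I use for the x-axis is in the wrong format, but the x-axis displays correctly, and I checked the class of the input data to confirm. I also tried applying group=1, which changed nothing. I admit I am new to R and ggplot2 (I am more familiar with Python/Pandas) and do not fully understand what group=1 does, although it was necessary to make the line display correctly. Could that be what causes the wrong result?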

ggplot(df_denver, aes(x=Date, y=Temp..C., group=1)) + 
  geom_line() +
  scale_x_date(date_labels="%b %Y", date_breaks = "10 years", expand=c(0,0)) +
  stat_peaks(span=24, ignore_threshold = 0.80, color="red") +
  stat_peaks(geom="text", span=24, ignore_threshold = 0.80, x.label.fmt = "%b %Y", color="red", angle=90, hjust=-0.1) +
  stat_valleys(span=24, ignore_threshold = 0.55, color="blue") +
  stat_valleys(geom="text", span=24, ignore_threshold = 0.55, x.label.fmt = "%b %Y", color="blue", angle=90, hjust=1.1) +
  labs(x="Date", y="Temp (C)", title="Monthly Air Surface Temp for Denver from 1880 on")

Here are the first 100 rows of my dataset, which produce 3 peaks and 3 valleys, for illustration:

          Date    Temp..C.
1   1880-01-01  2.91287017
2   1880-02-01 -2.73586297
3   1880-03-01 -2.04185677
4   1880-04-01  0.37948364
5   1880-05-01  0.78548384
6   1880-06-01  0.44176754
7   1880-07-01 -1.06966007
8   1880-08-01 -0.53162575
9   1880-09-01 -0.29665694
10  1880-10-01 -2.08401608
11  1880-11-01 -9.46955109
12  1880-12-01 -1.52052176
13  1881-01-01 -2.53366208
14  1881-02-01 -1.88263988
15  1881-03-01 -0.06864686
16  1881-04-01  3.32321167
17  1881-05-01  1.75613177
18  1881-06-01  2.82765651
19  1881-07-01  1.76543093
20  1881-08-01  1.39409852
21  1881-09-01 -0.98141575
22  1881-10-01 -0.63346595
23  1881-11-01 -1.95676208
24  1881-12-01  3.28983855
25  1882-01-01 -0.64792717
26  1882-02-01  2.15854502
27  1882-03-01  2.91465187
28  1882-04-01  0.56616443
29  1882-05-01 -1.89441001
30  1882-06-01 -0.63149375
31  1882-07-01 -0.64883423
32  1882-08-01  0.82802373
33  1882-09-01  0.66150969
34  1882-10-01 -0.54113626
35  1882-11-01 -1.21310496
36  1882-12-01  1.30559540
37  1883-01-01 -1.41802752
38  1883-02-01 -6.39232874
39  1883-03-01  2.96320987
40  1883-04-01 -0.48122203
41  1883-05-01 -0.99614143
42  1883-06-01 -0.67229420
43  1883-07-01 -0.56595141
44  1883-08-01  0.52161294
45  1883-09-01  0.09190032
46  1883-10-01 -2.65115738
47  1883-11-01  1.88332438
48  1883-12-01 -0.19942272
49  1884-01-01 -0.34669495
50  1884-02-01 -2.21085262
51  1884-03-01  0.55254096
52  1884-04-01 -1.21859336
53  1884-05-01 -0.40969065
54  1884-06-01  0.44454563
55  1884-07-01  1.28881764
56  1884-08-01 -1.09331822
57  1884-09-01  1.52377772
58  1884-10-01  1.76569140
59  1884-11-01  0.72411090
60  1884-12-01 -4.64927006
61  1885-01-01 -1.03242493
62  1885-02-01 -0.79325873
63  1885-03-01  0.65910935
64  1885-04-01 -0.10181000
65  1885-05-01 -1.50702798
66  1885-06-01 -1.25801849
67  1885-07-01 -0.88433135
68  1885-08-01 -1.18410277
69  1885-09-01  0.15284735
70  1885-10-01 -0.91721576
71  1885-11-01  1.82403481
72  1885-12-01  1.68553519
73  1886-01-01 -4.21202993
74  1886-02-01  2.43953681
75  1886-03-01 -2.24947429
76  1886-04-01 -1.22557247
77  1886-05-01  2.66594267
78  1886-06-01 -0.21662886
79  1886-07-01  1.09909940
80  1886-08-01  0.63720244
81  1886-09-01 -0.11845125
82  1886-10-01  0.49225059
83  1886-11-01 -3.16969180
84  1886-12-01  2.18220520
85  1887-01-01  0.51427501
86  1887-02-01 -0.69656581
87  1887-03-01  3.96693182
88  1887-04-01  0.92614591
89  1887-05-01  1.66550291
90  1887-06-01  1.88668025
91  1887-07-01 -1.48990893
92  1887-08-01 -0.98355341
93  1887-09-01  0.93172997
94  1887-10-01 -1.12551820
95  1887-11-01  1.07798636
96  1887-12-01 -2.15758419
97  1888-01-01 -1.69266903
98  1888-02-01  2.55955243
99  1888-03-01 -1.83599913
100 1888-04-01  3.63450384

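As you can see, the labels produced by stat_peaks and stat_valleys are all identical and do not even fall within the range of the abbreviated data, instead of showing the correct dates from the x-axis.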

[Figure: plot titled "Monthly Air Surface Temp for Denver from 1880 on"]

The stat_peaks and stat_valleys labels work when the dates are in POSIXct format:

df_denver$Date <- as.POSIXct(df_denver$Date, format = "%Y-%m-%d")

ggplot(df_denver, aes(x=Date, y=Temp..C.)) +
  geom_line() +
  scale_x_datetime(date_labels="%b %Y", date_breaks = "1 year", expand=c(0,0)) +
  stat_peaks(span=24, ignore_threshold = 0.80, color="red") +
  stat_peaks(geom="text", span=24, ignore_threshold = 0.80, x.label.fmt = "%b %Y", color="red", angle=90, hjust=-0.1) +
  stat_valleys(span=24, ignore_threshold = 0.55, color="blue") +
  stat_valleys(geom="text", span=24, ignore_threshold = 0.55, x.label.fmt = "%b %Y", color="blue", angle=90, hjust=1.1) +
  labs(x="Date", y="Temp (C)", title="Monthly Air Surface Temp for Denver from 1880 on") +
  expand_limits(y = 6)

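Note that scale_x_date has been changed to scale_x_datetime. In addition, date_breaks was changed to "1 year" so that x-axis labels appear for the sample data, and expand_limits was added to keep the peak labels readable. group=1 is not needed.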
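
As a possible explanation for the "Dec 1969" labels, here is a minimal base-R sketch. It rests on an assumption that is not confirmed from the ggpmisc source: that the label formatting treats the numeric x value as seconds since 1970-01-01. A Date is stored as days since that epoch, so a date in the 1880s is only a few tens of thousands of units away from it, and reading those units as seconds lands within hours of the epoch. The variable names below are made up for illustration.

```r
# Illustration only: one way a Date value can end up labelled "Dec 1969".
d <- as.Date("1880-11-01")

# A Date is stored as a count of days since 1970-01-01 (negative before 1970).
days_since_epoch <- as.numeric(d)

# Assumption: the label code reads that number as SECONDS since 1970-01-01.
misread <- as.POSIXct(days_since_epoch, origin = "1970-01-01", tz = "UTC")

# Converting the Date itself to POSIXct keeps the intended instant.
converted <- as.POSIXct(d)

# Use a numeric format so the result does not depend on the locale.
misread_label   <- format(misread, "%Y-%m", tz = "UTC")
converted_label <- format(converted, "%Y-%m", tz = "UTC")
print(c(misread = misread_label, converted = converted_label))
```

Under that assumption the misread value falls on the last day of December 1969, while the converted value stays in November 1880, which matches the symptom in the question and why switching the column to POSIXct (together with scale_x_datetime) fixes the labels.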