Copying an extracted date into rows, as long as the date does not change - R 3.1.1 / Windows

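I have spent some time on a copying problem. I am trying to extract a date from a data frame and copy it into the following rows for as long as the date does not change.

I found some discussions of related problems, but I could not get the right result with ifelse or for loops.

The data comes from the following site: http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2015&MONTH=01&FROM=0100&TO=0312&STNM=VECC

I cleaned the data by removing the unneeded footer information and adding a date column. VECCsample2 is the sample I took from the link above.
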
data <- read.table('./VECCsample2.txt', sep = "", na.strings = "99999", fill=TRUE, header=FALSE, stringsAsFactors=FALSE)

# removing unneeded rows
data <- data[data$V1 != "Station", ]
data <- data[data$V1 != "-----------------------------------------------------------------------------", ]
data <- data[data$V1 != "Showalter",]
data <- data[data$V1 != "Lifted",]
data <- data[data$V1 != "LIFT",]
data <- data[data$V1 != "SWEAT",]
data <- data[data$V1 != "K",]
data <- data[data$V1 != "Cross",]
data <- data[data$V1 != "Vertical",]
data <- data[data$V1 != "Observation",]
data <- data[data$V1 != "Totals",]
data <- data[data$V1 != "Convective",]
data <- data[data$V1 != "CAPE",]
data <- data[data$V1 != "CINS",]
data <- data[data$V1 != "Equilibrum",]
data <- data[data$V1 != "Level",]
data <- data[data$V1 != "LFCT",]
data <- data[data$V1 != "Bulk",]
data <- data[data$V6 != "Condensation",]
data <- data[data$V1 != "Mean",]
data <- data[data$V6 != "thickness:",]
data <- data[data$V1 != "Precipitable",]
data <- data[data$V1 != "hPa",]
data <- data[data$V1 != "PRES",]

# renaming headers
names(data) <- c("PRES", "HGHT", "TEMP", "DEWP", "RELH", "MIXR", "DRCT", "SKNT", "THTA", "THTE", "THTV") 
# adding empty date column 
data$date <- 0

Sample data

row.names   PRES    HGHT    TEMP    DEWP    RELH    MIXR    DRCT    SKNT    THTA    THTE    THTV    date
1   1   42809.0 VECC    Calcutta    Observations    at  00Z 01  January 2014            0
2   5   1004.0  6   28.2    26.0    88  21.64   180 3   301.0   365.1   304.9   0
3   6   1000.0  42  27.2    24.2    84  19.45   180 4   300.4   357.7   303.8   0
4   7   978.0   239 25.4    22.5    84  17.85   180 11  300.4   353.0   303.6   0
5   8   960.0   403 23.8    21.0    84  16.61   215 17  300.4   349.3   303.4   0
6   9   950.0   496 30.4    18.4    49  14.22   235 21  308.0   351.4   310.7   0
7   120 42809.0 VECC    Calcutta    Observations    at  00Z 02  January 2014            0
8   124 1005.0  6   26.2    23.3    84  18.30   45  2   298.9   352.5   302.2   0
9   125 1000.0  50  25.0    20.5    76  15.43   60  6   298.1   343.2   300.9   0
10  126 974.0   282 23.4    20.6    84  15.95   108 14  298.8   345.4   301.6   0
11  127 965.0   364 23.5    20.8    85  16.35   125 17  299.7   347.7   302.6   0

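I am trying to extract the date (01 January 2014) and copy it into the date column. 01 January 2014 should be copied down until the next "VECC" separator row is reached. The output should look something like this: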

row.names   PRES    HGHT    TEMP    DEWP    RELH    MIXR    DRCT    SKNT    THTA    THTE    THTV    date
1   1   42809.0 VECC    Calcutta    Observations    at  00Z 01  January 2014            0
2   5   1004.0  6   28.2    26.0    88  21.64   180 3   301.0   365.1   304.9   01 January 2014
3   6   1000.0  42  27.2    24.2    84  19.45   180 4   300.4   357.7   303.8   01 January 2014
4   7   978.0   239 25.4    22.5    84  17.85   180 11  300.4   353.0   303.6   01 January 2014
5   8   960.0   403 23.8    21.0    84  16.61   215 17  300.4   349.3   303.4   01 January 2014
6   9   950.0   496 30.4    18.4    49  14.22   235 21  308.0   351.4   310.7   01 January 2014
7   120 42809.0 VECC    Calcutta    Observations    at  00Z 02  January 2014            0
8   124 1005.0  6   26.2    23.3    84  18.30   45  2   298.9   352.5   302.2   02 January 2014
9   125 1000.0  50  25.0    20.5    76  15.43   60  6   298.1   343.2   300.9   02 January 2014
10  126 974.0   282 23.4    20.6    84  15.95   108 14  298.8   345.4   301.6   02 January 2014
11  127 965.0   364 23.5    20.8    85  16.35   125 17  299.7   347.7   302.6   02 January 2014

I tried different combinations of if, ifelse, and for loops, but none of them did what I wanted. The attempts that came closest were:

pattern = "VECC"
name_row <- grep(pattern, data$HGHT)
name_row_date <- data[name_row,]
if (grep(pattern, data$HGHT)) data$date <- name_row_date$DRCT

for (index in 1:nrow(data)) { 
    row = data[index, ] 

    if (row[index,3] == "VECC") {
        date <- row[row$SKNT,]
        data$date <- "NA"
    }
    if (row[index,3] != "VECC")
        data$date <- date
}

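Has anyone worked on a similar exercise, or does anyone have a hint on how to solve this?

Thank you in advance.

You can take advantage of the HTML tags in the document to get the dates, as follows.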

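library(XML)
# Same sounding listing as in the question
file <- "http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2015&MONTH=01&FROM=0100&TO=0312&STNM=VECC"
# Parse the page; each sounding table is introduced by an <h2> heading
wx <- htmlTreeParse(file, useInternalNodes = TRUE)
heads <- xpathSApply(xmlRoot(wx), "//h2", xmlValue)
# Split each heading into a station part and a date part
h <- unlist(strsplit(heads, " Observations at "))
# The even-numbered elements of h hold the date strings
sounding.dates <- strptime(h[seq_along(h) %% 2 == 0], "%HZ %d %b %Y", tz = "GMT")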

This gives a vector of dates, one for each sounding table.
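
To attach these dates to the data rows, one option is to count the header rows with cumsum() and use the running count as an index into the date vector. The following is a minimal sketch under stated assumptions, not tested against the real file: the toy data frame and the hard-coded sounding.dates are stand-ins for the real objects, the first row is assumed to be a header row, and the dates are assumed to come in the same order as the soundings in the file.

```r
# Toy stand-in for the cleaned table: header rows have HGHT == "VECC"
data <- data.frame(
  PRES = c("42809.0", "1004.0", "1000.0", "42809.0", "1005.0"),
  HGHT = c("VECC", "6", "42", "VECC", "6"),
  stringsAsFactors = FALSE
)

# Stand-in for the parsed dates: one per sounding, in file order
sounding.dates <- as.POSIXct(c("2014-01-01 00:00", "2014-01-02 00:00"), tz = "GMT")

# %in% returns FALSE (not NA) for missing values, so cumsum() stays well defined
is_header <- data$HGHT %in% "VECC"

# Running count of header rows = index of the sounding each row belongs to
block <- cumsum(is_header)

# Guard the assumptions: data starts with a header, one date per sounding
stopifnot(block[1] == 1, max(block) == length(sounding.dates))

# Look up each row's date, then blank out the header rows themselves
data$date <- sounding.dates[block]
data$date[is_header] <- NA
```

With the real objects, as.POSIXct(sounding.dates) converts the strptime() result to a class that is convenient to store in a data frame column. The header rows can then be dropped with data[!is_header, ] if they are no longer needed.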

The station identifier is also in the heads vector.
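
As a rough illustration of pulling the station out of the headings, the sketch below works on two hard-coded example strings rather than the live page. The heading layout (station number, identifier, name, then " Observations at " and the time) is an assumption based on the split used above, and the names station.number and station.id are made up for this example.

```r
# Hypothetical headings in the layout assumed by the strsplit() call above
heads <- c("42809 VECC Calcutta Observations at 00Z 01 Jan 2015",
           "42809 VECC Calcutta Observations at 12Z 01 Jan 2015")

# Keep the list structure so each heading stays paired with its own parts
parts <- strsplit(heads, " Observations at ", fixed = TRUE)

# First part of each heading: station number, identifier, and name
station.text <- vapply(parts, `[`, character(1), 1)

# Split on spaces and take the first two fields
station.fields <- strsplit(station.text, " ", fixed = TRUE)
station.number <- vapply(station.fields, `[`, character(1), 1)
station.id     <- vapply(station.fields, `[`, character(1), 2)
```

Headings that do not follow this layout (for example, a station without a letter identifier) would need a different split.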