Copying an extracted date into each row, as long as the date does not change - R 3.1.1 / Windows
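I have spent some time trying to solve a copying problem. I am trying to extract a date from a data frame and copy it into a new column on each row, for as long as the date does not change.
I found some discussions of different problems, but I could not get correct results with ifelse or for loops.
I cleaned the data by removing the rows I do not need (footer information and the like) and adding a date column. VECCsample2 is the sample I use,
taken from the link above.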
data <- read.table('./VECCsample2.txt', sep = "", na.strings = "99999", fill=TRUE, header=FALSE, stringsAsFactors=FALSE)
# removing unneeded rows
data <- data[data$V1 != "Station", ]
data <- data[data$V1 != "-----------------------------------------------------------------------------", ]
data <- data[data$V1 != "Showalter",]
data <- data[data$V1 != "Lifted",]
data <- data[data$V1 != "LIFT",]
data <- data[data$V1 != "SWEAT",]
data <- data[data$V1 != "K",]
data <- data[data$V1 != "Cross",]
data <- data[data$V1 != "Vertical",]
data <- data[data$V1 != "Observation",]
data <- data[data$V1 != "Totals",]
data <- data[data$V1 != "Convective",]
data <- data[data$V1 != "CAPE",]
data <- data[data$V1 != "CINS",]
data <- data[data$V1 != "Equilibrum",]
data <- data[data$V1 != "Level",]
data <- data[data$V1 != "LFCT",]
data <- data[data$V1 != "Bulk",]
data <- data[data$V6 != "Condensation",]
data <- data[data$V1 != "Mean",]
data <- data[data$V6 != "thickness:",]
data <- data[data$V1 != "Precipitable",]
data <- data[data$V1 != "hPa",]
data <- data[data$V1 != "PRES",]
# renaming headers
names(data) <- c("PRES", "HGHT", "TEMP", "DEWP", "RELH", "MIXR", "DRCT", "SKNT", "THTA", "THTE", "THTV")
# adding empty date column
data$date <- 0
Sample data:
row.names PRES HGHT TEMP DEWP RELH MIXR DRCT SKNT THTA THTE THTV date
1 1 42809.0 VECC Calcutta Observations at 00Z 01 January 2014 0
2 5 1004.0 6 28.2 26.0 88 21.64 180 3 301.0 365.1 304.9 0
3 6 1000.0 42 27.2 24.2 84 19.45 180 4 300.4 357.7 303.8 0
4 7 978.0 239 25.4 22.5 84 17.85 180 11 300.4 353.0 303.6 0
5 8 960.0 403 23.8 21.0 84 16.61 215 17 300.4 349.3 303.4 0
6 9 950.0 496 30.4 18.4 49 14.22 235 21 308.0 351.4 310.7 0
7 120 42809.0 VECC Calcutta Observations at 00Z 02 January 2014 0
8 124 1005.0 6 26.2 23.3 84 18.30 45 2 298.9 352.5 302.2 0
9 125 1000.0 50 25.0 20.5 76 15.43 60 6 298.1 343.2 300.9 0
10 126 974.0 282 23.4 20.6 84 15.95 108 14 298.8 345.4 301.6 0
11 127 965.0 364 23.5 20.8 85 16.35 125 17 299.7 347.7 302.6 0
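I am trying to extract the date (01 January 2014) and copy it into the date column. 01 January 2014 needs to be copied down until the next "VECC" delimiter row is read. The output should look like this: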
row.names PRES HGHT TEMP DEWP RELH MIXR DRCT SKNT THTA THTE THTV date
1 1 42809.0 VECC Calcutta Observations at 00Z 01 January 2014 0
2 5 1004.0 6 28.2 26.0 88 21.64 180 3 301.0 365.1 304.9 01 January 2014
3 6 1000.0 42 27.2 24.2 84 19.45 180 4 300.4 357.7 303.8 01 January 2014
4 7 978.0 239 25.4 22.5 84 17.85 180 11 300.4 353.0 303.6 01 January 2014
5 8 960.0 403 23.8 21.0 84 16.61 215 17 300.4 349.3 303.4 01 January 2014
6 9 950.0 496 30.4 18.4 49 14.22 235 21 308.0 351.4 310.7 01 January 2014
7 120 42809.0 VECC Calcutta Observations at 00Z 02 January 2014 0
8 124 1005.0 6 26.2 23.3 84 18.30 45 2 298.9 352.5 302.2 02 January 2014
9 125 1000.0 50 25.0 20.5 76 15.43 60 6 298.1 343.2 300.9 02 January 2014
10 126 974.0 282 23.4 20.6 84 15.95 108 14 298.8 345.4 301.6 02 January 2014
11 127 965.0 364 23.5 20.8 85 16.35 125 17 299.7 347.7 302.6 02 January 2014
I tried different options with if, ifelse, and for loops, but none of them gave me what I want.
The attempts that came closest are:
pattern = "VECC"
name_row <- grep(pattern, data$HGHT)
name_row_date <- data[name_row,]
if (grep(pattern, data$HGHT)) data$date <- name_row_date$DRCT
or
for (index in 1:nrow(data)) {
row = data[index, ]
if (row[index,3] == "VECC") {
date <- row[row$SKNT,]
data$date <- "NA"
}
if (row[index,3] != "VECC")
data$date <- date
}
Has anyone done a similar exercise, or does anyone have a hint on how to solve this?
Thank you in advance.
You can take advantage of the HTML tags in the document to get the dates, as follows.
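library(XML)
file <- "http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2015&MONTH=01&FROM=0100&TO=0312&STNM=VECC"
# parse the page into an internal document so XPath queries can be run on it
wx <- htmlTreeParse(file, useInternalNodes = TRUE)
# text of every <h2> heading on the page
heads <- xpathSApply(xmlRoot(wx), "//h2", xmlValue)
# split each heading into the station part and the time/date part
h <- unlist(strsplit(heads, " Observations at "))
# the even-numbered elements of h hold the time/date strings
sounding.dates <- strptime(h[!seq_along(h) %% 2], "%HZ %d %b %Y", tz = "GMT")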
This gives a vector of dates corresponding to the tabulated soundings.
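To see what the split and the date format do without fetching the page, here is a small offline sketch. The two header strings are hypothetical stand-ins for the h2 text (their exact wording on the live page is an assumption), and %b is assumed to run in an English locale.

```r
# Hypothetical header strings, assumed to have the same shape as the <h2> text
heads <- c("42809 VECC Calcutta Observations at 00Z 01 Jan 2015",
           "42809 VECC Calcutta Observations at 12Z 01 Jan 2015")

# Each header splits into two pieces: station text, then time and date
parts <- strsplit(heads, " Observations at ", fixed = TRUE)
station <- vapply(parts, `[`, character(1), 1)
stamp <- vapply(parts, `[`, character(1), 2)

# "%HZ" reads the hour followed by a literal Z; "%b" is the abbreviated month
sounding.dates <- as.POSIXct(strptime(stamp, "%HZ %d %b %Y", tz = "GMT"))
print(sounding.dates)
```

Splitting per header with vapply keeps the station text and the date text in separate vectors, which avoids the even/odd indexing step.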
The station identification is also in the heads vector.
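If you would rather stay with the data frame from the question, the date can also be copied down without a loop. The following is only a sketch on a toy data frame: it assumes the cleaned data has the column layout shown in the question, so that on every "VECC" row the date parts sit in DRCT, SKNT and THTA. The names is_header, header_dates, block and soundings are made up for this illustration.

```r
# Toy data frame in the shape shown in the question (all columns are character
# because header rows and measurement rows share the same columns)
data <- data.frame(
  PRES = c("42809.0", "1004.0", "1000.0", "42809.0", "1005.0"),
  HGHT = c("VECC", "6", "42", "VECC", "6"),
  DRCT = c("01", "180", "180", "02", "45"),
  SKNT = c("January", "3", "4", "January", "2"),
  THTA = c("2014", "301.0", "300.4", "2014", "298.9"),
  stringsAsFactors = FALSE
)

# TRUE on the header rows; %in% never returns NA, unlike ==
is_header <- data$HGHT %in% "VECC"

# Date text taken from each header row, e.g. "01 January 2014"
header_dates <- paste(data$DRCT, data$SKNT, data$THTA)[is_header]

# cumsum() counts the headers seen so far, so it tells which header a row
# belongs to; rows before the first header get NA
block <- cumsum(is_header)
data$date <- c(NA, header_dates)[block + 1]

# Optionally drop the header rows once the date has been copied down
soundings <- data[!is_header, ]
print(soundings)
```

Unlike the desired output in the question, the header rows also receive their own date here. They are dropped in the last step, so this should not matter for the measurements.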