Copying a substring to the string below, conditional on the contents of both strings
My data looks like this:
A toberevised
8: <NA>
9: <NA>
10: Number of returns
11: Number of joint returns
12: Number with paid preparer's signature
13: Number of exemptions
14: Adjusted gross income (AGI) [3]
**15: Salaries and wages in AGI: [4] Number
16: Amount
17: Taxable interest: Number
18: Amount
19: Ordinary dividends: Number
20: Amount**
21: <NA>
22: <NA>
23: Number of returns
24: Number of joint returns
25: Number with paid preparer's signature
26: Number of exemptions
DF <- structure(list(toberevised = c("[Money amounts are in thousands of dollars]",
NA, NA, NA, "Item", NA, NA, NA, NA, "Number of returns", "Number of joint returns",
"Number with paid preparer's signature", "Number of exemptions",
"Adjusted gross income (AGI) [3]", "Salaries and wages in AGI: [4] Number",
"Amount", "Taxable interest: Number", "Amount", "Ordinary dividends: Number",
"Amount")), row.names = c(NA, -20L), class = c("data.table",
"data.frame"))
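I would like to write code that copies the part before the ":" in rows 15, 17 and 19 to the front of the "Amount" row that follows each of them, so that I get: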
A toberevised
8: <NA>
9: <NA>
10: Number of returns
11: Number of joint returns
12: Number with paid preparer's signature
13: Number of exemptions
14: Adjusted gross income (AGI) [3]
**15: Salaries and wages in AGI: [4] Number
16: Salaries and wages in AGI: Amount
17: Taxable interest: Number
18: Taxable interest: Amount
19: Ordinary dividends: Number
20: Ordinary dividends: Amount**
21: <NA>
22: <NA>
23: Number of returns
24: Number of joint returns
25: Number with paid preparer's signature
26: Number of exemptions
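I have tried some very clumsy solutions, such as copying the cells that contain ":" to a new column, filling that column down, and then trying to remove "Number" from it so that I can concatenate the two columns, after which I would still have to clean up all the debris.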
library(data.table)  # setDT(), :=
library(zoo)         # na.locf()
DF <- setDT(DF)[grepl(":", DF$toberevised), type:=toberevised]
DF$type <- na.locf(DF$type, na.rm=FALSE)
DF$type <- gsub("[[:punct:]]*Number[[:punct:]]*", "", DF$type)
DF$fullname <- paste(DF$type,DF$toberevised)
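Apart from not working, it is also rather cumbersome.
What would be a better way to do this? I was thinking of checking whether a cell contains ": Number" and the cell below it contains "Amount", and then pasting the substring before the ":" in front of the string below. But I do not know how to write something like that.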
One possible solution:
#Sample data
Sno <- c(1:8)
Values <- c("Number of returns", "Number of joint returns", "Salaries and wages in AGI: [4] Number", "Amount", "Taxable interest: Number", "Amount", "Ordinary dividends: Number", "Amount")
df <- data.frame(Sno, Values, stringsAsFactors = FALSE)
df
# Sno Values
# 1 Number of returns
# 2 Number of joint returns
# 3 Salaries and wages in AGI: [4] Number
# 4 Amount
# 5 Taxable interest: Number
# 6 Amount
# 7 Ordinary dividends: Number
# 8 Amount
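# For every "Amount" row that directly follows a row containing "Number",
# prepend the text before the first ":" of the previous row.
for (i in seq_len(nrow(df))[-1]) {
  if (isTRUE(df[i, 2] == "Amount") && grepl("Number", df[i - 1, 2])) {
    df[i, 2] <- paste0(strsplit(df[i - 1, 2], ":", fixed = TRUE)[[1]][[1]], ": ", df[i, 2])
  }
}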
#Updated dataframe
# Sno Values
# 1 Number of returns
# 2 Number of joint returns
# 3 Salaries and wages in AGI: [4] Number
# 4 Salaries and wages in AGI: Amount
# 5 Taxable interest: Number
# 6 Taxable interest: Amount
# 7 Ordinary dividends: Number
# 8 Ordinary dividends: Amount
Hope this helps.
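If the column is long, the same rule as the loop above can probably be applied in one vectorised pass with base R only. This is just a sketch: prefix_amount is a hypothetical helper name, not something from the question or from any package, and it assumes the label to copy is whatever comes before the first ":" of the previous row.

```r
# Hypothetical helper: vectorised version of the row-by-row loop.
prefix_amount <- function(x) {
  # Value of the previous element; the first element has no predecessor.
  prev <- c(NA_character_, head(x, -1))
  # TRUE only where this element is exactly "Amount" and the previous
  # element mentions "Number". NA elements never match.
  hit <- !is.na(x) & x == "Amount" & !is.na(prev) &
    grepl("Number", prev, fixed = TRUE)
  # Keep everything before the first ":" of the previous element,
  # then append ": Amount".
  x[hit] <- paste0(sub(":.*", "", prev[hit]), ": Amount")
  x
}

Values <- c("Number of returns", "Number of joint returns",
            "Salaries and wages in AGI: [4] Number", "Amount",
            "Taxable interest: Number", "Amount",
            "Ordinary dividends: Number", "Amount")
prefix_amount(Values)
```

On a data frame this would be used as df$Values <- prefix_amount(df$Values). Like the loop, it also rewrites an "Amount" that follows a "Number ..." row without a colon, so tighten the grepl() pattern if that can occur in your data.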
You can do this:
#Get the index of row where current row has "Amount" and previous had "Number"
library(data.table)
inds <- which(DF$toberevised == 'Amount' & shift(grepl('Number', DF$toberevised)))
#Paste those rows with revised value from previous row.
DF$toberevised[inds] <- paste0(sub(':.*', '', DF$toberevised[inds - 1]),
': Amount')
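Since DF is already a data.table, the same idea could also be written as an update by reference with :=. The sketch below is only an illustration: DT is a shortened stand-in for the question's table, and prev is a temporary helper column that I am introducing here, not something from the original data.

```r
library(data.table)

# Shortened stand-in for the question's table (values copied from the question).
DT <- data.table(toberevised = c(NA, "Number of returns",
                                 "Salaries and wages in AGI: [4] Number", "Amount",
                                 "Taxable interest: Number", "Amount"))

# Previous row's value aligned with the current row (NA for the first row).
DT[, prev := shift(toberevised)]

# Update only the rows that are exactly "Amount" and follow a row containing
# "Number"; NA results in the row filter are treated as FALSE by data.table.
DT[toberevised == "Amount" & grepl("Number", prev),
   toberevised := paste0(sub(":.*", "", prev), ": Amount")]

# Drop the temporary helper column again.
DT[, prev := NULL]
DT
```

This avoids the DF$ prefixes and the inds - 1 arithmetic, at the cost of a temporary column.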