Counting islands in R along rows in csv
I asked this question previously, and Frank answered it. The original question:
I would like to count islands along rows in a .csv. I say "islands"
meaning consecutive non-blank entries on rows of the .csv. If there
are three non-blank entries in a row, I would like that to be counted
as 1 island. Anything less than three consecutive entries in a row
counts as 1 "non-island". I would then like to write the output to a
dataframe:
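I have slightly changed the input .csv so that it now contains multiple islands/gaps, which means a row is no longer just an "island" row or a "non-island" row. Does anyone have any suggestions?
Input .csv: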
Name,,,,,,,,,,,,,
Michael,,,1,1,1,,,,1,,,,
Peter,,,,1,1,,,,,,,,,
John,,,,,1,,,,,,,,,
Erin,,,,,1,1,,,,1,1,,,
Desired dataframe output:
Name,island,nonisland,
Michael,1,1,
Peter,0,1,
John,0,1,
Erin,0,2
Building on the code from the previous question, slightly modified to also get the nonisland column:
# sample data
df <- read.csv(text="
,,,,,,,,,,,,,
Michael,,,1,1,1,,,,1,,,,
Peter,,,,1,1,,,,,,,,,
John,,,,,1,,,,,,,,,
Erin,,,,,1,1,,,,1,1,,,")
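# run-length encode each row; rle() keeps every NA as its own run of
# length 1, so any run of length >= 3 is an island of non-blank cells
output <- stack(sapply(apply(df, 1, rle),
                       function(x) sum(x$lengths >= 3)))
# non-islands: runs of non-blank cells shorter than 3
output$nonisland <- sapply(apply(df, 1, rle),
                           function(x) sum(x$lengths[!is.na(x$values)] < 3))
names(output) <- c("island", "names", "nonisland")
output
# island names nonisland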
#1 1 Michael 1
#2 0 Peter 1
#3 0 John 1
#4 0 Erin 2
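
To see why the `!is.na(x$values)` filter is needed for the nonisland count, it may help to look at the run-length encoding of a single row. The sketch below types in Michael's row by hand with blanks as NA. The vector `michael` and the helper `count_runs` are illustrative names, not part of the code above:

```r
# Michael's row from the sample data, blanks written as NA
# (illustrative vector, typed in by hand)
michael <- c(NA, NA, 1, 1, 1, NA, NA, NA, 1, NA, NA, NA, NA)

r <- rle(michael)
# rle() never merges NAs: every blank cell is its own run of length 1
r$lengths
r$values

# hypothetical helper: count islands and non-islands for one row
count_runs <- function(row, min_len = 3) {
  r <- rle(row)
  filled <- !is.na(r$values)   # keep only runs of non-blank cells
  c(island    = sum(r$lengths[filled] >= min_len),
    nonisland = sum(r$lengths[filled] <  min_len))
}

count_runs(michael)
# island = 1, nonisland = 1
```

Without the NA filter, each blank cell would count as a run of length 1 and inflate the nonisland total. The `>= 3` test needs no filter, since an NA run can never be longer than 1.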