R: Make a conditional column based on a conditional row
I have a dataset in long format, but it contains separator rows, as in this example:
<style type="text/css">
table.tableizer-table {
font-size: 12px;
border: 1px solid #CCC;
font-family: Arial, Helvetica, sans-serif;
}
.tableizer-table td {
padding: 4px;
margin: 3px;
border: 1px solid #CCC;
}
.tableizer-table th {
background-color: #104E8B;
color: #FFF;
font-weight: bold;
}
</style>
<table class="tableizer-table">
<thead><tr class="tableizer-firstrow"><th>First year</th><th> </th><th> </th><th> </th><th> </th><th> </th><th> </th><th> </th><th> </th><th> </th><th> </th></tr></thead><tbody>
<tr><td>8</td><td>101</td><td>6</td><td>OBL</td><td>Hist1</td><td>9</td><td>ORD</td><td>2020</td><td> </td><td>2081355</td><td>106</td></tr>
<tr><td>8</td><td>102</td><td>6</td><td>OBL</td><td>Eco1</td><td>6</td><td>ORD</td><td>2020</td><td> </td><td>2081395</td><td>106</td></tr>
<tr><td>Second year</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr>
<tr><td>8</td><td>204</td><td>6</td><td>OBL</td><td>Hist2</td><td>5</td><td>ORD</td><td>2021</td><td> </td><td>2219787</td><td>202</td></tr>
<tr><td>8</td><td>204</td><td>6</td><td>OBL</td><td>Eco2</td><td>NP</td><td>ORD</td><td>2022</td><td> </td><td>2492841</td><td>206</td></tr>
</tbody></table>
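I know how to create conditional variables with mutate, case_when and ifelse. My expected result is to delete the separator rows and add a column based on the year, like this: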
<table class="tableizer-table">
<thead><tr class="tableizer-firstrow"><th>name1</th><th>name2</th><th>name3</th><th>name4</th><th>name5</th><th>name6</th><th>name7</th><th>name8</th><th>name9</th><th>name10</th><th>name11</th><th>year</th></tr></thead><tbody>
<tr><td>8</td><td>101</td><td>6</td><td>OBL</td><td>Hist1</td><td>9</td><td>ORD</td><td>2020</td><td> </td><td>2081355</td><td>106</td><td>1</td></tr>
<tr><td>8</td><td>102</td><td>6</td><td>OBL</td><td>Eco1</td><td>6</td><td>ORD</td><td>2020</td><td> </td><td>2081395</td><td>106</td><td>1</td></tr>
<tr><td>8</td><td>204</td><td>6</td><td>OBL</td><td>Hist2</td><td>5</td><td>ORD</td><td>2021</td><td> </td><td>2219787</td><td>202</td><td>2</td></tr>
<tr><td>8</td><td>204</td><td>6</td><td>OBL</td><td>Eco2</td><td>NP</td><td>ORD</td><td>2022</td><td> </td><td>2492841</td><td>206</td><td>2</td></tr>
</tbody></table>
Here is a small piece of code so you don't have to write the data yourself:
library(tibble)
df <- tribble(
~name1, ~name2,
"first year", NA,
"eco1", 'NP',
"hist1", '5',
"second year", NA,
"eco2", 'NP',
"hist2", '5'
)
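You can do this based on the text "year" in name1 or on the NA values in name2. Pick whichever fits your case. In both versions the separator rows are flagged, cumsum turns the flags into a running group number, and filter then drops the separator rows.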
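Based on "year":
library(dplyr)
df %>%
  # count the separator rows seen so far; this is the year index
  mutate(year = cumsum(grepl('year', name1))) %>%
  # drop the separator rows themselves
  filter(!grepl('year', name1))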
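Or based on the NA values in name2:
df %>%
  # each NA in name2 marks the start of a new year block
  mutate(year = cumsum(is.na(name2))) %>%
  # drop the rows where name2 is NA
  filter(!is.na(name2))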
Both return:
#  name1 name2  year
#  <chr> <chr> <int>
#1 eco1  NP        1
#2 hist1 5         1
#3 eco2  NP        2
#4 hist2 5         2
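To see why cumsum gives the year index, it may help to look at the intermediate vectors. The following is only a minimal base R sketch of the same idea, not a replacement for the dplyr code. The names is_header and result are made up for illustration, and the data frame is rebuilt here so the block runs on its own.

```r
# Toy data mirroring the question: a header row starts each year block.
df <- data.frame(
  name1 = c("first year", "eco1", "hist1", "second year", "eco2", "hist2"),
  name2 = c(NA, "NP", "5", NA, "NP", "5"),
  stringsAsFactors = FALSE
)

# TRUE on the separator rows, FALSE on the data rows.
is_header <- grepl("year", df$name1)
# TRUE FALSE FALSE TRUE FALSE FALSE

# TRUE counts as 1 and FALSE as 0, so the running sum
# goes up by one at every separator row.
year <- cumsum(is_header)
# 1 1 1 2 2 2

# Keep only the data rows and attach their year index.
result <- df[!is_header, ]
result$year <- year[!is_header]
rownames(result) <- NULL
print(result)
```

The year column of result is 1, 1, 2, 2, which matches the dplyr output. Note that this numbers the blocks in order of appearance. It does not read "first" or "second" from the header text.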