Read a CSV file whose cells contain several numbers
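I have data like this. For the Male and Female columns, I only need the first line of each cell (the highlighted one). How can I read only these numbers and exclude the rest?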
df <- structure(list(
col1 = c("First", "Frequency\nPercent", "CA", "TX"),
col2 = c("Sex_3585", "Male", "298026\n5\n9", "45678\n15\n89"),
col3 = c("", "Female", "57039\n10\n25", "64290\n100\n258")
),
class = "data.frame",
row.names = c(NA,-4L))
  col1                col2           col3
1 First               Sex_3585
2 Frequency\nPercent  Male           Female
3 CA                  298026\n5\n9   57039\n10\n25
4 TX                  45678\n15\n89  64290\n100\n258
First, I created a simple example of the data.
df <- structure(list(
col1 = c("First", "Frequency\nPercent", "CA", "TX"),
col2 = c("Sex_3585", "Male", "298026\n5\n9", "45678\n15\n89"),
col3 = c("", "Female", "57039\n10\n25", "64290\n100\n258")
),
class = "data.frame",
row.names = c(NA,-4L))
  col1                col2           col3
1 First               Sex_3585
2 Frequency\nPercent  Male           Female
3 CA                  298026\n5\n9   57039\n10\n25
4 TX                  45678\n15\n89  64290\n100\n258
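Second, after reading the file with read.csv, one option is to split the cells that contain newline characters (i.e. \n) into separate rows. We can then group by the first column and keep only the first row of each group.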
library(tidyverse)
df %>%
separate_rows(everything(), sep = "\n") %>%
group_by(col1) %>%
filter(row_number()==1)
Output
  col1      col2     col3
  <chr>     <chr>    <chr>
1 First     Sex_3585 ""
2 Frequency Male     "Female"
3 Percent   Male     "Female"
4 CA        298026   "57039"
5 TX        45678    "64290"
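If you would rather not load the tidyverse, a base R alternative might look like the sketch below. It is not part of the answer above: it simply drops everything after the first newline in every cell, then converts the data rows to numbers. The names first_line, res and counts are made up for illustration, and the code assumes the two header rows are always rows 1 and 2, as in the sample data.

```r
# Same sample data as in the question.
df <- structure(list(
  col1 = c("First", "Frequency\nPercent", "CA", "TX"),
  col2 = c("Sex_3585", "Male", "298026\n5\n9", "45678\n15\n89"),
  col3 = c("", "Female", "57039\n10\n25", "64290\n100\n258")
),
class = "data.frame",
row.names = c(NA, -4L))

# Hypothetical helper: keep only the text before the first newline.
# (?s) makes "." match newlines too, so the whole tail is removed.
first_line <- function(x) sub("(?s)\n.*", "", x, perl = TRUE)

# Apply the helper to every column while keeping the data frame shape.
res <- df
res[] <- lapply(df, first_line)

# Assumption: rows 1 and 2 are header rows, the remaining rows hold data.
counts <- data.frame(
  state  = res$col1[-(1:2)],
  male   = as.numeric(res$col2[-(1:2)]),
  female = as.numeric(res$col3[-(1:2)])
)
```

Unlike the separate_rows approach, this keeps one row per original row, so the "Frequency\nPercent" label collapses to "Frequency" instead of producing an extra "Percent" row.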