Imported CSV file remains a flat table in RStudio
I have a CSV file in the following format:
"Timestamp,Data,Quality"
"04/10/21 11:00:00,0.000000,0"
"04/10/21 11:02:00,0.014652,1"
"04/10/21 11:03:00,0.009768,1"
"04/10/21 11:04:00,0.014652,1"
.
.
.
To import it into R and convert it to a data frame, this is what I did:
library('tidyverse')
library('ggplot2')
library(dplyr)
mydata <- read.csv('C:/Users/tesge/Desktop/tnc results/syabas hydrotest/output/rev1/1/D2104100-3.csv', header = TRUE, sep = ",", stringsAsFactors = FALSE)
mydata
But the output I get is still not a proper table with 3 separate columns.
Result:
> head(mydata)
ï..Timestamp.Data.Quality
1 04/10/21 11:00:00,0.000000,0
2 04/10/21 11:02:00,0.014652,1
3 04/10/21 11:03:00,0.009768,1
4 04/10/21 11:04:00,0.014652,1
5 04/10/21 11:05:00,0.009768,1
6 04/10/21 11:07:00,0.000000,0
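I'm not sure why there are strange characters in the first column header.
I'm new to this. Any advice is appreciated. Thanks in advance.
Link to the CSV file: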
https://www.dropbox.com/s/jim1ryaq2azulqg/D2104100-3.csv?dl=0
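Every line is wrapped in quotation marks. Remove them and it should work.
In CSV, quotes delimit a field that contains commas, so each quoted line is read as a single field and your CSV effectively contains only one column.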
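To make the point concrete, here is a minimal base R sketch. The `lines` vector is a hypothetical sample that mirrors the layout shown in the question, not the real file; it is parsed once as-is and once with the quotes stripped.

```r
# Hypothetical sample mirroring the file layout in the question
lines <- c(
  '"Timestamp,Data,Quality"',
  '"04/10/21 11:00:00,0.000000,0"',
  '"04/10/21 11:02:00,0.014652,1"'
)

# Parsed as-is: each quoted line is one field, so there is a single column
as_is <- read.csv(text = lines, header = TRUE, stringsAsFactors = FALSE)
ncol(as_is)

# Strip the double quotes, then parse again: the commas now separate fields
unquoted <- gsub('"', "", lines, fixed = TRUE)
fixed <- read.csv(text = unquoted, header = TRUE, stringsAsFactors = FALSE)
ncol(fixed)
names(fixed)
```

For the real file, `lines` would presumably come from `readLines("D2104100-3.csv")` instead of being typed in.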
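library(tidyverse)
library(lubridate)
# Alternatively, read the file as a single column and split it on commas
read_csv("D2104100-3.csv") %>%
  separate(
    col = 1,
    into = c("Timestamp", "Data", "Quality"),
    sep = ","
  ) %>%
  # Timestamps are day/month/year; convert each column to its proper type
  mutate(
    Timestamp = dmy_hms(Timestamp),
    Data = as.numeric(Data),
    Quality = as.integer(Quality)
  )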
#>
#> -- Column specification --------------------------------------------------------
#> cols(
#> `Timestamp,Data,Quality` = col_character()
#> )
#> # A tibble: 1,466 x 3
#> Timestamp Data Quality
#> <dttm> <dbl> <int>
#> 1 2021-10-04 11:00:00 0 0
#> 2 2021-10-04 11:02:00 0.0147 1
#> 3 2021-10-04 11:03:00 0.00977 1
#> 4 2021-10-04 11:04:00 0.0147 1
#> 5 2021-10-04 11:05:00 0.00977 1
#> 6 2021-10-04 11:07:00 0 0
#> 7 2021-10-04 11:08:00 0.00977 1
#> 8 2021-10-04 11:14:00 0.0147 1
#> 9 2021-10-04 11:16:00 0.00977 1
#> 10 2021-10-04 11:22:00 0.00488 1
#> # ... with 1,456 more rows
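As for the strange characters in the first column header of the `read.csv` output (`ï..`): that pattern is typical of a UTF-8 byte order mark (BOM) at the start of the file. I have not confirmed that this is the case for this particular file, so treat the following as a sketch under that assumption. It writes a small hypothetical file that begins with a BOM, then reads it through a connection opened with `encoding = "UTF-8-BOM"`, which makes R drop the mark.

```r
# Build a small hypothetical file that starts with a UTF-8 byte order mark
path <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
body <- charToRaw('"Timestamp,Data,Quality"\n"04/10/21 11:00:00,0.000000,0"\n')
writeBin(c(bom, body), path)

# "UTF-8-BOM" asks R to remove the mark while reading
con <- file(path, encoding = "UTF-8-BOM")
raw_lines <- readLines(con)
close(con)

# Strip the surrounding quotes so the commas separate the fields
clean <- gsub('"', "", raw_lines, fixed = TRUE)
mydata <- read.csv(text = clean, header = TRUE, stringsAsFactors = FALSE)
names(mydata)

unlink(path)
```

`read.csv()` also accepts `fileEncoding = "UTF-8-BOM"` directly, which may be enough once the quotes have been removed from the file itself.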