How can I coalesce this dataset in R?
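I'm an ICU doctor doing research that involves pulling large amounts of patient-related data from our ICU computer systems (all ethically approved, etc.). As is usually the case, the extracted data needs cleaning and wrangling before it can be used properly.
I have been given a dataset that I have tidied up as best I can. Admittedly, my data science skills are very basic; although I am an enthusiastic R user, I am completely stuck and hope some of you can shed light on my problem and how to solve it. I have been unable to solve it myself, but I suspect it is something encountered frequently in time-series work.
At the moment my dataset contains multiple rows per time point. So at time X there is a separate row for heart rate, another for blood pressure, and so on. There are 46 observations, and they repeat at every time point (344 in total for this patient). Not every observation is recorded at every time point. I have provided a link to a screenshot of how the data is arranged here.
A sample of the data is here, in case it helps.
The furthest I have got is the following nested for-loop construct. It works for the first set of observations. I also tried a weird while-loop arrangement, which failed miserably.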
# First, add a group to the entire table specifying each time point that
# observations were conducted.
Patient_full$Verification_group <- as.numeric(as.factor(Patient_full$Time))
# Get the number of these groups
observation_times <- max(Patient_full$Verification_group)
# Create the bare bones of an overall table. This is the first row of the table.
patient_obs_final <- Patient_full[1,]
# Next I need to create a loop within loop. The master loop will coerce rows
# that have been created by the sub-loop.
for (i in 1 : observation_times) {
  # Isolate the overall observation group you are dealing with
  veri_group <- filter(Patient_full, Verification_group == i)
  # Start by getting some numbers to run the sub-loop
  lowest_obs_time_row <- min(veri_group$Row)
  highest_obs_time_row <- max(veri_group$Row)
  rows_in_obs_time <- (highest_obs_time_row - lowest_obs_time_row)
  # We can run the sub-loop now
  obs_at_timepoint <- Patient_full[lowest_obs_time_row, ]
  for (j in 1 : (rows_in_obs_time - 1)) {
    obs_at_timepoint <- coalesce(obs_at_timepoint, Patient_full[j + 1,])
  }
  patient_obs_final <- rbind(patient_obs_final, obs_at_timepoint)
}
patient_obs_final
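Things seem to fall apart once j becomes 2.
So, in the end, my goal is a single row per time point, containing whatever was recorded/observed at that time. I am at a loss and do not even know why my solution does not work. Any advice would be greatly appreciated.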
Try this dplyr solution:
library(dplyr)
dat %>%
  group_by(Time) %>%
  mutate(
    Cardiac.Rhythm = if_else(nzchar(Cardiac.Rhythm), Cardiac.Rhythm, NA_character_),
    across(-Row, ~ .[order(is.na(.))])
  ) %>%
  ungroup() %>%
  filter(rowSums(!is.na(.)) > 2) %>%
  as.data.frame()
# Row Time Base.excess..vt. Glucose.ABG Lactate.ABG PaCO2 PaO2 PH..ABG. Potassium.ABG Sodium.ABG Cardiac.Rhythm Arterial.Pressure.Diastolic Arterial.Pressure.Mean Arterial.Pressure.Systolic Heart.Rate
# 1 1 2017-09-04 17:00:00 -11.4 11.8 10.7 4.42 31.5 7.25 3.9 3.9 ST NA NA NA NA
# 2 10 2017-09-04 17:55:00 NA NA NA NA NA NA NA NA <NA> 54 68 92 123
# 3 14 2017-09-04 18:00:00 NA NA NA NA NA NA NA NA ST 60 71 86 123
# 4 23 2017-09-04 19:00:00 -9.3 10.1 9.7 4.22 15.0 7.30 3.9 3.9 ST 58 70 92 122
# 5 36 2017-09-04 20:00:00 -8.4 8.1 7.2 5.07 16.9 7.27 3.9 3.9 ST 62 80 117 NA
(I truncated the columns when pasting here ...)
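Walk-through:
- For some reason, Cardiac.Rhythm has empty strings instead of NA; the first mutate converts the empty strings "" to NA so that the later filtering works;
- .[order(is.na(.))] sorts the non-NA data first within each column;
- rowSums(!is.na(.)) > 2 ensures we have at least one non-NA value in a row (the > 2 accounts for Row and Time, which are never NA).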
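To make the reordering step concrete, here is a minimal base-R sketch on a made-up vector (the values are illustrative, not taken from the patient data). It relies only on order() placing FALSE before TRUE and leaving ties in their original order:

```r
# Toy column standing in for one measurement within a single Time group
# (hypothetical values, for illustration only).
x <- c(NA, 11.8, NA, NA)

# is.na(x) is FALSE for observed values and TRUE for missing ones.
# order() sorts FALSE before TRUE and keeps ties in their original order,
# so the observed values move to the front and the NAs trail behind.
idx <- order(is.na(x))
reordered <- x[idx]

print(idx)
print(reordered)
```

Applied to every column within each Time group, this pushes the observed values up to the first row(s) of the group, which is why the later row filter can drop the rows that are left with nothing but NA.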
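Notes:
- I'm assuming the data is one "person" per frame; if there is a patient ID in the data, be sure to add it to group_by(.) as well.
- Within a particular Time (and Patient_ID, if present), I'm assuming the order of rows does not matter (hence the per-column reordering of values).
- I do not assume that each column can have only one value per Time; while logically that should be the case, there may have been errors in the upstream data scraping/aggregation, so I am intentionally not assuming that x[!is.na(x)] (when grouped by Time) will always return length 1. Such a case will show up as two (or more) rows for that particular Time.
- I thought about using pivot_longer to do this, and it is still possible, but ... you have numeric and character data here, so working around that cleanly is a bit of a problem.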
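The third note can be illustrated with a small made-up frame in which heart rate was recorded twice at the same time. This is only a sketch: the columns HR and MAP and all values are hypothetical, and it assumes dplyr 1.0 or later for across().

```r
library(dplyr)

# Hypothetical toy data: at time "t1" the heart rate (HR) was recorded twice.
toy <- data.frame(
  Row  = 1:4,
  Time = c("t1", "t1", "t1", "t2"),
  HR   = c(NA, 120, 125, 110),
  MAP  = c(70, NA, NA, 65),
  stringsAsFactors = FALSE
)

kept <- toy %>%
  group_by(Time) %>%
  # Within each Time, move non-NA values to the top of every column.
  mutate(across(-Row, ~ .x[order(is.na(.x))])) %>%
  ungroup() %>%
  # Keep rows holding at least one value besides Row and Time.
  filter(rowSums(!is.na(.)) > 2)

print(kept)
```

Here "t1" survives as two rows (one per heart-rate value) instead of silently losing the second reading, which is the behaviour the note describes.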
Data
dat <- structure(list(Row = 1:47, Time = c("2017-09-04 17:00:00", "2017-09-04 17:00:00", "2017-09-04 17:00:00", "2017-09-04 17:00:00", "2017-09-04 17:00:00", "2017-09-04 17:00:00", "2017-09-04 17:00:00", "2017-09-04 17:00:00", "2017-09-04 17:00:00", "2017-09-04 17:55:00", "2017-09-04 17:55:00", "2017-09-04 17:55:00", "2017-09-04 17:55:00", "2017-09-04 18:00:00", "2017-09-04 18:00:00", "2017-09-04 18:00:00", "2017-09-04 18:00:00", "2017-09-04 18:00:00", "2017-09-04 18:00:00", "2017-09-04 18:00:00", "2017-09-04 18:00:00", "2017-09-04 18:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 19:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00", "2017-09-04 20:00:00" ), Base.excess..vt. 
= c(-11.4, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, -9.3, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, -8.4, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Glucose.ABG = c(NA, 11.8, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 10.1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 8.1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Lactate.ABG = c(NA, NA, 10.7, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 9.7, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 7.2, NA, NA, NA, NA, NA, NA, NA, NA, NA), PaCO2 = c(NA, NA, NA, 4.42, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4.22, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 5.07, NA, NA, NA, NA, NA, NA, NA, NA), PaO2 = c(NA, NA, NA, NA, 31.5, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 15, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 16.9, NA, NA, NA, NA, NA, NA, NA), PH..ABG. 
= c(NA, NA, NA, NA, NA, 7.25, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 7.3, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 7.27, NA, NA, NA, NA, NA, NA), Potassium.ABG = c(NA, NA, NA, NA, NA, NA, 3.9, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3.9, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3.9, NA, NA, NA, NA, NA), Sodium.ABG = c(NA, NA, NA, NA, NA, NA, 3.9, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3.9, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3.9, NA, NA, NA, NA, NA), Cardiac.Rhythm = c("", "", "", "", "", "", "", "", "ST", "", "", "", "", "", "", "", "ST", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "ST", "", "", "", "", "", "", "", "", "", "", "", "", "ST"), Arterial.Pressure.Diastolic = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 54L, NA, NA, NA, 60L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 58L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 62L, NA, NA, NA), Arterial.Pressure.Mean = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 68L, NA, NA, NA, 71L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 70L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 80L, NA, NA), Arterial.Pressure.Systolic = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 92L, NA, NA, NA, 86L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 92L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 117L, NA ), Heart.Rate = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 123L, NA, NA, NA, NA, 123L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 122L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Non.Invasive.Arterial.Pressure.Diastolic = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 58L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Non.Invasive.Arterial.Pressure.Mean = c(NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 71L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Non.Invasive.Arterial.Pressure.Systolic = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 108L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Tympanic.Temperature = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 37.6, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Patient.Positioning.ABG = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Central.Venous.Pressure = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Delivered.Percent.O2 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Mean.Airway.Pressure.S = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Minute.Volume.expired..S. 
= c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Peak.Inspiratory.Pressure.measured..S = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Positive.End.Expiratory.pressure = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), S.Expired.Tidal.vol...breath. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), S.Tidal.Volume.Inspired = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Servo.i.Modes = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Set.FiO2 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Set.Flow.Trigger.S = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA ), Set.Pause.time.. 
= c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Set.PEEP.Servo = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA ), Set.rate..CMV.or.SIMV. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Set.Tidal.Volume..servo. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Set.Upper.Pressure.Limit = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Spontaneous.Rate..S = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Cardiac.output..Vigileo. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), DO2.Vigileo. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), DO2I.Vigileo. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Stroke.Volume.Index.Vigileo. 
= c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Stroke.Volume.Vigileo. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Systemic.Vascular.Resistance.Index.Vigileo. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Systemic.Vascular.Resistance.Vigileo. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Ionised.Calcium.ABG = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Set.Pressure.Control.level.above.PEEP.S. = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, -47L))
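As @r2evans pointed out, this can also be done with tidyr::pivot_wider.
As he suggested, we need a small trick: convert all columns to character, then restore the original classes after pivoting back to wide with pivot_wider.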
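library(readxl)
library(dplyr)
library(tidyr)  # pivot_longer() and pivot_wider() live in tidyr
# read_excel() is the reader that readxl provides
df <- read_excel("Patient_01_sample.xlsx")
df %>%
  # turn empty strings in Cardiac Rhythm into NA
  mutate(`Cardiac Rhythm` = replace(`Cardiac Rhythm`, `Cardiac Rhythm` == '', NA)) %>%
  select(-1) %>%                         # drop the first column
  mutate(across(-1, as.character)) %>%   # everything except Time becomes character
  pivot_longer(-1) %>%
  group_by(name) %>%
  filter(!is.na(value)) %>%
  ungroup() %>%
  pivot_wider(id_cols = Time) %>%
  # restore numeric columns; Time and Cardiac Rhythm stay as they are
  mutate(across(-c(1, `Cardiac Rhythm`), as.numeric))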
# A tibble: 5 x 18
Time `Base excess (vt)` `Glucose ABG` `Lactate ABG` PaCO2 PaO2 `PH (ABG)` `Potassium ABG`
<dttm> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2017-09-04 17:00:00 -11.4 11.8 10.7 4.42 31.5 7.25 3.9
2 2017-09-04 17:55:00 NA NA NA NA NA NA NA
3 2017-09-04 18:00:00 NA NA NA NA NA NA NA
4 2017-09-04 19:00:00 -9.3 10.1 9.7 4.22 15 7.3 3.9
5 2017-09-04 20:00:00 -8.4 8.1 7.2 5.07 16.9 7.27 3.9
# … with 10 more variables: Sodium ABG <dbl>, Cardiac Rhythm <chr>, Arterial Pressure Diastolic <dbl>,
# Arterial Pressure Mean <dbl>, Arterial Pressure Systolic <dbl>, Heart Rate <dbl>,
# Non Invasive Arterial Pressure Diastolic <dbl>, Non Invasive Arterial Pressure Mean <dbl>,
# Non Invasive Arterial Pressure Systolic <dbl>, Tympanic Temperature <dbl>
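For readers who want to try the reshape idea without the Excel file, here is a minimal sketch on a made-up frame. The columns HR and Rhythm and their values are hypothetical, and values_drop_na = TRUE is used in place of the separate filter() step, which is a swap relative to the answer above rather than a copy of it.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per (Time, measurement), mixed column types.
toy <- data.frame(
  Time   = c("t1", "t1", "t2", "t2"),
  HR     = c(120, NA, 110, NA),
  Rhythm = c(NA, "ST", NA, "SR"),
  stringsAsFactors = FALSE
)

wide <- toy %>%
  # A single long "value" column needs one type, so convert to character first.
  mutate(across(-Time, as.character)) %>%
  # Go long and drop the missing cells in the same step.
  pivot_longer(-Time, values_drop_na = TRUE) %>%
  # Back to wide: one row per Time, one column per measurement.
  pivot_wider(names_from = name, values_from = value) %>%
  # Restore the numeric column to its original type.
  mutate(HR = as.numeric(HR))

print(wide)
```

The character round trip is what makes it possible to stack a numeric and a text column in the same long table; the final mutate() undoes it for the numeric column.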
Don't do it in a loop. A single summarise_at is enough!!
library(tidyverse)
# First non-NA value of x, or NA when every value is missing
f <- function(x) ifelse(length(x[!is.na(x)]) == 0, NA, x[!is.na(x)][1])
Patient <- read_csv("Patient_01_sample.xlsx - Sheet 1.csv")
Patient %>%
  group_by(Time) %>%
  summarise_at(vars(3:45), f)
Output
# A tibble: 5 x 44
Time `Glucose ABG` `Lactate ABG` PaCO2 PaO2 `PH (ABG)` `Potassium ABG` `Sodium ABG` `Cardiac Rhythm` `Arterial Pressure Di~
<dttm> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
1 2017-09-04 17:00:00 11.8 10.7 4.42 31.5 7.25 3.9 3.9 ST NA
2 2017-09-04 17:55:00 NA NA NA NA NA NA NA NA 54
3 2017-09-04 18:00:00 NA NA NA NA NA NA NA ST 60
4 2017-09-04 19:00:00 10.1 9.7 4.22 15 7.3 3.9 3.9 ST 58
5 2017-09-04 20:00:00 8.1 7.2 5.07 16.9 7.27 3.9 3.9 ST 62
# ... with 34 more variables: Arterial Pressure Mean <dbl>, Arterial Pressure Systolic <dbl>, Heart Rate <dbl>,
# Non Invasive Arterial Pressure Diastolic <dbl>, Non Invasive Arterial Pressure Mean <dbl>,
# Non Invasive Arterial Pressure Systolic <dbl>, Tympanic Temperature <dbl>, Patient Positioning ABG <lgl>, Central Venous Pressure <lgl>,
# Delivered Percent O2 <lgl>, Mean Airway Pressure S <lgl>, Minute Volume expired (S) <lgl>, Peak Inspiratory Pressure measured S <lgl>,
# Positive End Expiratory pressure <lgl>, S Expired Tidal vol. (breath) <lgl>, S Tidal Volume Inspired <lgl>, Servo i Modes <lgl>,
# Set FiO2 <lgl>, Set Flow Trigger S <lgl>, Set Pause time % <lgl>, Set PEEP Servo <lgl>, Set rate (CMV or SIMV) <lgl>,
# Set Tidal Volume (servo) <lgl>, Set Upper Pressure Limit <lgl>, Spontaneous Rate S <lgl>, Cardiac output (Vigileo) <lgl>, ...
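As a side note, summarise_at() is superseded in current dplyr in favour of across(). A minimal sketch of the same idea on a made-up frame (hypothetical columns HR and Rhythm, assuming dplyr 1.0 or later) could look like this. It also relies on the base-R fact that indexing an empty vector with [1] yields NA, so the ifelse() wrapper is not needed:

```r
library(dplyr)

# First non-NA value of x; x[!is.na(x)] may be empty, and [1] then gives NA.
first_non_na <- function(x) x[!is.na(x)][1]

# Hypothetical toy data with one row per (Time, measurement).
toy <- data.frame(
  Time   = c("t1", "t1", "t2", "t2"),
  HR     = c(NA, 120, 110, NA),
  Rhythm = c("ST", NA, NA, NA),
  stringsAsFactors = FALSE
)

out <- toy %>%
  group_by(Time) %>%
  # Collapse every non-grouping column to its first observed value per Time.
  summarise(across(everything(), first_non_na))

print(out)
```

Note that this keeps only the first value per Time, so a duplicated reading at the same time point would be dropped without warning.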