How do I use state data from BORIS with TraMineR?
I'm trying to figure out how to convert BORIS output into a state-sequence format that I can analyze with TraMineR.
The BORIS output is basically a table like this:
File Time Behavior Status
1 K8121319_feed3_01 0.000 Approach START
2 K8121319_feed3_01 393.225 Approach STOP
3 K8121319_feed3_01 393.226 Out-of-Frame START
4 K8121319_feed3_01 426.003 Out-of-Frame STOP
5 K8121319_feed3_01 442.006 Approach START
6 K8121319_feed3_01 465.755 Approach STOP
7 K8121319_feed3_01 465.756 Avoid START
8 K8121319_feed3_01 513.255 Avoid STOP
9 K8121319_feed3_01 513.256 Explore START
10 K8121319_feed3_01 746.577 Explore STOP
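It seems this could be converted to the SPELL sequence format with dplyr, but I can't work out how. Has anyone used these two packages together?
The SPELL format looks like this: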
File Behavior Start Stop
1 K8121319_feed3_01 Approach 0.000 393.225
2 K8121319_feed3_01 OOF 393.226 426.003
3 K8121319_feed3_01 Approach 442.006 465.755
4 K8121319_feed3_01 Avoid 465.756 513.255
5 K8121319_feed3_01 Explore 513.256 746.577
I have been trying to do this with tidyr::spread.
Edit: here is the result of dput(data1[1:20,])
structure(list(File = c("K8121319_feed3_01", "K8121319_feed3_01",
"K8121319_feed3_01", "K8121319_feed3_01", "K8121319_feed3_01",
"K8121319_feed3_01", "K8121319_feed3_01", "K8121319_feed3_01",
"K8121319_feed3_01", "K8121319_feed3_01", "K8121319_feed3_02",
"K8121319_feed3_02", "K8121319_feed3_02", "K8121319_feed3_02",
"K8121319_feed3_02", "K8121319_feed3_02", "K8121319_feed3_02",
"K8121319_feed3_02", "K8121319_feed3_02", "K8121319_feed3_02"
), Time = c(0, 393.225, 393.226, 426.003, 442.006, 465.755, 465.756,
513.255, 513.256, 746.577, 0, 29.85, 29.851, 66.6, 66.601, 292.646,
292.647, 362.208, 362.209, 442.456), Behavior = c("Approach",
"Approach", "Out-of-Frame", "Out-of-Frame", "Approach", "Approach",
"Avoid", "Avoid", "Explore", "Explore", "Approach", "Approach",
"Avoid", "Avoid", "Approach", "Approach", "Avoid", "Avoid", "Approach",
"Approach"), Status = c("START", "STOP", "START", "STOP", "START",
"STOP", "START", "STOP", "START", "STOP", "START", "STOP", "START",
"STOP", "START", "STOP", "START", "STOP", "START", "STOP")), row.names = c(NA,
20L), class = "data.frame")
Edit: dput of a part of the df that has repeated states
dput(data1[360:370,])
structure(list(File = c("K8121819_feed3_13", "K8121819_feed3_13",
"K8121819_feed3_13", "K8121819_feed3_13", "K8121819_feed3_13",
"K8121819_feed3_14", "K8121819_feed3_14", "K8121819_feed3_14",
"K8121819_feed3_14", "K8121819_feed3_14", "K8121819_feed3_14"
), Time = c(700.311, 700.312, 720.311, 742.851, 754.339, 0, 32.124,
32.125, 47.14, 47.141, 84.671), Behavior = c("Approach", "Avoid",
"Avoid", "Avoid", "Avoid", "Avoid", "Avoid", "Explore", "Explore",
"Approach", "Approach"), Status = c("STOP", "START", "STOP",
"START", "STOP", "START", "STOP", "START", "STOP", "START", "STOP"
)), row.names = 360:370, class = "data.frame")
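I question your claim that the SPELL format can be used with continuous data, because passing doubles to seqdef produces an error saying that the begin and end columns must be integers.
Hopefully this will get you started:
Edit: now with a possible fix for repeated behavior states: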
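library(TraMineR)
library(tidyverse)
library(data.table)  # for rleid()

data.long <- data1 %>%
  # number each run of consecutive identical behaviors; drop "-" from state names
  mutate(id = rleid(Behavior),
         Behavior = str_replace_all(Behavior, pattern = "-", replacement = "")) %>%
  group_by(File, id) %>%
  # keep only the earliest and latest row of each run (its first START and last STOP)
  dplyr::filter(Time == min(Time) | Time == max(Time)) %>%
  pivot_wider(id_cols = c("File", "Behavior", "id"),
              names_from = "Status",
              values_from = "Time") %>%
  # seqdef needs integer spell bounds, and 0 is not accepted, so floor and shift by 1
  mutate(START = 1L + as.integer(floor(START)),
         STOP = 1L + as.integer(floor(STOP))) %>%
  as.data.frame()
data.long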
# File Behavior id START STOP
#1 K8121319_feed3_01 Approach 1 1 394
#2 K8121319_feed3_01 OutofFrame 2 394 427
#3 K8121319_feed3_01 Approach 3 443 466
#4 K8121319_feed3_01 Avoid 4 466 514
#5 K8121319_feed3_01 Explore 5 514 747
#6 K8121319_feed3_02 Approach 6 1 30
#7 K8121319_feed3_02 Avoid 7 30 67
#8 K8121319_feed3_02 Approach 8 67 293
#9 K8121319_feed3_02 Avoid 9 293 363
#10 K8121319_feed3_02 Approach 10 363 443
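I removed the - because it caused problems with seqstatl, and I added 1 because the package authors evidently do not allow 0. I used rleid from the data.table package because it saves a lot of typing compared with trying to use rle from base R.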
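To illustrate what rleid is doing in the pipeline, here is a minimal sketch of the same run numbering written with base R's rle. The behavior vector is made up for illustration and is not taken from data1.

```r
# Hypothetical behavior vector containing one repeated run of "Avoid"
behavior <- c("Approach", "Approach",
              "Avoid", "Avoid", "Avoid", "Avoid",
              "Explore", "Explore")

# rle() gives the length of each run of identical consecutive values;
# repeating each run's index by its length yields one id per run,
# which is the numbering data.table::rleid() returns directly
runs <- rle(behavior)
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)
run_id
#[1] 1 1 2 2 2 2 3 3
```

All four "Avoid" rows share one id, so grouping by that id lets the earliest START and the latest STOP be collapsed into a single spell.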
Now we can use seqdef:
data.SPELL <- seqdef(data = data.long,
                     var = c("File", "START", "STOP", "Behavior"),
                     informat = "SPELL",
                     labels = seqstatl(data.long$Behavior),
                     states = seq_along(seqstatl(data.long$Behavior)),
                     process = FALSE)
data.SPELL
#K8121319_feed3_01 1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-4-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3
#K8121319_feed3_02 1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1
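Flooring to whole seconds throws away the sub-second part of the BORIS timestamps. If that matters, one option is to rescale the times before converting them to integers. The sketch below assumes a resolution of tenths of a second; the names units_per_second and to_unit are hypothetical helpers, and the three rows are copied from the first spells of the example data.

```r
# First three spells of K8121319_feed3_01, times in seconds
times <- data.frame(START = c(0, 393.226, 442.006),
                    STOP  = c(393.225, 426.003, 465.755))

# Assumed resolution: 10 integer time units per second
units_per_second <- 10

# Same floor-and-shift-by-1 idea as in the pipeline above, applied after rescaling
to_unit <- function(x) 1L + as.integer(floor(x * units_per_second))

times_int <- data.frame(START = to_unit(times$START),
                        STOP  = to_unit(times$STOP))
times_int
```

The trade-off is sequence length: every second becomes ten positions, so the sequences built by seqdef would be ten times longer than the ones shown above.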