R: convert raw data to character

I am trying to load data from MongoDB into R with the "mongolite" package, using the following code:

df <- db$find('{}', '{"CurrentId":1,"_id":0}')

Here I want to extract the "CurrentId" field of the collection. "CurrentId" holds MongoDB ObjectIds, and a single document may contain several of them.

df looks like this:

[[1]]
list()

[[2]]
list()

[[3]]
list()

[[4]]
list()

[[5]]
list()

[[6]]
[[6]][[1]]
[1] 56 cd 5f 02 b8 9b 5b d0 26 cb 39 c9

[[6]][[2]]
[1] 56 cd 6c 13 b8 9b 5b d0 26 cb 39 d5

[[6]][[3]]
[1] 56 cd 6f c6 b8 9b 5b d0 26 cb 39 de

df[[6]][[1]] is:

 [1] 56 cd 5f 02 b8 9b 5b d0 26 cb 39 c9

typeof(df[[6]][[1]]) is:

 [1] "raw"

I use paste(df[[6]][[1]],collapse = '') to convert the raw vector to a string in the MongoDB ObjectId format:

 [1] "56cd5f02b89b5bd026cb39c9"

Then I tried to convert all the raw data in df to strings in the same way, so I used sapply:

sapply(df, function(x) paste(as.character(x),collapse = ''))

and got this:

[1] ""                                                                                                                                                                                                                                                   
[2] ""                                                                                                                                                                                                                                                   
[3] ""                                                                                                                                                                                                                                                   
[4] ""                                                                                                                                                                                                                                                   
[5] ""                                                                                                                                                                                                                                                   
[6] "as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))as.raw(c(0x56, 0xcd, 0x6f, 0xc6, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xde))"

But I want to get something like this:

[[1]]
list()

[[2]]
list()

[[3]]
list()

[[4]]
list()

[[5]]
list()

[[6]]
[[6]][[1]]
[1] "56cd5f02b89b5bd026cb39c9"

[[6]][[2]]
[1] "56cd6c13b89b5bd026cb39d5"

[[6]][[3]]
[1] "56cd6fc6b89b5bd026cb39de"

Does anyone know how to handle this? Is there a more efficient way to do the whole job?

Update:

I should provide some code to reproduce my original data set:

test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9")))
df = lapply(1:10,function(x) test)

This code, however, produces this:

[[1]]
list()

[[2]]
[[2]][[1]]
[1] 5f

[[2]][[2]]
[1] d0


[[3]]
[[3]][[1]]
[1] 26

[[3]][[2]]
[1] 56


[[4]]
list()

[[5]]
[[5]][[1]]
[1] cb


[[6]]
list()

It is not like the original df, but I really don't know how to put raw data into a nested list. I hope this helps!

The result of sapply(df, function(x) paste(x,collapse = '')) is:

[1] ""                                                                                                                                                                                                                                                   
[2] ""                                                                                                                                                                                                                                                   
[3] ""                                                                                                                                                                                                                                                   
[4] ""                                                                                                                                                                                                                                                   
[5] ""                                                                                                                                                                                                                                                   
[6] "as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))as.raw(c(0x56, 0xcd, 0x6f, 0xc6, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xde))"

Just use paste(), without calling as.character() inside the sapply() call. A short example:

convertRaw = function(x) paste(x,collapse = '') # works identically inside sapply
test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9"))) # line copied from your sample
convertRaw(test)
[1] "56cd5f02b89b5bd026cb39c9"
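Note that convertRaw() here receives a single raw vector. As a minimal sketch of what changes when the same call receives a list of raw vectors instead (which is what df[[6]] in the question is), the made-up two-byte values below show that as.character() and paste() deparse list elements rather than formatting their bytes. The variable name ids and its contents are illustrative only.

```r
# Two short raw vectors standing in for ObjectIds (illustrative values)
ids <- list(as.raw(c(0x56, 0xcd)), as.raw(c(0x5f, 0x02)))

# On a raw vector: one two-digit hex string per byte
on_vector <- as.character(ids[[1]])

# On a list: each element is deparsed into the R code that would create it
on_list <- as.character(ids)

# paste() coerces with as.character() internally, so it behaves the same way
pasted_vector <- paste(ids[[1]], collapse = "")
pasted_list   <- paste(ids, collapse = "")

print(on_vector)
print(on_list)
print(pasted_vector)
print(pasted_list)
```

This is why the sixth element of the sapply() output in the question is one long string of as.raw(c(...)) fragments: the function was applied to the inner list as a whole, not to each raw vector inside it.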

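Update: There is actually another problem, caused by the nested list. Since you are dealing with a nested list, your sapply call needs to be nested as well. You can do that, for example, by wrapping it in a call to lapply(). Here is a short example that will hopefully solve your problem: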

test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9")))
testList = list(list(),list(test,test)) # here I create a short nested list
res = lapply(testList,function(y) sapply(y,function(x) paste(x,collapse = '')))
print(res) 

The result is:

[[1]]
list()

[[2]]
[1] "56cd5f02b89b5bd026cb39c9" "56cd5f02b89b5bd026cb39c9"

If you would rather have this:

[[1]]
list()

[[2]]
[[2]][[1]]
[1] "56cd5f02b89b5bd026cb39c9"

[[2]][[2]]
[1] "56cd5f02b89b5bd026cb39c9"

then nest the lapply() calls directly:

lapply(testList,function(y) lapply(y,function(x) paste(x,collapse = '')))
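The nested lapply() above assumes the raw vectors sit exactly two levels deep. If the depth can vary, one possible variation is a small recursive helper. The sketch below is only an illustration: raw_to_hex is a made-up name, not part of mongolite, and the sample list is built by hand to mimic the shape of the df printed in the question.

```r
# Hypothetical helper: walk a list of any depth and turn every raw
# vector into a single hex string, leaving the list structure intact.
raw_to_hex <- function(x) {
  if (is.raw(x)) {
    paste(x, collapse = "")      # raw bytes -> "56cd5f02..."
  } else if (is.list(x)) {
    lapply(x, raw_to_hex)        # recurse; empty lists stay empty lists
  } else {
    x                            # anything else is returned unchanged
  }
}

# Hand-built sample with the same shape as the question's df:
# five empty lists followed by one list of raw vectors.
id1 <- as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b,
                0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))
id2 <- as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b,
                0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))
sample_df <- c(rep(list(list()), 5), list(list(id1, id2)))

res <- raw_to_hex(sample_df)
str(res)
```

The result keeps the empty lists in positions 1 to 5 and holds one character string per ObjectId in position 6, which matches the layout requested in the question.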