How to extract a single element from a data frame using magrittr?
This may be a simple question, but I cannot figure out the answer. Consider this simple data frame:
library(dplyr)
library(purrr)
library(magrittr)
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
dataframe <- tibble(id = c(1, 2, 3, 4),
                    text = c("this is a this", "this is another", "hello", "what???"))
> dataframe
# A tibble: 4 x 2
     id text
  <dbl> <chr>
1     1 this is a this
2     2 this is another
3     3 hello
4     4 what???
Here I want to write a pipe expression that extracts the element at row 4 of the column text, that is, what???
I tried using
dataframe %>% pull(text)[[4]]
but it does not work. What can I do here?
You can try:
dataframe %>%
  filter(row_number() == 4) %>%
  pull(text)
This works:
dataframe %>% select(text) %>% unlist() %>% .[4]
Edit:
Not that it really matters, but there are faster options (taken from Moody's list):
library(microbenchmark)  # provides microbenchmark()
microbenchmark(
  dataframe %$% text[4],
  dataframe %>% {.$text[4]},
  dataframe %>% .[[4,"text"]],
  dataframe %>% `[[`(4,"text"),
  dataframe %>% extract2(4,"text"),
  dataframe %$% text %>% extract(4),
  dataframe %>% extract2("text") %>% extract(4),
  dataframe %>% use_series(text) %>% extract(4),
  dataframe %>% pull(text) %>% .[4], # @andrey-kolyadin in the comments
  dataframe %>% select(text) %>% unlist() %>% .[4], # @stackTon's solution
  dataframe %>% filter(row_number() == 4) %>% pull(text) # Aramis7d's solution
)
Unit: microseconds
expr min lq mean median uq max neval
dataframe %$% text[4] 49.014 58.0065 74.18069 66.8210 76.5185 256.353 100
dataframe %>% { .$text[4] } 92.739 102.7880 119.06888 112.6615 124.1220 290.205 100
dataframe %>% .[[4, "text"]] 65.235 70.5240 90.02727 79.5155 92.9155 344.507 100
dataframe %>% 4[["text"]] 69.466 76.8710 93.45829 85.6865 101.0250 224.618 100
dataframe %>% extract2(4, "text") 68.761 77.4005 90.49983 82.6890 99.6150 166.789 100
dataframe %$% text %>% extract(4) 81.455 87.6255 108.64541 99.9675 116.3640 332.519 100
dataframe %>% extract2("text") %>% extract(4) 98.733 106.8440 120.75439 114.6010 125.3560 256.000 100
dataframe %>% use_series(text) %>% extract(4) 137.521 147.3940 165.11001 156.7390 172.0780 409.741 100
dataframe %>% pull(text) %>% .[4] 1984.177 2042.0055 2189.99915 2076.0335 2172.6505 5512.815 100
dataframe %>% select(text) %>% unlist() %>% .[4] 3241.256 3362.9095 3644.73124 3425.4990 3567.9555 8855.978 100
dataframe %>% filter(row_number() == 4) %>% pull(text) 3542.039 3635.4820 3941.44085 3767.7140 3980.3415 8704.705 100
I like this one (not on the list):
dataframe %>% .$text %>% .[4]
Mean: 162
For a magrittr-only solution, you need
dataframe %>% magrittr::use_series(text) %>% magrittr::extract(4)
A few short possibilities:
dataframe %$% text[4]
dataframe %>% {.$text[4]}
dataframe %>% .[[4,"text"]]
dataframe %>% `[[`(4,"text")
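The four short forms above can be checked against plain base R indexing. The sketch below is illustrative only: it uses a base data.frame as a stand-in for the tibble in the question (an assumption, so that only magrittr is needed), and the variable names are made up for the example.

```r
library(magrittr)

# Base data.frame stand-in for the question's tibble (assumption: avoids dplyr).
dataframe <- data.frame(
  id = c(1, 2, 3, 4),
  text = c("this is a this", "this is another", "hello", "what???"),
  stringsAsFactors = FALSE
)

# Reference value using ordinary base R indexing.
expected <- dataframe$text[4]

# %$% exposes the columns of the left-hand side, much like with(dataframe, text[4]).
via_exposition <- dataframe %$% text[4]

# Braces stop %>% from inserting the left-hand side as the first argument,
# so the dot can be used freely inside the block.
via_braces <- dataframe %>% {.$text[4]}

# The dot is a direct argument of `[[`, so this is dataframe[[4, "text"]].
via_dot <- dataframe %>% .[[4, "text"]]

# Calling `[[` by name: the left-hand side becomes its first argument.
via_backtick <- dataframe %>% `[[`(4, "text")

print(c(via_exposition, via_braces, via_dot, via_backtick))
```

All four should agree with `expected`; which one to prefer is mostly a matter of readability.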
If you want to use only magrittr aliases, you can do:
dataframe %>% extract2(4,"text")
dataframe %$% text %>% extract(4)
dataframe %>% extract2("text") %>% extract(4)
dataframe %>% use_series(text) %>% extract(4) # @Brian's solution
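It may help to see what these aliases stand for: as far as I know, magrittr defines extract, extract2 and use_series as plain aliases of the base operators `[`, `[[` and `$`. A small sketch, again using a base data.frame stand-in for the tibble (an assumption) and made-up variable names:

```r
library(magrittr)

# Base data.frame stand-in for the question's tibble (assumption: avoids dplyr).
dataframe <- data.frame(
  id = c(1, 2, 3, 4),
  text = c("this is a this", "this is another", "hello", "what???"),
  stringsAsFactors = FALSE
)

# Each alias pipeline next to the base R expression it corresponds to.
alias_cell   <- dataframe %>% extract2(4, "text")                 # dataframe[[4, "text"]]
alias_column <- dataframe %>% extract2("text") %>% extract(4)     # dataframe[["text"]][4]
alias_series <- dataframe %>% use_series(text) %>% extract(4)     # dataframe$text[4]

base_cell   <- dataframe[[4, "text"]]
base_column <- dataframe[["text"]][4]
base_series <- dataframe$text[4]

print(c(alias_cell, alias_column, alias_series))
```

Because the aliases are ordinary functions, they read more naturally in a pipe than backtick-quoted operators, at no real cost.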
The other proposed solutions are not pure magrittr (they use dplyr):
dataframe %>% pull(text) %>% .[4] # @andrey-kolyadin in the comments
dataframe %>% select(text) %>% unlist() %>% .[4] # @stackTon's solution
dataframe %>% filter(row_number() == 4) %>% pull(text) # Aramis7d's solution
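As for why the attempt in the question fails: R parses pull(text)[[4]] as a call to `[[`, not to pull(), and my understanding is that %>% then inserts the left-hand side as the first argument of that outer `[[` call, so pull(text) ends up evaluated without any data. The sketch below checks the parse and reproduces the same kind of failure. To keep it free of dplyr it uses base getElement() as a stand-in for pull() and a base data.frame for the tibble; both substitutions are assumptions made for illustration.

```r
library(magrittr)

# Base data.frame stand-in for the question's tibble (assumption: avoids dplyr).
dataframe <- data.frame(
  id = c(1, 2, 3, 4),
  text = c("this is a this", "this is another", "hello", "what???"),
  stringsAsFactors = FALSE
)

# The right-hand side of the failing pipe is a call to `[[`, not to pull().
rhs <- quote(pull(text)[[4]])
outer_fn <- as.character(rhs[[1]])
print(outer_fn)

# Same shape as the failing attempt, with getElement() standing in for pull().
# The call errors, so it is wrapped in tryCatch() to capture the condition.
failed <- tryCatch(
  dataframe %>% getElement("text")[[4]],
  error = function(e) e
)

# Putting the indexing into its own pipe step avoids the problem.
fixed <- dataframe %>% getElement("text") %>% .[[4]]
print(fixed)
```

The same fix applies to the original code: index in a separate step, for example with .[[4]] or extract2(4), instead of attaching [[4]] to the pull() call.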