How to get a data frame from a stargazer list
I want to get a data frame for each item of a list object. Here is an example:
library(dplyr)
library(purrr)
library(stargazer)

mtcars_sumstat <- mtcars %>%
  select(mpg:qsec, am) %>%
  as.data.frame() %>%
  split(.$am) %>%
  walk(~ stargazer(., type = "text",
                   summary.stat = c("n", "mean", "sd")))
mtcars_sumstat
=============================
Statistic N Mean St. Dev.
-----------------------------
mpg 19 17.147 3.834
cyl 19 6.947 1.545
disp 19 290.379 110.172
hp 19 160.263 53.908
drat 19 3.286 0.392
wt 19 3.769 0.777
qsec 19 18.183 1.751
am 19 0.000 0.000
-----------------------------
=============================
Statistic N Mean St. Dev.
-----------------------------
mpg 13 24.392 6.167
cyl 13 5.077 1.553
disp 13 143.531 87.204
hp 13 126.846 84.062
drat 13 4.050 0.364
wt 13 2.411 0.617
qsec 13 17.360 1.792
am 13 1.000 0.000
-----------------------------
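When I run the code below, I get two data frames, one per list item; however, those data frames contain the actual data, not the summary statistics shown above.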
list2env(mtcars_sumstat,.GlobalEnv)
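Basically, I want the summary statistics above in two separate data frames, stored as data frame objects in the global environment. Is there a way to do this? Each data frame object should look like this: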
Statistic N Mean St. Dev.
mpg 13 24.392 6.167
cyl 13 5.077 1.553
disp 13 143.531 87.204
hp 13 126.846 84.062
drat 13 4.050 0.364
wt 13 2.411 0.617
qsec 13 17.360 1.792
am 13 1.000 0.000
Update, based on the OP's comment
It is probably easier to get the summary statistics with standard dplyr operations only, using pivot_longer, group_by and summarise:
library(dplyr)
library(tidyr)

mtcars %>%
  select(mpg:qsec, am) %>%
  pivot_longer(-am) %>%
  group_by(am, name) %>%
  summarise(across(value, .fns = list(mean = mean, sd = sd, n = length), .names = "{fn}")) %>%
  split(.$am)  # named list ("0", "1"), matching the output below
Output:
$`0`
# A tibble: 7 x 5
# Groups: am [1]
am name mean sd n
<dbl> <chr> <dbl> <dbl> <int>
1 0 cyl 6.95 1.54 19
2 0 disp 290. 110. 19
3 0 drat 3.29 0.392 19
4 0 hp 160. 53.9 19
5 0 mpg 17.1 3.83 19
6 0 qsec 18.2 1.75 19
7 0 wt 3.77 0.777 19
$`1`
# A tibble: 7 x 5
# Groups: am [1]
am name mean sd n
<dbl> <chr> <dbl> <dbl> <int>
1 1 cyl 5.08 1.55 13
2 1 disp 144. 87.2 13
3 1 drat 4.05 0.364 13
4 1 hp 127. 84.1 13
5 1 mpg 24.4 6.17 13
6 1 qsec 17.4 1.79 13
7 1 wt 2.41 0.617 13
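To get from a list like this to separate objects in the global environment, which is what the question asks for, something along the following lines might work. This is a minimal sketch rather than a definitive solution: the object names sumstat_am0 and sumstat_am1 are hypothetical, chosen only because the raw list names "0" and "1" are not syntactic R names, and .groups = "drop" assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Summary statistics per variable, split into one tibble per value of am
sumstat_list <- mtcars %>%
  select(mpg:qsec, am) %>%
  pivot_longer(-am) %>%
  group_by(am, name) %>%
  summarise(n = n(), mean = mean(value), sd = sd(value), .groups = "drop") %>%
  split(.$am)

# Hypothetical names: prefix the split keys so they become valid object names
names(sumstat_list) <- paste0("sumstat_am", names(sumstat_list))

# Create one object per list element in the global environment
list2env(sumstat_list, envir = .GlobalEnv)

sumstat_am1
```

After this runs, sumstat_am0 and sumstat_am1 should be available as ordinary tibbles; wrapping either in as.data.frame() gives a plain data frame.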
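Previous answer
walk returns its input unchanged, which is why mtcars_sumstat ended up holding the original data. Change walk to map_df so that the text returned by stargazer is kept, and add map(tibble):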
mtcars_sumstat <- mtcars %>%
  select(mpg:qsec, am) %>%
  as.data.frame() %>%
  split(.$am) %>%
  map_df(~ stargazer(., type = "text", summary.stat = c("n", "mean", "sd"))) %>%
  map(tibble)
Output:
mtcars_sumstat
$`0`
# A tibble: 13 x 1
`<chr>`
<chr>
1 ""
2 "============================="
3 "Statistic N Mean St. Dev."
4 "-----------------------------"
5 "mpg 19 17.147 3.834 "
6 "cyl 19 6.947 1.545 "
7 "disp 19 290.379 110.172 "
8 "hp 19 160.263 53.908 "
9 "drat 19 3.286 0.392 "
10 "wt 19 3.769 0.777 "
11 "qsec 19 18.183 1.751 "
12 "am 19 0.000 0.000 "
13 "-----------------------------"
$`1`
# A tibble: 13 x 1
`<chr>`
<chr>
1 ""
2 "============================="
3 "Statistic N Mean St. Dev."
4 "-----------------------------"
5 "mpg 13 24.392 6.167 "
6 "cyl 13 5.077 1.553 "
7 "disp 13 143.531 87.204 "
8 "hp 13 126.846 84.062 "
9 "drat 13 4.050 0.364 "
10 "wt 13 2.411 0.617 "
11 "qsec 13 17.360 1.792 "
12 "am 13 1.000 0.000 "
13 "-----------------------------"
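Each tibble above still holds the table as one string per line. If real columns are needed, the lines could be parsed with base R, roughly as sketched below. The helper parse_stargazer_text is hypothetical (it is not part of stargazer), and it assumes that variable names contain no spaces and that every data row has exactly three numeric fields. The input lines are copied from the am == 1 output above.

```r
# Text lines for the am == 1 group, copied from the output above
txt <- c(
  "",
  "=============================",
  "Statistic N Mean St. Dev.",
  "-----------------------------",
  "mpg 13 24.392 6.167 ",
  "cyl 13 5.077 1.553 ",
  "disp 13 143.531 87.204 ",
  "hp 13 126.846 84.062 ",
  "drat 13 4.050 0.364 ",
  "wt 13 2.411 0.617 ",
  "qsec 13 17.360 1.792 ",
  "am 13 1.000 0.000 ",
  "-----------------------------"
)

# Hypothetical helper: turn stargazer text lines into a data frame
parse_stargazer_text <- function(lines) {
  # Keep rows that start with a variable name followed by a number;
  # this drops blank lines, rule lines, and the header row
  body <- lines[grepl("^[A-Za-z_.][A-Za-z0-9_.]*\\s+[-0-9]", lines)]
  read.table(text = body,
             col.names = c("Statistic", "N", "Mean", "St.Dev"),
             stringsAsFactors = FALSE)
}

sumstat_am1_parsed <- parse_stargazer_text(txt)
sumstat_am1_parsed
```

With the list from the answer, the same helper could presumably be applied per element, for example map(mtcars_sumstat, ~ parse_stargazer_text(.x[[1]])), although computing the statistics directly avoids parsing formatted text at all.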