How to replace values by labels in data.frames from SPSS files?
I have to read a .sav file.
I use the haven package:
library(haven)
dataset<- read_sav("datafile.sav")
In the console I can see the labels:
dput(head(voyages$portdep))
structure(c(50422, 50299, 50299, 50299, NA, NA), label = "Port of departure", labels = c(Alicante = 10101,
Barcelona = 10102, Bilbao = 10103, Cadiz = 10104, Figuera = 10105,
Gibraltar = 10106, `La Coruña` = 10107, Santander = 10110, Seville = 10111,
`San Lucar` = 10112, Vigo = 10113, `Spain, port unspecified` = 10199,
Lagos = 10202, Lisbon = 10203, Oporto = 10204, `Ilho do Fayal` = 10205,
Setubal = 10206, `Portugal, port unspecified` = 10299, `Great Britain, port unspecified` = 10399,
Barmouth = 10401, Bideford = 10402, Birkenhead = 10403, Bristol = 10404,
Brixham = 10405, Broadstairs = 10406, Cawsand = 10407, Chepstow = 10408,
Chester = 10409, Colchester = 10410, Cowes = 10411, Dartmouth = 10412,
Deptford = 10413, Dover = 10414, Exeter = 10415, Folkstone = 10416,
Frodsham = 10417, Gainsborough = 10418, Greenwich = 10419, Guernsey = 10420,
Harwich = 10421, Hull = 10422, Ilfracombe = 10423, Ipswich = 10424,
`Isle of Man` = 10425, `Isle of Wight` = 10426, Jersey = 10427,
Kendal = 10428, `King's Lynn` = 10429, Lancaster = 10430, Lindale = 10431,
Liverpool = 10432, London = 10433, Lyme = 10434, Maryport = 10436,
`Milford Haven` = 10437, `New Shoreham` = 10438, `Newcastle upon Tyne` = 10439,
Newnham = 10440, `North Shields` = 10441, Norwich = 10443, Padstowe = 10444,
Parkgate = 10445, `Piel of Foulney` = 10446, Plymouth = 10447,
Poole = 10448, Portsery = 10449, Portsmouth = 10450, Poulton = 10451,
Preston = 10452, Ramsgate = 10453, Ravenglass = 10454, `River Thames` = 10455,
Rochester = 10456, Rotherhithe = 10457, Rye = 10458, Scarborough = 10459,
Sheerness = 10460, Shields = 10461, Shoreham = 10462, Sidmouth = 10463,
Southampton = 10464, Stockton = 10466, Stockwithe = 10467, Sunderland = 10468,
Teignmouth = 10469, Topsham = 10470, Torbay = 10471, Wales = 10472,
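In the HTML table, I only get the values.
How can I replace the values with the labels in data.frames read from SPSS files, so that the labels are shown in the HTML table?
With the sjlabelled package, I can get the labels of any column: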
library(sjlabelled)
get_labels(voyages$portdep)
 [1] "Alicante" "Barcelona" "Bilbao" "Cadiz"
 [5] "Figuera" "Gibraltar" "La Coruña" "Santander"
 [9] "Seville" "San Lucar" "Vigo" "Spain, port unspecified"
[13] "Lagos" "Lisbon" "Oporto" "Ilho do Fayal"
[17] "Setubal" "Portugal, port unspecified" "Great Britain, port unspecified" "Barmouth"
[21] "Bideford" "Birkenhead" "Bristol" "Brixham"
[25] "Broadstairs" "Cawsand" "Chepstow" "Chester"
[29] "Colchester" "Cowes" "Dartmouth" "Deptford"
[33] "Dover" "Exeter" "Folkstone" "Frodsham"
[37] "Gainsborough" "Greenwich" "Guernsey" "Harwich"
[41] "Hull" "Ilfracombe" "Ipswich" "Isle of Man"
[45] "Isle of Wight" "Jersey" "Kendal" "King's Lynn"
I tried:
On a single column:
dataset2 <- dataset %>% mutate(portdep = get_labels(portdep))
Erreur : Column `portdep` must be length 36002 (the number of rows) or one, not 847
On the whole data frame:
dataset2 <- dataset %>% mutate_all(funs(get_labels(.)))
Same error on the first column:
Column xxx must be length 36002 (the number of rows) or one, not 2
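The length mismatch in these errors can be reproduced on a toy vector. The sketch below uses made-up data (not the voyages file) and reads the label set straight from the "labels" attribute; that attribute is the same set of names that get_labels() lists above. It has one entry per defined label, while a column inside mutate() needs one entry per row.

```r
library(haven)

# Toy stand-in for voyages$portdep (hypothetical values, not from the .sav file)
x <- labelled(
  c(10102, 10101, 10102, NA),
  labels = c(Alicante = 10101, Barcelona = 10102, Bilbao = 10103)
)

n_rows   <- length(x)                  # one element per row
n_labels <- length(attr(x, "labels"))  # one element per defined label

# as_factor() maps every row to its label, so the length matches the data
per_row <- as.character(as_factor(x))
print(c(rows = n_rows, labels = n_labels))
print(per_row)
```

Here the label set has 3 entries and the data has 4 rows, which is the same kind of mismatch as 847 versus 36002 in the error message.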
I think you can use haven::as_factor to get what you want.
Does this work?
library(haven)
library(dplyr)
dataset %>%
mutate_all(as_factor) %>%
head() %>%
View()
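As a possible variation on the pipeline above, here is a minimal sketch that converts only the labelled columns and leaves everything else untouched. It assumes dplyr 1.0.0 or later (for across() and where()); the tibble df and its columns are hypothetical stand-ins for the real dataset.

```r
library(haven)
library(dplyr)

# Hypothetical toy data standing in for the data read by read_sav()
df <- tibble(
  id = 1:3,
  portdep = labelled(
    c(10101, 10102, 10101),
    labels = c(Alicante = 10101, Barcelona = 10102)
  )
)

# Convert only haven_labelled columns to factors; id stays an integer
df_labels <- df %>%
  mutate(across(where(is.labelled), as_factor))

str(df_labels)
```

Restricting the conversion with where(is.labelled) avoids turning plain numeric columns into factors, which mutate_all(as_factor) would do.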
Instead of the haven package, you can try foreign. I used my own data, try.sav, which includes a variable gender:
library(haven)
df_haven<- read_sav("try.sav")
class(df_haven$gender)
#> [1] "haven_labelled"
table(df_haven$gender)
#>
#> 1 2
#> 1972 2417
df_haven$gender
#> <Labelled double>: Gender
#> [1] 2 2 2 1 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2
#> [38] 2 2 2 1 2 2 1 2 2 2 2 2 2 2 2 1 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#> [75] 2 2 1 1 2 2 1 2 1 2 1 1 2 1 2 1 1 2 2 2 2 2 1 1 2 2 1 2 1 2 2 2 1 1 2 2 1
#> ...
#> Labels:
#> value label
#> 1 male
#> 2 female
library(foreign)
df_foreign<- read.spss("try.sav", to.data.frame = TRUE)
#> re-encoding from UTF-8
class(df_foreign$gender)
#> [1] "factor"
table(df_foreign$gender)
#>
#> male female
#> 1972 2417
df_foreign$gender
#> [1] female female female male female female female female female female
#> [11] female female female male female female female female female male
#> [21] female female female female female female male male male male
#> [31] female female female female female female female female female female
#> [41] male female female male female female female female female female
#> [51] female female male female female female female male female female
#> [61] female female female female female female female female female female
#> [71] female female female female female female male male female female
#> [81] male female male female male male female male female male
#> [91] male female female female female female male male female female
....
#> Levels: male female
Created on 2020-01-06 by the reprex package (v0.3.0)
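You can also use as_factor() from the haven package.
library(haven)
# Apply as_factor() to the labelled column read by haven;
# df_foreign$gender is already a factor, so converting it changes nothing.
as_factor(df_haven$gender)
That should work! Good luck!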
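To connect this back to the HTML table in the question: the thread does not say which package renders the table, so the sketch below assumes knitr::kable() purely for illustration, and uses a hypothetical toy gender vector rather than try.sav.

```r
library(haven)
library(knitr)

# Hypothetical labelled vector, shaped like df_haven$gender
gender <- labelled(c(2, 1, 2), labels = c(male = 1, female = 2))

# Replace the codes with their labels before building the table
df <- data.frame(id = 1:3, gender = as_factor(gender))

# Render as an HTML table; the cells now contain "male"/"female", not 1/2
html <- kable(df, format = "html")
print(html)
```

Any other HTML table generator should behave the same way once the column is a factor (or a character vector), because the labels are then the actual cell values.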