Plotting Likert Variables in R
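I am working with a dataset that contains a large number of variables from a survey on public transport (PT) use.
The dataset is attached as a CSV file.
Image: Data Variables
Link to the data: https://drive.google.com/open?id=1MvfnwR4IkUyUzSnCuAOL8fxAYiDjBoBi
Code to read the dataset: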
# Read the semicolon-separated survey file
df <- read.csv("PublicTransportSurvey.csv", sep = ";", header = TRUE, stringsAsFactors = TRUE)
# Drop the row-number column, then display the dataset
df <- subset(df, select = -Row_Num)
View(df)
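The variables can be summarised as follows.
The items are on a Likert scale (1-5) with these possible responses: Strongly disagree (1), Disagree (2), Neutral (3), Agree (4) and Strongly agree (5).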
Perceived Usefulness and Ease of Use
PU1: PT information is easily accessible
PU2: PT infrastructure is easily accessible
PU3: The maps on PT infrastructure are helpful and clear
PU4: PT tickets are easy to purchase
PU5: PT connections in Adelaide are well integrated
PU6: Waiting times for PT services are reasonable
Perceived Enjoyment
ENJ1: The views from PT in Adelaide are scenic
ENJ2: Fellow passengers on PT in Adelaide are friendly
Quality
QU1: PT in Adelaide is reliable
QU2: PT in Adelaide supports disabled travellers
QU3: PT in Adelaide offers free wi-fi
QU4: PT in Adelaide has a low carbon footprint
QU5: PT in Adelaide is clean
Safety and Security
SS1: PT is safe in Adelaide
SS2: Adelaide PT drivers handle unruly passengers
SS3: PT shelters in Adelaide are well-lit at night-time
Use Behaviour
USE1: I use PT in the mornings only
USE2: I use PT during off-peak times
USE3: I use PT only during the evening
USE4: I use PT during the week
USE5: I use PT at the weekend
PT Incentives
INC1: I use PT to save money
INC2: I use PT to protect the environment
INC3: I use PT to exercise more
INC4: I use PT to experience the city firsthand
Information Access
INF1: I access PT timetables and information using a mobile device
INF2: I access PT timetables and information from a hotel concierge
INF3: I access PT timetables and information on the platform
INF4: I access PT timetables and information from a newsagency
INF5: I access PT timetables and information from other commuters
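For readers who want the plot to show response labels rather than bare codes, here is a minimal sketch of mapping the 1-5 codes to the labels of the scale described above. The vector `responses` is made-up illustration data, not taken from the survey file, and the names `responses`, `likert_labels` and `responses_f` are hypothetical.

```r
# Toy responses (hypothetical values, not from PublicTransportSurvey.csv)
responses <- c(1, 2, 3, 4, 5, 4, 3)

# Labels in scale order: 1 = Strongly disagree ... 5 = Strongly agree
likert_labels <- c("Strongly disagree", "Disagree", "Neutral",
                   "Agree", "Strongly agree")

# Map the codes to an ordered factor; any code outside 1-5 would become NA
responses_f <- factor(responses, levels = 1:5, labels = likert_labels,
                      ordered = TRUE)

# Frequency of each response category
table(responses_f)
```

Fixing the levels explicitly keeps all five categories in the factor even when a category never occurs in a given item.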
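However, as the attached image of the dataset variables shows, some of the variables also contain values outside the 1-5 range.
I have been stuck on this problem for the past 4 hours and have tried searching for an answer.
My ultimate objective is to remove the outliers (values above 5) from the variables above and then draw a Likert plot. Could someone please suggest how to solve this?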
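Before removing anything, it can help to see which columns actually contain out-of-range values. The following is a sketch on a small made-up data frame. `toy` and `out_of_range` are hypothetical names, and the values are invented. It assumes the Likert columns are already numeric.

```r
# Toy data frame standing in for the Likert columns (values are made up)
toy <- data.frame(
  PU1 = c(1, 5, 3, 7),
  PU2 = c(2, 4, 55, 0),
  QU1 = c(3, 3, 4, 5)
)

# For each column, count the values that fall outside the 1-5 range
out_of_range <- sapply(toy, function(x) sum(x < 1 | x > 5, na.rm = TRUE))
out_of_range

# Names of the columns that need cleaning
names(out_of_range)[out_of_range > 0]
```

This check flags values below 1 as well as values above 5, which is slightly broader than the "above 5" rule stated in the question.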
Thanks in advance.
My solution:
library(likert)
# read.csv2() uses ";" as the default field separator
df <- read.csv2("PublicTransportSurvey.csv")
# Keep only the Likert items (columns 12 to 54)
df <- df[, 12:54]
# Convert factor and character columns to numeric
df[sapply(df, is.factor)] <- lapply(df[sapply(df, is.factor)], function(x) as.numeric(as.character(x)))
df[sapply(df, is.character)] <- lapply(df[sapply(df, is.character)], function(x) as.numeric(x))
# Replace every value greater than 5 with NA
df <- data.frame(apply(df, 2, function(x) ifelse(x > 5, NA, x)))
# likert() expects factor columns
df <- data.frame(lapply(df, function(x) as.factor(x)))
likert_df <- likert(df)
plot(likert_df)
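First, I removed the columns that are not Likert variables. Then I converted the factor and character columns to numeric and replaced all values greater than 5 with NAs because, as far as I know, the likert package ignores these values.
Then I converted all columns back to factors, because the likert function requires this. The code produces this image: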
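One possible refinement of the steps above, offered as a sketch rather than a replacement: as far as I know, `likert()` expects every column to have the same factor levels, so a column in which one response never occurs can cause an error after `as.factor()`. Passing explicit levels to `factor()` avoids this and also turns any value outside 1-5 into NA in a single step. The data frame `toy` and the name `clean` are hypothetical, and the values are invented.

```r
# Toy data frame standing in for the Likert columns (values are made up)
toy <- data.frame(
  PU1 = c(1, 5, 3, 7, 2),
  PU2 = c(2, 4, 55, 4, 5),   # no 1s and no 3s in this column
  QU1 = c(3, 3, 4, 5, 1)
)

# factor() with explicit levels turns anything outside 1-5 into NA
# and gives every column the same five levels
clean <- as.data.frame(lapply(toy, factor, levels = 1:5))

sapply(clean, nlevels)    # number of levels per column
colSums(is.na(clean))     # number of values set to NA per column

# With the likert package installed, the result can be passed on:
# plot(likert::likert(clean))
```

Unlike the `x > 5` rule, this approach also discards values below 1, so it is only appropriate if those should be treated as invalid too.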