Convert histogram to density graph in R
I generated the following histogram with the programming language R:
#subquestion C
total_2016<-qres2016+qres2_2016
total_2017<-qres2017+qres2_2017
total_2018<-qres2018+qres2_2018
total_2019<-qres2019+qres2_2019
total_2020<-qres2020+qres2_2020
year=c("2016","2017","2018","2019","2020")
contribution=c(total_2016,total_2017,total_2018,total_2019,total_2020)
df = data.frame(year,contribution)
library(ggplot2)
library(scales)  # provides comma() for the axis labels
ggplot(df, aes(year, contribution)) +
  geom_bar(stat = "identity", fill = colors()[128]) +
  ggtitle("Histogram chart showing the total number of articles per year for both diseases") +
  scale_y_continuous(labels = comma)
I would also like to produce the graph as a density plot. However, when I replace geom_bar with geom_density, I get the following message:
Groups with fewer than two data points have been dropped.
and nothing is plotted. What am I doing wrong?
EDIT: I should mention that qres2016 and the other variables are all integers. This is how I get them:
res2016<-EUtilsSummary("Typhoid meningitis",type="esearch",db="pubmed",datetype="pdat",mindate=2016,maxdate=2016,retmax=500)
res2017<-EUtilsSummary("Typhoid meningitis",type="esearch",db="pubmed",datetype="pdat",mindate=2017,maxdate=2017,retmax=500)
res2018<-EUtilsSummary("Typhoid meningitis",type="esearch",db="pubmed",datetype="pdat",mindate=2018,maxdate=2018,retmax=500)
res2019<-EUtilsSummary("Typhoid meningitis",type="esearch",db="pubmed",datetype="pdat",mindate=2019,maxdate=2019,retmax=500)
res2020<-EUtilsSummary("Typhoid meningitis",type="esearch",db="pubmed",datetype="pdat",mindate=2020,maxdate=2020,retmax=500)
qres2016<-QueryCount(res2016) #counting results
qres2017<-QueryCount(res2017) #counting results
qres2018<-QueryCount(res2018) #counting results
qres2019<-QueryCount(res2019) #counting results
qres2020<-QueryCount(res2020) #counting results
a<- "Total number of articles the last five years for M434: "
qrestotal<-qres2016+qres2017+qres2018+qres2019+qres2020
print(paste(a,qrestotal))
#searching pubmed for second disease
res2_2016<-EUtilsSummary("Sequelae of rickets",type="esearch",db="pubmed",datetype="pdat",mindate=2016,maxdate=2016,retmax=500)
res2_2017<-EUtilsSummary("Sequelae of rickets",type="esearch",db="pubmed",datetype="pdat",mindate=2017,maxdate=2017,retmax=500)
res2_2018<-EUtilsSummary("Sequelae of rickets",type="esearch",db="pubmed",datetype="pdat",mindate=2018,maxdate=2018,retmax=500)
res2_2019<-EUtilsSummary("Sequelae of rickets",type="esearch",db="pubmed",datetype="pdat",mindate=2019,maxdate=2019,retmax=500)
res2_2020<-EUtilsSummary("Sequelae of rickets",type="esearch",db="pubmed",datetype="pdat",mindate=2020,maxdate=2020,retmax=500)
qres2_2016<-QueryCount(res2_2016) #counting results
qres2_2017<-QueryCount(res2_2017) #counting results
qres2_2018<-QueryCount(res2_2018) #counting results
qres2_2019<-QueryCount(res2_2019) #counting results
qres2_2020<-QueryCount(res2_2020) #counting results
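What I need to do is plot the number of articles per year as a density plot, but I get the message mentioned above.
The plot I am trying to make looks like this: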
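A density plot shows the density of discrete events along the x axis as a smoothed value on the y axis. What you have is one count per year, which is not the kind of data a density plot is built for. Probably the closest equivalent is a smoothed area chart. To do that fairly, though, you have to annualize the 2020 figure, because the year is not over yet and the raw count would understate the publication rate.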
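To see where the message comes from, here is a minimal sketch with made-up counts (the numbers and the `events` and `dens` names are placeholders, not values from the question). With `year` stored as text, ggplot2 most likely treats each year as its own group, and each group holds a single row, so every group gets dropped. A density estimate wants one row per event instead, which you can imitate by repeating each year once per article:

```r
# Placeholder yearly totals; the real ones come from QueryCount() in the question.
df <- data.frame(
  year = c("2016", "2017", "2018", "2019", "2020"),
  contribution = c(120L, 135L, 150L, 160L, 90L)
)

# One row per article: each year is repeated as many times as its count.
events <- data.frame(
  year = rep(as.numeric(df$year), times = df$contribution)
)

# Kernel density estimate over the expanded observations (base R).
# The bandwidth of 0.5 years is an arbitrary choice for illustration.
dens <- density(events$year, bw = 0.5)

# A ggplot2 version of the same idea would be:
# ggplot(events, aes(year)) + geom_density(bw = 0.5)
```

Even in this shape there are only five distinct x values, so the curve mostly reflects whatever bandwidth you pick rather than anything in the data.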
I think the following smoothed area chart is about as close as you are going to get:
library(ggplot2)

df$year <- as.numeric(as.character(df$year))
df$annualized_counts <- df$contribution
# Scale the partial 2020 count up to a full leap year (366 days).
# This only makes sense while the code is run during 2020.
df$annualized_counts[5] <- df$contribution[5] * 366 / lubridate::yday(lubridate::now())

# Interpolate the five yearly values onto 100 points to get a smooth curve.
plot_df <- data.frame(year   = seq(2016, 2020, length.out = 100),
                      counts = spline(df$annualized_counts, n = 100)$y)

ggplot(plot_df, aes(year, counts)) +
  geom_area(colour = "forestgreen", fill = "forestgreen", alpha = 0.4) +
  geom_point(data = df, aes(x = year, y = annualized_counts)) +
  labs(title = "Annualized citation count for typhoid meningitis and rickets",
       y = "Annualized citations")
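The annualization step depends on the date the code is run, so it may help to see the arithmetic with a fixed date. This is only a sketch: `annualize` is a hypothetical helper, not something from the answer or from lubridate, and the count of 100 is a placeholder.

```r
# Hypothetical helper: scale a partial-year count up to a full-year estimate,
# given the date on which the count was taken.
annualize <- function(count, as_of) {
  as_of <- as.Date(as_of)
  year_end <- as.Date(paste0(format(as_of, "%Y"), "-12-31"))
  days_elapsed <- as.integer(format(as_of, "%j"))     # day of year, starting at 1
  days_in_year <- as.integer(format(year_end, "%j"))  # 365, or 366 in a leap year
  count * days_in_year / days_elapsed
}

# Placeholder input: 100 articles counted by 1 July 2020, which is day 183 of 366.
estimate <- annualize(100, "2020-07-01")
print(estimate)  # 200
```

Passing the date in explicitly keeps the result reproducible, whereas a call based on the current time gives a different estimate every day and stops being meaningful once 2020 has ended.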