Displaying Unique Temperature Values for the US

I want to do something in SQL similar to what I did in R (below):

clim_data %>%
select(Year, AverageTemperature, State) %>%
group_by(Year,State) %>%
summarize(value = mean(AverageTemperature), .groups = 'drop') -> clim_data3

colnames(clim_data3)[2] <- "region" 
clim_data3$region<-tolower(clim_data3$region)

clim_data3 %>%
filter(Year==1900) -> clim_data1900
clim_data1900<-clim_data1900[,2:3]

The output of this code looks like this:

region  value
<chr>   <dbl>
alabama 17.059167
alaska  -5.146500
arizona 15.742917
arkansas    15.893417
california  14.51575

So far, in SQL, I have managed to output a single state with the following code:

select distinct year, round(avg(AverageTemperature) over (partition by year),2) as avgTemp, Country, State
from dbo.landTemps
where AverageTemperature is not null and 
    Country = 'United States' and 
    state = 'Alabama' and
    year = 1900

The output looks like this:

year    avgTemp   Country        State
1900    17.06   United States   Alabama

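However, I can't get a unique avgTemp for each state. When I extend the query to multiple states, I get the same avgTemp for every state in the query. So if I run a query like this: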

select distinct year, round(avg(AverageTemperature) over (partition by year),2) as avgTemp, Country, State
from dbo.landTemps
where AverageTemperature is not null and 
    Country = 'United States' and 
    state like 'A%' and
    year = 1900

I get the average across all of those states.

year    avgTemp   Country        State
1900    15.15   United States   Alaska
1900    15.15   United States   Arizona
1900    15.15   United States   Arkansas
1900    15.15   United States   Alabama

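I assume I need to write some kind of subquery that goes through each state and returns its avgTemp. I also tried partitioning by state, but it didn't give me what I wanted. My overall goal is to print out the avgTemp for each state for a given year.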

The dataset contains multiple values per year and state:

select year, round(avg(AverageTemperature) over (partition by year),2) as avgTemp, Country, State
from dbo.landTemps
where AverageTemperature is not null and 
    Country = 'United States' and 
    state like 'A%' and
    year = 1900

output: 
year    avgTemp  Country         State
1900    15.15   United States   Alabama
1900    15.15   United States   Alabama
1900    15.15   United States   Alabama
1900    15.15   United States   Alabama

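There are many more Alabama entries, with the other 'A' states below them, which is why I figured I had to use distinct year. I'm just stuck on extending the query from a single state to multiple states without getting the average of all states in the query. The dataset can be found here: https://www.kaggle.com/berkeleyearth/climate-change-earth-surface-temperature-data?select=GlobalLandTemperaturesByState.csv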

Thanks for your help!

Since you are looking for the average temperature per state, you should add the State column to the partition by clause.

Try:

select year, round(avg(AverageTemperature) over (partition by year,State),2) as avgTemp, Country, State
from dbo.landTemps
where AverageTemperature is not null and 
    Country = 'United States' and 
    state like 'A%' and
    year = 1900
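
To see why the partition columns matter, here is a minimal sketch that runs both variants against a tiny in-memory table. It assumes SQLite (through Python's sqlite3 module, SQLite 3.25 or later for window functions) as a stand-in for SQL Server, and the four sample rows are made up for illustration, not taken from the Kaggle dataset. Note that a window function returns one row per source row, so DISTINCT is still needed to collapse the repeats to one row per state.

```python
import sqlite3

# Hypothetical sample rows (year, State, AverageTemperature); not real data.
rows = [
    (1900, "Alabama", 10.0),
    (1900, "Alabama", 20.0),
    (1900, "Alaska", -10.0),
    (1900, "Alaska", 0.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE landTemps (year INTEGER, State TEXT, AverageTemperature REAL)"
)
conn.executemany("INSERT INTO landTemps VALUES (?, ?, ?)", rows)


def window_avg(partition_cols):
    """Average temperature per partition, collapsed to distinct rows."""
    sql = f"""
        SELECT DISTINCT year, State,
               ROUND(AVG(AverageTemperature)
                     OVER (PARTITION BY {partition_cols}), 2) AS avgTemp
        FROM landTemps
        WHERE AverageTemperature IS NOT NULL AND year = 1900
        ORDER BY State
    """
    return conn.execute(sql).fetchall()


# Partitioning by year only: every state gets the average of all rows.
by_year = window_avg("year")
# Partitioning by year and State: each state gets its own average.
by_year_state = window_avg("year, State")

print(by_year)        # [(1900, 'Alabama', 5.0), (1900, 'Alaska', 5.0)]
print(by_year_state)  # [(1900, 'Alabama', 15.0), (1900, 'Alaska', -5.0)]
```

With partition by year, both states share the average of all four rows, which matches the repeated 15.15 in the question. Adding State to the partition gives each state its own value.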

You need GROUP BY with the regular aggregate function AVG(). AVG() over (partition by ...) is an analytic (window) function, not a regular aggregate. E.g.:

select  Country, State, year, avg(AverageTemperature) as avgTemp
from dbo.landTemps
where AverageTemperature is not null
group by Country, State, year
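
As a rough check of the GROUP BY approach, the sketch below runs the same kind of query on a small in-memory table, with a year filter and rounding added to match the goal in the question. It assumes SQLite (via Python's sqlite3 module) in place of SQL Server, and the sample rows are hypothetical, chosen only so the averages are easy to verify by hand.

```python
import sqlite3

# Hypothetical sample rows (year, Country, State, AverageTemperature); not real data.
rows = [
    (1900, "United States", "Alabama", 16.0),
    (1900, "United States", "Alabama", 18.0),
    (1900, "United States", "Alaska", -6.0),
    (1900, "United States", "Alaska", -4.0),
    (1901, "United States", "Alabama", 17.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE landTemps "
    "(year INTEGER, Country TEXT, State TEXT, AverageTemperature REAL)"
)
conn.executemany("INSERT INTO landTemps VALUES (?, ?, ?, ?)", rows)

# GROUP BY collapses the source rows to one output row per (Country, State, year),
# so no DISTINCT is needed.
sql = """
    SELECT Country, State, year, ROUND(AVG(AverageTemperature), 2) AS avgTemp
    FROM landTemps
    WHERE AverageTemperature IS NOT NULL
      AND Country = 'United States'
      AND year = ?
    GROUP BY Country, State, year
    ORDER BY State
"""
result = conn.execute(sql, (1900,)).fetchall()

for row in result:
    print(row)
# ('United States', 'Alabama', 1900, 17.0)
# ('United States', 'Alaska', 1900, -5.0)
```

Each state appears exactly once with its own average, which is the same shape as the region/value table produced by the R code in the question.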