Need to write one query instead of 5 queries to get the results for all 5 states

The CSV file has about 62,000 rows and contains states and counties (county names are unique within a given state).

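I had to write 5 queries against a view. Each query retrieves, for one of the states, State_Name, Date, and MAX(SumConfirmed), that is, the date on which that state had its highest number of confirmed cases.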

SELECT State_Name, Date, ConfirmedCases AS Max_ConfirmedCases
FROM covid_by_state
WHERE ConfirmedCases =
    (SELECT max(ConfirmedCases) AS Max_ConfirmedCases
     FROM covid_by_state
     WHERE State_Name='Texas');

The query above gives me the result for one specific state, but I can't figure out how to get the results for all 5 states in a single query.

I think you want a correlated subquery:

SELECT cbs.State_Name, cbs.Date, cbs.ConfirmedCases as Max_ConfirmedCases
FROM covid_by_state cbs
WHERE cbs.ConfirmedCases = (SELECT max(cbs2.ConfirmedCases) 
                            FROM covid_by_state cbs2
                            WHERE cbs2.State_Name = cbs.State_Name
                           );

This returns the rows for all states at once.
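As a quick way to check the behaviour, here is a minimal sketch that runs the same correlated subquery against a tiny in-memory SQLite database from Python. The sample rows are made up for illustration, and covid_by_state is created here as a plain table rather than a view; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# Hypothetical sample data standing in for the covid_by_state view.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_by_state (State_Name TEXT, Date TEXT, ConfirmedCases INTEGER)"
)
conn.executemany(
    "INSERT INTO covid_by_state VALUES (?, ?, ?)",
    [
        ("Texas", "2020-07-01", 100),
        ("Texas", "2020-07-02", 300),
        ("Texas", "2020-07-03", 200),
        ("Ohio", "2020-07-01", 50),
        ("Ohio", "2020-07-02", 80),  # Ohio reaches its maximum on two dates
        ("Ohio", "2020-07-03", 80),
    ],
)

# Same correlated subquery as above: the inner query is evaluated
# per outer row, using that row's State_Name.
query = """
SELECT cbs.State_Name, cbs.Date, cbs.ConfirmedCases AS Max_ConfirmedCases
FROM covid_by_state cbs
WHERE cbs.ConfirmedCases = (SELECT max(cbs2.ConfirmedCases)
                            FROM covid_by_state cbs2
                            WHERE cbs2.State_Name = cbs.State_Name)
ORDER BY cbs.State_Name, cbs.Date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this sample data, Texas yields one row and Ohio yields two, because a state that hits its maximum on several dates gets one row per date.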

Edit:

If you want all the dates on which a given state reached its maximum collected into one row, you can use aggregation:

SELECT cbs.State_Name, GROUP_CONCAT(cbs.Date) as dates,
       cbs.ConfirmedCases as Max_ConfirmedCases
FROM covid_by_state cbs
WHERE cbs.ConfirmedCases = (SELECT max(cbs2.ConfirmedCases) 
                            FROM covid_by_state cbs2
                            WHERE cbs2.State_Name = cbs.State_Name
                           )
GROUP BY cbs.State_Name, cbs.ConfirmedCases

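I'm skipping the view; I don't think it adds anything in terms of readability, and it won't work anyway if you want to start restricting the query to a date range or adding other conditions.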

select
    State_Name,
    max_confirmed_cases_date as Date,
    max(ConfirmedCases) as Max_ConfirmedCases
from (
    select
        State_Name,
        first_value(Date) over (partition by State_Name order by ConfirmedCases desc, Date) max_confirmed_cases_date, 
        ConfirmedCases
    from (
        select Date, State_Name, sum(Daily_Count_Cases) ConfirmedCases
        from Covid_By_County
        group by Date, State_Name
    ) daily_state_totals
) daily_state_totals_with_max_cases_date
group by State_Name, max_confirmed_cases_date

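The innermost subselect is equivalent to your view; it produces one row per state per date, with the total cases. The middle subselect repeats each of those rows, but instead of the row's own date it uses first_value() to find the date on which that state had the most cases (preferring the earlier date over later ones in case of a tie). The outer select then reduces this to one row per state.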

Or, if you are using an older version that does not support window functions:

select
    State_Name,
    date(substr(min(concat(99999999999-ConfirmedCases,Date)),12)) as Date,
    max(ConfirmedCases) as Max_ConfirmedCases
from (
    select Date, State_Name, sum(Daily_Count_Cases) ConfirmedCases
    from Covid_By_County
    group by Date, State_Name
) daily_state_totals
group by State_Name

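This query uses a trick: it gets each state's max-cases date by taking the minimum of a string that encodes both the case count and the date. Subtracting the count from 99999999999 gives an 11-digit number that is smaller for larger counts, so the minimum string belongs to the highest count (and, among ties, to the earliest date); substr(..., 12) then strips those 11 digits and leaves the date. This relies on every daily total staying small enough that the difference always has exactly 11 digits.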
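To see why the minimum of the encoded string picks out the right date, here is a small sketch of the same encoding in plain Python. The daily totals are made-up sample values, and encode() is a hypothetical helper that mirrors concat(99999999999-ConfirmedCases,Date); it is not part of the query itself.

```python
def encode(confirmed_cases: int, date: str) -> str:
    """Mirror concat(99999999999 - ConfirmedCases, Date) from the query."""
    return f"{99999999999 - confirmed_cases}{date}"

# Hypothetical daily totals for one state; the maximum (300) occurs twice.
daily_totals = [
    ("2020-07-01", 100),
    ("2020-07-02", 300),
    ("2020-07-03", 300),
    ("2020-07-04", 200),
]

keys = [encode(cases, date) for date, cases in daily_totals]

# The largest count gives the smallest 11-digit prefix; among equal
# prefixes, the earlier date sorts first as a string.
smallest = min(keys)

# SQL substr(s, 12) is 1-based, so it matches Python's s[11:].
max_date = smallest[11:]

print(smallest)
print(max_date)
```

The string comparison only behaves like a numeric comparison because every prefix has the same length, which is why the fixed 11-digit width matters.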