SQL Aggregation issue
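I have two tables: Savings and Spend. I need to aggregate them into a single row by month, year, and Site/Location.
The tables are set up as follows. The data has obviously been scrubbed; please ignore the lovely date structure.
Spend Table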
Site# Region Total Spend Month Year
52 Region1 9.01 8 2015
52 Region1 6.41 8 2015
52 Region1 5.97 8 2015
52 Region1 0.54 8 2015
52 Region1 1.42 8 2015
52 Region1 4.79 8 2015
52 Region1 3.82 8 2015
52 Region1 3.01 8 2015
52 Region1 0.48 8 2015
52 Region1 8.35 8 2015
52 Region1 6.33 8 2015
52 Region1 ,360 8 2015
52 Region1 ,4292 8 2015
52 Region1 ,7191 8 2015
52 Region1 .77 8 2015
52 Region1 .48 8 2015
52 Region1 .66 8 2015
52 Region1 .59 8 2015
52 Region1 .03 8 2015
52 Region1 .77 8 2015
52 Region1 ,3851 8 2015
52 Region1 4.07 8 2015
52 Region1 7.81 8 2015
52 Region1 ,7327 8 2015
52 Region1 9.94 8 2015
52 Region1 0.72 8 2015
52 Region1 7.31 8 2015
52 Region1 ,5069 8 2015
52 Region1 7.81 8 2015
52 Region1 ,7397 8 2015
52 Region1 4.74 8 2015
52 Region1 4.96 8 2015
52 Region1 ,5456 8 2015
52 Region1 4.53 8 2015
52 Region1 ,9049 8 2015
52 Region1 .21 8 2015
52 Region1 2.84 8 2015
52 Region1 8.43 8 2015
52 Region1 2.64 8 2015
52 Region1 7.25 8 2015
52 Region1 ,8732 8 2015
52 Region1 8.03 8 2015
52 Region1 3.72 8 2015
52 Region1 5.68 8 2015
52 Region1 .70 8 2015
52 Region1 2.09 8 2015
52 Region1 6.91 8 2015
52 Region1 ,0666 8 2015
52 Region1 ,0967 8 2015
52 Region1 8.57 8 2015
52 Region1 9.06 8 2015
52 Region1 .28 8 2015
52 Region1 ,2232 8 2015
52 Region1 4.80 8 2015
52 Region1 5.42 8 2015
52 Region1 .88 8 2015
52 Region1 .36 8 2015
52 Region1 5.33 8 2015
52 Region1 8.49 8 2015
52 Region1 .86 8 2015
52 Region1 9.09 8 2015
52 Region1 .83 8 2015
52 Region1 ,2097 8 2015
52 Region1 .55 8 2015
52 Region1 ,7307 8 2015
52 Region1 .46 8 2015
52 Region1 6.71 8 2015
52 Region1 .94 8 2015
52 Region1 9.58 8 2015
52 Region1 6.86 8 2015
52 Region1 ,5295 8 2015
52 Region1 4.03 8 2015
52 Region1 ,8494 8 2015
52 Region1 ,4338 8 2015
52 Region1 4.40 8 2015
52 Region1 .20 8 2015
52 Region1 1.45 8 2015
52 Region1 ,6264 8 2015
52 Region1 4.53 8 2015
52 Region1 5.24 8 2015
52 Region1 9.69 8 2015
52 Region1 8.57 8 2015
52 Region1 8.43 8 2015
52 Region1 .95 8 2015
52 Region1 .95 8 2015
52 Region1 .09 8 2015
52 Region1 ,8848 8 2015
52 Region1 5.33 8 2015
52 Region1 ,3658 8 2015
52 Region1 0.30 8 2015
52 Region1 ,445 8 2015
52 Region1 4.12 8 2015
52 Region1 0.68 8 2015
52 Region1 .47 8 2015
52 Region1 3.65 8 2015
52 Region1 6.40 8 2015
52 Region1 6.01 8 2015
Savings Table
Month Year Site Region Total Savings
8 2015 52 Region1 ,950.05
8 2015 52 Region1 4.49
8 2015 52 Region1 ,548.54
8 2015 52 Region1 ,433.42
8 2015 52 Region1 ,073.94
8 2015 52 Region1 ,956.75
8 2015 52 Region1 5.30
8 2015 52 Region1 ,107.72
8 2015 52 Region1 2.97
8 2015 52 Region1 ,580.52
My expected output is as follows:
Site# Region Month Year Total Savings Total Spend
52 Region1 8 2015 16453.7 109866.17
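Obviously there is far more data than this, and because the data is sensitive my real query is much longer than any example I can share. The query I am running is close to this: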
SELECT
[s].[Month],
[s].[Year],
[s].[Site],
[s].[Region],
SUM([s].[Total Savings]),
SUM([sp].Total Spend)
FROM [Savings] AS [s]
LEFT JOIN (
SELECT
[Total Spend]
FROM [Spend]
) AS [sp]
ON [s].[Month] = [sp].[Month]
AND [s].[Year] = [sp].[Year]
AND [s].[Site] = [sp].[Site]
GROUP BY
[s].[Month],
[s].[Year],
[s].[Site],
[s].[Region]
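The problem with this code is that I get a lot of unexpected aggregation, and the values are being multiplied. Sometimes I can get the savings calculated correctly, but then it is summed on every row.
My question is: what is the most appropriate way to combine data structured like this and still be able to report on every column (assuming they are not unique)? I know I could write a subquery for each column, but that feels like terrible practice.
TL;DR - I have two tables that need to be joined with aggregation, and I need to be able to select all columns from both tables.
This is on Microsoft SQL Server via Tableau.
Edit
I just tried this query: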
SELECT
SUM(CAST([ms].[USD_SavingsAmt] AS decimal(38,2))) AS [Total Savings],
SUM([s].[USD_SpendAmt]) AS [Total Spend],
[ms].[MOR_Reporting_Year] AS [Year],
[ms].[MOR_Reporting_Month] AS [Month],
[ms].[Site#] AS [Site]
FROM [MonthlySavings_14637] AS [ms], [MonthlySpend_14637] AS [s]
WHERE [ms].[MOR_Reporting_Year] = [s].[MOR_Reporting_Year]
AND [ms].[MOR_Reporting_Month] = [s].[MOR_Reporting_Month]
AND [ms].[Site#] = [s].[Site#]
AND [s].[Site#] = '52'
AND [ms].[MOR_Reporting_Month] = '8'
AND [ms].[MOR_Reporting_Year] = '2015'
GROUP BY
[ms].[MOR_Reporting_Year],
[ms].[MOR_Reporting_Month],
[ms].[Site#]
and got this result:
Site Month Total Savings Total Spend Year
52 8 1,596,008.90 1,098,661.65 2,015
The values are duplicated.
TTG Guy, using your logic:
SELECT
SUM([ms].[Total Savings]) AS [Total Savings],
SUM([s].[USD_SpendAmt]) AS [Total Spend],
[s].[MOR_Reporting_Year] AS [Year],
[s].[MOR_Reporting_Month] AS [Month],
[s].[Site#] AS [Site]
FROM [MonthlySpend_14637] AS [s]
INNER JOIN
(
SELECT
SUM([MonthlySavings_14637].[USD_SavingsAmt]) AS [Total Savings],
[MonthlySavings_14637].[MOR_Reporting_Month] AS [Month],
[MonthlySavings_14637].[MOR_Reporting_Year] AS [Year],
[MonthlySavings_14637].[Site#] AS [Site]
FROM [MonthlySavings_14637]
GROUP BY [MOR_Reporting_Month], [MOR_Reporting_Year], [Site#]
) AS [ms]
ON [ms].[Site]=[s].[Site#]
AND [ms].[Month] = [s].[MOR_Reporting_Month]
AND [ms].[Year] = [s].[MOR_Reporting_Year]
WHERE
[s].[Site#] = '52'
AND [s].[MOR_Reporting_Month] = '8'
AND [s].[MOR_Reporting_Year] = '2015'
GROUP BY [s].[MOR_Reporting_Month], [s].[MOR_Reporting_Year], [s].[Site#]
I get:
Site Month Total Savings Total Spend Year
52 8 1,596,008.90 109,866.17 2,015
The spend is correct!
-- Aggregate each table to one row per Month/Year/Site/Region first,
-- then join the two aggregated results so that no rows are multiplied.
SELECT
    [s_agg].[Month],
    [s_agg].[Year],
    [s_agg].[Site],
    [s_agg].[Region],
    [s_agg].[tot_sav],
    [sp_agg].[tot_sp]
FROM (
    SELECT [s].[Month],
           [s].[Year],
           [s].[Site],
           [s].[Region],
           SUM([s].[Total Savings]) AS [tot_sav]
    FROM [Savings] AS [s]
    GROUP BY
        [s].[Month],
        [s].[Year],
        [s].[Site],
        [s].[Region]
) AS [s_agg]
LEFT JOIN (
    SELECT [sp].[Month],
           [sp].[Year],
           [sp].[Site],
           [sp].[Region],
           SUM([sp].[Total Spend]) AS [tot_sp]
    FROM [Spend] AS [sp]
    GROUP BY [sp].[Month],
             [sp].[Year],
             [sp].[Site],
             [sp].[Region]
) AS [sp_agg]
    ON  [s_agg].[Month]  = [sp_agg].[Month]
    AND [s_agg].[Year]   = [sp_agg].[Year]
    AND [s_agg].[Site]   = [sp_agg].[Site]
    -- Region is part of both GROUP BYs, so it is matched here as well
    AND [s_agg].[Region] = [sp_agg].[Region]
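Why aggregating first matters: a join on the raw tables pairs every savings row with every spend row for the same site and month, so each amount is counted once per matching row on the other side. The sample in the question lists 10 savings rows and 97 spend rows, which is consistent with the inflated totals reported there (16,453.70 × 97 = 1,596,008.90, and 109,866.17 × 10 ≈ 1,098,661.65). The following is only a minimal sketch of that effect. The rows are made up and supplied through inline VALUES lists, and the CTE names simply mirror the question's tables:

```sql
-- Hypothetical rows for illustration only: 2 savings rows and 3 spend rows
-- for the same site and month (not the question's real data).
WITH Savings AS (
    SELECT * FROM (VALUES
        (52, 8, 2015, 100.00),
        (52, 8, 2015,  50.00)
    ) AS v ([Site], [Month], [Year], [Total Savings])
),
Spend AS (
    SELECT * FROM (VALUES
        (52, 8, 2015, 10.00),
        (52, 8, 2015, 20.00),
        (52, 8, 2015, 30.00)
    ) AS v ([Site], [Month], [Year], [Total Spend])
)
-- Raw join: 2 x 3 = 6 joined rows, so every amount is counted repeatedly.
SELECT 'raw join' AS [method],
       SUM(s.[Total Savings]) AS [tot_sav],   -- (100 + 50) * 3 = 450.00
       SUM(sp.[Total Spend])  AS [tot_sp]     -- (10 + 20 + 30) * 2 = 120.00
FROM Savings AS s
JOIN Spend AS sp
  ON  s.[Site]  = sp.[Site]
  AND s.[Month] = sp.[Month]
  AND s.[Year]  = sp.[Year]
UNION ALL
-- Aggregate first, then join: one row per side, nothing is multiplied.
SELECT 'aggregate then join',
       s_agg.[tot_sav],                       -- 150.00
       sp_agg.[tot_sp]                        -- 60.00
FROM (SELECT [Site], [Month], [Year], SUM([Total Savings]) AS [tot_sav]
      FROM Savings
      GROUP BY [Site], [Month], [Year]) AS s_agg
JOIN (SELECT [Site], [Month], [Year], SUM([Total Spend]) AS [tot_sp]
      FROM Spend
      GROUP BY [Site], [Month], [Year]) AS sp_agg
  ON  s_agg.[Site]  = sp_agg.[Site]
  AND s_agg.[Month] = sp_agg.[Month]
  AND s_agg.[Year]  = sp_agg.[Year];
```

The first result row shows both totals inflated by the row count of the other table; the second shows the totals each table would give on its own.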
You can always UNION them and then SUM them:
SELECT site, region, month, year,
       SUM([total savings]) AS [total savings],
       SUM([total spend])   AS [total spend]
FROM (
    -- savings rows carry 0 in the spend column
    SELECT site, region, month, year, [Total Savings] AS [total savings], 0 AS [total spend]
    FROM savings
    UNION ALL
    -- spend rows carry 0 in the savings column
    SELECT site#, region, month, year, 0, [Total Spend]
    FROM spend
) unionsubquery
GROUP BY site, region, month, year
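One property of the UNION ALL approach is worth spelling out: because the rows of both tables are stacked before grouping, a site/month that exists in only one table still appears in the result, with 0 in the other column. A join-based query would either drop such a group (INNER JOIN) or keep it from one side only (LEFT JOIN). A small sketch, again with made-up rows in inline VALUES lists rather than the real tables:

```sql
-- Hypothetical rows: month 9 has savings only, site 53 has spend only.
WITH Savings AS (
    SELECT * FROM (VALUES
        (52, 8, 2015, 100.00),
        (52, 8, 2015,  50.00),
        (52, 9, 2015,  25.00)
    ) AS v ([Site], [Month], [Year], [Total Savings])
),
Spend AS (
    SELECT * FROM (VALUES
        (52, 8, 2015, 10.00),
        (52, 8, 2015, 20.00),
        (53, 8, 2015,  5.00)
    ) AS v ([Site], [Month], [Year], [Total Spend])
)
SELECT [Site], [Month], [Year],
       SUM([Total Savings]) AS [Total Savings],
       SUM([Total Spend])   AS [Total Spend]
FROM (
    SELECT [Site], [Month], [Year], [Total Savings], 0 AS [Total Spend]
    FROM Savings
    UNION ALL
    SELECT [Site], [Month], [Year], 0, [Total Spend]
    FROM Spend
) AS u
GROUP BY [Site], [Month], [Year]
ORDER BY [Site], [Year], [Month];
-- Site  Month  Year  Total Savings  Total Spend
-- 52    8      2015  150.00         30.00
-- 52    9      2015   25.00          0.00
-- 53    8      2015    0.00          5.00
```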
I know you have said in a few places that you ran into trouble using subqueries, but this one seems to work fine for me:
SELECT sp.site#, sp.region, sp.month, sp.year,
       SavingsTotals.[total savings],
       SUM(sp.[Total Spend]) AS [Total Spend]
FROM spend sp
INNER JOIN (
    -- one pre-aggregated savings row per site/region/month/year
    SELECT site, region, month, year, SUM([total savings]) AS [total savings]
    FROM savings
    GROUP BY site, region, month, year
) SavingsTotals
    ON  SavingsTotals.site   = sp.site#
    AND SavingsTotals.month  = sp.month
    AND SavingsTotals.year   = sp.year
    AND SavingsTotals.region = sp.region
-- group by the savings total instead of summing it again
GROUP BY sp.site#, sp.region, sp.month, sp.year, SavingsTotals.[total savings]
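Grouping by the pre-aggregated savings total, rather than wrapping it in SUM, is the detail that matters here. In the edit to the question the savings subquery was already aggregated, but the outer query summed it again, so the total was added once per joined spend row and still came out as 1,596,008.90 while spend was correct. A minimal sketch of the difference, using made-up rows and a hypothetical one-row SavingsTotals CTE standing in for the subquery:

```sql
-- Hypothetical data: 3 raw spend rows and one already-aggregated savings row.
WITH Spend AS (
    SELECT * FROM (VALUES
        (52, 8, 2015, 10.00),
        (52, 8, 2015, 20.00),
        (52, 8, 2015, 30.00)
    ) AS v ([Site], [Month], [Year], [Total Spend])
),
SavingsTotals AS (
    SELECT * FROM (VALUES
        (52, 8, 2015, 150.00)
    ) AS v ([Site], [Month], [Year], [Total Savings])
)
SELECT sp.[Site], sp.[Month], sp.[Year],
       SUM(st.[Total Savings]) AS [summed_again],  -- 150 * 3 rows = 450.00 (inflated)
       MAX(st.[Total Savings]) AS [taken_once],    -- 150.00 (correct)
       SUM(sp.[Total Spend])   AS [Total Spend]    -- 10 + 20 + 30 = 60.00
FROM Spend AS sp
JOIN SavingsTotals AS st
  ON  st.[Site]  = sp.[Site]
  AND st.[Month] = sp.[Month]
  AND st.[Year]  = sp.[Year]
GROUP BY sp.[Site], sp.[Month], sp.[Year];
```

MAX is used here only to show the value being picked up once; adding the pre-aggregated column to the GROUP BY, as the query above does, has the same effect.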