SQL: querying data by grouping on one main variable (Store) and finding the percentage of customers in other variables

Table - Stores

Stores Date Customer_ID
A 01/01/2020 1111
C 01/01/2020 1111
F 02/01/2020 1234
A 02/01/2020 1111
A 02/01/2020 2222

Table - Customers

Customer_ID Age_Group Income_Level
1111 26-30 Low
1234 25 and below Mid
2222 31-60 High

I would like to know how to get this output.

Stores Age_Group Percentage_by_Age Income_Level Percentage_By_Income
A 25 and below 10 Low 80
A 25 and below 10 Mid 10
A 25 and below 10 High 10
A 26 - 30 42 Low 15
A 26 - 30 42 Mid 65
A 26 - 30 42 High 20
A 31 - 60 48 Low 30
A 31 - 60 48 Mid 50
A 31 - 60 48 High 20

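I am using SQL to query from different tables. First I need to aggregate the number of customers by store. Then, within each store, I want to find how many customers in a particular age group (for example, 25 and below) visited store A, and how many of them fall into each income level.

May I know how to approach this problem?

Thank you.

My current solution/thought process: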

SELECT
    stores AS Stores,
    Age_Group AS Age,
    Income_Level AS Income,
    COUNT(DISTINCT Customer_ID) AS Number_of_Customers
FROM tables JOIN tables....
GROUP BY Stores, Age, Income;

and then calculate the percentages manually. But this doesn't seem right. Is there a way to produce the sample output table using SQL alone?

Based on your requirement, you can use Common Table Expressions (CTEs). You can use the code below to get the expected output.

 WITH
   -- Customers of store A, counted per (age group, income level).
   data_for_percent_by_income AS (
     SELECT
       COUNT(customer_id) AS cus_count_in_per_income_level_and_agegrp,
       Age_group AS age_g,
       income_level AS inc_lvl
     FROM
       `project.dataset.Customer2`
     WHERE
       customer_id IN (
         SELECT customer_id
         FROM `project.dataset.Store5`
         WHERE stores = 'A')
     GROUP BY
       Age_group, income_level
   ),
   -- Customers of store A, counted per age group (despite the CTE's name).
   tot_cus_in_defined_income_level AS (
     SELECT
       COUNT(customer_id) AS cus_count_in_per_income_level,
       Age_group AS ag
     FROM
       `project.dataset.Customer2`
     WHERE
       customer_id IN (
         SELECT customer_id
         FROM `project.dataset.Store5`
         WHERE stores = 'A')
     GROUP BY
       Age_group
   ),
   -- Total number of customers of store A.
   tot_cus_storeA AS (
     SELECT
       COUNT(*) AS tot_cus_in_A
     FROM
       `project.dataset.Customer2`
     WHERE
       customer_id IN (
         SELECT customer_id
         FROM `project.dataset.Store5`
         WHERE stores = 'A')
   ),
   -- Percentage of each income level within its age group.
   final_view AS (
     SELECT
       ROUND(cus_count_in_per_income_level_and_agegrp * 100 / cus_count_in_per_income_level) AS p_by_inc,
       age_g,
       inc_lvl
     FROM
       data_for_percent_by_income
     INNER JOIN
       tot_cus_in_defined_income_level
     ON
       data_for_percent_by_income.age_g = tot_cus_in_defined_income_level.ag
   )
 SELECT
   stores,
   tot_cus_in_defined_income_level.ag AS age_group,
   income_level,
   -- Percentage of store A's customers that fall in this age group.
   ROUND(cus_count_in_per_income_level * 100 / tot_cus_in_A) AS percentage_by_age,
   p_by_inc AS percentage_by_income
 FROM
   tot_cus_in_defined_income_level,
   tot_cus_storeA,
   `project.dataset.Customer2`,
   `project.dataset.Store5`
 INNER JOIN
   final_view
 ON
   age_group = final_view.age_g AND income_level = final_view.inc_lvl
 WHERE
   tot_cus_in_defined_income_level.ag = Age_group AND stores = 'A'
 GROUP BY
   stores, percentage_by_age, age_group, income_level, percentage_by_income
 ORDER BY Age_group

I have attached screenshots of the input tables and the output table.

Customer table

Store table

Output table

SELECT
    s.Stores AS Stores,
    c.age_group AS Age,
    a.income_level AS Affluence,
    CAST(COUNT(DISTINCT c.Customer_ID) AS numeric)*100/SUM(CAST(COUNT(DISTINCT c.Customer_ID) AS numeric)) OVER(PARTITION BY s.Stores ) AS Perc_of_Members

This is what I ended up doing.
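
For reference, the window-function idea in the snippet above can be extended to produce both percentages for every store in one pass. The following is only a sketch under stated assumptions, not the poster's actual query: it assumes BigQuery standard SQL, reuses the table names `project.dataset.Store5` and `project.dataset.Customer2` from the earlier answer, and introduces the hypothetical CTE names `visits` and `counts`.

```sql
-- Sketch only: table names are borrowed from the answer above;
-- `visits` and `counts` are hypothetical helper names.
WITH visits AS (
  -- One row per (store, customer), so repeat visits are not double counted.
  SELECT DISTINCT
    s.Stores,
    c.Customer_ID,
    c.Age_Group,
    c.Income_Level
  FROM `project.dataset.Store5` AS s
  JOIN `project.dataset.Customer2` AS c
    ON s.Customer_ID = c.Customer_ID
),
counts AS (
  -- Number of distinct customers per store, age group and income level.
  SELECT Stores, Age_Group, Income_Level, COUNT(*) AS n
  FROM visits
  GROUP BY Stores, Age_Group, Income_Level
)
SELECT
  Stores,
  Age_Group,
  -- Share of the store's customers that fall in this age group.
  ROUND(100 * SUM(n) OVER (PARTITION BY Stores, Age_Group)
            / SUM(n) OVER (PARTITION BY Stores)) AS Percentage_by_Age,
  Income_Level,
  -- Share of this store and age group's customers at this income level.
  ROUND(100 * n / SUM(n) OVER (PARTITION BY Stores, Age_Group)) AS Percentage_By_Income
FROM counts
ORDER BY Stores, Age_Group, Income_Level;
```

With the sample rows from the question, store A has two distinct customers (1111 and 2222), so each of their age groups accounts for 50 percent of the store and the single income level in each group accounts for 100 percent. Combinations with no customers produce no row at all. If every age group and income level must appear (as in the desired output), the counts would need to be joined against a full list of combinations first.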