How can I insert several columns from one table into another while keeping only one column unique/distinct?
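I'm trying to build a star schema and am currently working on the dimension tables. I want to copy several columns from one table to another, while keeping the values of one of those columns unique.
These are the tables I'm using: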
DWH_PRICE_PAID_RECORDS
CREATE TABLE "DWH_PRICE_PAID_RECORDS" (
    "TRANSACTION_ID" VARCHAR(50) NOT NULL,
    "PRICE" INTEGER,
    "DATE_OF_TRANSFER" DATE NOT NULL,
    "PROPERTY_TYPE" CHAR(1),
    "OLD_NEW" CHAR(1),
    "DURATION" CHAR(1),
    "TOWN_CITY" VARCHAR(50),
    "DISTRICT" VARCHAR(50),
    "COUNTY" VARCHAR(50),
    "PPDCATEGORY_TYPE" CHAR(1),
    "RECORD_TYPE" CHAR(1)
);
ALTER TABLE "DWH_PRICE_PAID_RECORDS" ADD CONSTRAINT "PK3" PRIMARY KEY ("TRANSACTION_ID");
and DIM_REGION
CREATE TABLE "DIM_REGION" (
    "REGION_ID" INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
    "TRANSACTION_ID" VARCHAR(50),
    "TOWN" VARCHAR(50),
    "COUNTY" VARCHAR(50),
    "DISTRICT" VARCHAR(50),
    "LATITUDE" VARCHAR(50),
    "LONGITUDE" VARCHAR(50),
    "COUNTRY_STRING" VARCHAR(50)
);
ALTER TABLE "DIM_REGION" ADD CONSTRAINT "PK8" PRIMARY KEY ("REGION_ID");
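My first attempt was to use SELECT DISTINCT, but that only removes rows that are duplicated across all of the selected columns combined. I want a region dimension in which "town" is the identifier used to match DIM_REGION against the fact table (called DM_PRICE_PAID_RECORDS) that I will create later in the data mart.
The DWH_PRICE_PAID_RECORDS table has about 10k records but only 938 distinct towns. I want those 938 towns in DIM_REGION, each with its own ID, together with the other columns such as county and district.
This works, but of course everything except the town is NULL: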
INSERT INTO DIM_REGION (TOWN) SELECT (town_city) from DWH_PRICE_PAID_RECORDS GROUP BY town_city;
So I thought I just had to add the additional columns:
INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city;
But when I do that, I get this error message (the original message was in German, sorry, I had to translate it):
ERROR 42Y36 Column reference: "DWH_PRICE_PAID_RECORDS.COUNTY" is invalid or part of a invalid statement. When using SELECT and GROUP BY the selected columns and statements must be valid group- or aggregation expressions.
Can you help me? Or do you have another idea for how I could get the result I want?
Thanks a lot!
You're so close!
INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city, county, district;
That should do the job. When you use GROUP BY, every expression in the SELECT list that is not an aggregate must also appear in the GROUP BY clause.
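One caveat worth checking: grouping by all three columns returns one row per distinct (town, county, district) combination, so a town that appears with more than one county or district will still end up with several rows. A minimal sketch of a check for such towns, assuming the table definition from the question (the alias combos and the column name combo_count are only illustrative):

```sql
-- List towns that occur with more than one (county, district) combination.
-- If this returns no rows, grouping by all three columns gives one row per town.
SELECT town_city, COUNT(*) AS combo_count
FROM (
    SELECT DISTINCT town_city, county, district
    FROM DWH_PRICE_PAID_RECORDS
) combos
GROUP BY town_city
HAVING COUNT(*) > 1;
```

If the query does return rows, those towns need a rule for choosing which county and district to keep.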
By the way, does TRANSACTION_ID really belong in a dimension table?
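The transaction is a property of the fact row rather than of the region, and the fact row can look up its REGION_ID by town. A rough sketch, assuming TOWN ends up unique in DIM_REGION after loading; the layout of DM_PRICE_PAID_RECORDS is not shown in the question, so this is only a SELECT that produces the kind of rows such a fact table might hold:

```sql
-- Pair each transaction with the surrogate key of its town.
-- The aliases p and r are illustrative. Rows whose TOWN_CITY is NULL
-- have no match and are dropped by the inner join.
SELECT p.TRANSACTION_ID, r.REGION_ID, p.PRICE, p.DATE_OF_TRANSFER
FROM DWH_PRICE_PAID_RECORDS p
JOIN DIM_REGION r ON r.TOWN = p.TOWN_CITY;
```

With a lookup like this, DIM_REGION does not need to store any TRANSACTION_ID at all.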
If it doesn't matter which county and district values are kept for a town, you can do this:
INSERT INTO DIM_REGION (TOWN, County, District)
SELECT town_city, MAX(county), MAX(district)
FROM DWH_PRICE_PAID_RECORDS
GROUP BY town_city
This gives you exactly one row per town.
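Keep in mind that MAX(county) and MAX(district) are evaluated independently, so the county and district stored for a town may come from different source rows. After loading, a quick sanity check (a sketch, assuming DIM_REGION was empty before the insert; the column aliases are illustrative) is to compare the row count with the number of distinct towns:

```sql
-- Both numbers should match if every town was inserted exactly once.
-- COUNT(DISTINCT ...) ignores NULL, so a row for a NULL town would
-- make total_rows larger by one.
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT TOWN) AS distinct_towns
FROM DIM_REGION;
```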