Old table data becomes NULL after adding a new column with a SELECT in Impala/Hive
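I am trying to use a (SELECT, JOIN) query to add data to a new column in Impala. As soon as I add data to the new column, I lose the data in all the other columns (they become NULL).
Here I create the first table: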
CREATE TABLE mng_exp.KPI_LATENCE_JOUR
(
CODEINSEE INT,
IMEI BIGINT,
SEMAINE INT,
MOYENNE_LATENCE INT,
MAXIMUM_LATENCE INT,
MINIMUM_LATENCE INT,
TRANCHE_DE_LATENCE INT -- latency bucket 0-3, filled by the INSERT below
)
I add data to the table:
INSERT INTO mng_exp.KPI_LATENCE_JOUR (CODEINSEE,IMEI, SEMAINE, MOYENNE_LATENCE,MAXIMUM_LATENCE,MINIMUM_LATENCE,TRANCHE_DE_LATENCE)
SELECT codeinsee, device_dim__imei as IMEI,weekofyear(jour) as SEMAINE, cast(round(avg(rtt_avg_ms)) as integer) as MOYENNE_LATENCE,
cast(round(avg(rtt_max_ms)) as integer) as MAXIMUM_LATENCE, cast(round(avg(rtt_min_ms)) as integer) as MINIMUM_LATENCE ,
CASE WHEN ( round(avg(rtt_avg_ms)) > 0 and round(avg(rtt_avg_ms)) <= 10 ) THEN 0
WHEN ( round(avg(rtt_avg_ms)) > 10 and round(avg(rtt_avg_ms)) <= 20 ) THEN 1
WHEN ( round(avg(rtt_avg_ms)) > 20 and round(avg(rtt_avg_ms)) <= 30 ) THEN 2
WHEN ( round(avg(rtt_avg_ms)) > 30 ) THEN 3 END AS Tranche_de_latence
FROM mscore.mscore where operateur = 'BT_HZ' and year(jour) = 2019 group by device_dim__imei,weekofyear(jour),codeinsee
I add a new column:
ALTER TABLE mng_exp.kpi_latence_jour ADD COLUMNS (srv_id BIGINT)
At this point the data is fine and the new column srv_id is NULL.
I add data to the new column:
INSERT INTO mng_exp.KPI_LATENCE_jour (srv_id)
SELECT CAST(dng_fai_cli_eqt_iad.srv_id AS BIGINT)
FROM msf_exploratoire.dng_fai_cli_eqt_iad
INNER JOIN mng_exp.kpi_latence_jour ON (dng_fai_cli_eqt_iad.num_serie = kpi_latence_jour.imei);
Here is the problem: srv_id is fine, but the old columns have become NULL.
I get no query error, but I have lost all the old data.
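Are you sure you have lost all the old data? Or, if you run:
select * from mng_exp.KPI_LATENCE_JOUR
do you in fact see both:
- a first group of rows (the original rows, where srv_id is NULL);
- a second group of rows, where the only populated column is srv_id?
INSERT always appends new rows; it never modifies existing ones. What you want is to update SRV_ID on the first group of rows.
See the Impala documentation on the UPDATE statement for more details.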
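To check which case you are in, a query along these lines counts the two groups of rows. It is only a sketch and assumes every original row has a non-NULL IMEI, so that a NULL IMEI marks a row appended by the single-column INSERT. The second statement shows the join form of UPDATE. Impala supports UPDATE only on Kudu tables, and the table in the question does not appear to be one, so treat that part as illustrative.

```sql
-- Count original rows vs. rows appended by the single-column INSERT.
-- Assumption: original rows always have a non-NULL imei.
SELECT
  SUM(CASE WHEN imei IS NOT NULL THEN 1 ELSE 0 END) AS original_rows,
  SUM(CASE WHEN imei IS NULL     THEN 1 ELSE 0 END) AS appended_rows,
  COUNT(*)                                          AS total_rows
FROM mng_exp.kpi_latence_jour;

-- Kudu tables only: fill srv_id in place on the existing rows.
-- Assumption: mng_exp.kpi_latence_jour would have to be stored as Kudu.
UPDATE k
SET srv_id = CAST(a.srv_id AS BIGINT)
FROM mng_exp.kpi_latence_jour k
JOIN msf_exploratoire.dng_fai_cli_eqt_iad a
  ON a.num_serie = k.imei;
```

If appended_rows is greater than zero and original_rows matches the row count you had before, the old data is still there and the INSERT simply added extra rows.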
You inserted only one column, so the other columns of the new rows are NULL. Use INSERT OVERWRITE and include all the other columns:
INSERT OVERWRITE TABLE mng_exp.KPI_LATENCE_jour (CODEINSEE,IMEI, SEMAINE, MOYENNE_LATENCE,MAXIMUM_LATENCE,MINIMUM_LATENCE,TRANCHE_DE_LATENCE,srv_id)
SELECT b.CODEINSEE,
b.IMEI,
b.SEMAINE,
b.MOYENNE_LATENCE,
b.MAXIMUM_LATENCE,
b.MINIMUM_LATENCE,
b.TRANCHE_DE_LATENCE,
CAST(a.srv_id AS BIGINT) srv_id
FROM msf_exploratoire.dng_fai_cli_eqt_iad a
INNER JOIN mng_exp.kpi_latence_jour b ON (a.num_serie = b.imei)
;
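Note that the INNER JOIN above keeps only the rows whose IMEI has a match in dng_fai_cli_eqt_iad; every other row is removed by the overwrite. If unmatched rows should survive with srv_id left NULL, a LEFT JOIN variant might look like the sketch below. It assumes num_serie is unique in dng_fai_cli_eqt_iad (otherwise rows get duplicated) and that original rows always have a non-NULL IMEI.

```sql
-- Rewrite the table, keeping every original row and filling srv_id where a match exists.
INSERT OVERWRITE TABLE mng_exp.kpi_latence_jour
  (codeinsee, imei, semaine, moyenne_latence, maximum_latence,
   minimum_latence, tranche_de_latence, srv_id)
SELECT b.codeinsee,
       b.imei,
       b.semaine,
       b.moyenne_latence,
       b.maximum_latence,
       b.minimum_latence,
       b.tranche_de_latence,
       CAST(a.srv_id AS BIGINT) AS srv_id   -- NULL when no match is found
FROM mng_exp.kpi_latence_jour b
LEFT JOIN msf_exploratoire.dng_fai_cli_eqt_iad a
  ON a.num_serie = b.imei
-- Drop the rows appended by the earlier single-column INSERT (assumed to have a NULL imei).
WHERE b.imei IS NOT NULL;
```

It may be worth running the SELECT on its own first and checking the row count, since the statement rewrites the table it reads from.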