postgresql - Inner JOIN is not working correctly
I have two tables that look like this:
I'm trying to INNER JOIN these two tables so that I get a result like this:
time | block_height | differential_pressure |
---------------------+--------------+-----------------------+
2018-09-08 11:14:10 | 83.7 | 286.84 |
2018-09-08 11:14:10 | 83.6 | 282.14 |
2018-09-08 11:14:11 | 83.4 | 298.35 |
2018-09-08 11:14:12 | 83.1 | 298.23 |
2018-09-08 11:14:12 | 82.9 | 294.76 |
2018-09-08 11:14:13 | 82.7 | 288.37 |
But when I run the following query:
SELECT * FROM rt_block_height
INNER JOIN rt_differential_pressure
ON rt_block_height.time = rt_differential_pressure.time;
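This is what I get:
I don't understand what is going on. It looks as if some random extra rows were added, and I don't know why. The original tables have only 6 rows, but the query returns 10.
I don't know whether this information helps, but this is a TimescaleDB hypertable. Here is the source code that creates the table: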
CREATE TABLE IF NOT EXISTS public.rt_BLOCK_HEIGHT
(
"time" timestamp without time zone,
BLOCK_HEIGHT double precision
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public.rt_BLOCK_HEIGHT
OWNER to postgres;
SELECT create_hypertable('rt_BLOCK_HEIGHT', 'time');
Your time column is not unique.
For the timestamp 2018-09-08 11:14:10 you have:
block_heightA = 83.7
block_heightB = 83.6
differential_pressureA = 286.84
differential_pressureB = 282.14
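To see which timestamps repeat, a check along these lines may help. It is only a sketch against the `rt_block_height` table from the question; the alias `rows_per_timestamp` is made up for illustration, and the same query can be run against `rt_differential_pressure`.

```sql
-- List every timestamp that occurs more than once in rt_block_height.
SELECT "time", count(*) AS rows_per_timestamp
FROM rt_block_height
GROUP BY "time"
HAVING count(*) > 1
ORDER BY "time";
```

Any timestamp returned here will multiply rows when it is joined against a table that also repeats it.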
So when you join, you get the Cartesian product of the two 2-element sets:
2018-09-08 11:14:10 block_heightA differential_pressureA
2018-09-08 11:14:10 block_heightA differential_pressureB
2018-09-08 11:14:10 block_heightB differential_pressureA
2018-09-08 11:14:10 block_heightB differential_pressureB
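The row count can be reproduced without touching the real tables. The sketch below assumes the two tables contain exactly the values shown in the desired result (the original table contents are not visible in the question), and uses throwaway CTE names `bh` and `dp`.

```sql
-- Inline copies of the two tables, assuming they hold the values
-- from the desired result in the question.
WITH bh("time", block_height) AS (
    VALUES
        (timestamp '2018-09-08 11:14:10', 83.7),
        (timestamp '2018-09-08 11:14:10', 83.6),
        (timestamp '2018-09-08 11:14:11', 83.4),
        (timestamp '2018-09-08 11:14:12', 83.1),
        (timestamp '2018-09-08 11:14:12', 82.9),
        (timestamp '2018-09-08 11:14:13', 82.7)
),
dp("time", differential_pressure) AS (
    VALUES
        (timestamp '2018-09-08 11:14:10', 286.84),
        (timestamp '2018-09-08 11:14:10', 282.14),
        (timestamp '2018-09-08 11:14:11', 298.35),
        (timestamp '2018-09-08 11:14:12', 298.23),
        (timestamp '2018-09-08 11:14:12', 294.76),
        (timestamp '2018-09-08 11:14:13', 288.37)
)
-- Count how many joined rows each timestamp produces.
SELECT bh."time", count(*) AS joined_rows
FROM bh
INNER JOIN dp ON bh."time" = dp."time"
GROUP BY bh."time"
ORDER BY bh."time";
```

With that data the counts per timestamp are 4, 1, 4 and 1: each duplicated timestamp contributes 2 × 2 rows and each unique one contributes 1, which adds up to the 10 rows you are seeing.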
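To get the result you want, you have to decide how to handle the duplicate values for each timestamp. For example, you could take the average: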
SELECT
    grouped_block_height.time,
    avg_block_height,
    avg_differential_pressure
FROM (
    SELECT time, avg(block_height) AS avg_block_height
    FROM rt_block_height
    GROUP BY time
) AS grouped_block_height
INNER JOIN (
    SELECT time, avg(differential_pressure) AS avg_differential_pressure
    FROM rt_differential_pressure
    GROUP BY time
) AS grouped_differential_pressure
    ON grouped_block_height.time = grouped_differential_pressure.time;
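If you would rather keep one output row per reading instead of averaging, another option is to number the rows within each timestamp and join on that number as well. This is only a sketch: the tables have no column that records which block height belongs to which pressure reading, so the `ORDER BY` inside each window is an assumption. Sorting by value descending happens to match the pairing in your desired result, but that may be a coincidence of the sample data. The aliases `b`, `d` and `rn` are made up for illustration.

```sql
-- Pair the n-th row of each timestamp in one table with the n-th row
-- of the same timestamp in the other table.
SELECT b."time", b.block_height, d.differential_pressure
FROM (
    SELECT "time", block_height,
           -- Assumed ordering: the table has no column that defines row order.
           row_number() OVER (PARTITION BY "time" ORDER BY block_height DESC) AS rn
    FROM rt_block_height
) AS b
INNER JOIN (
    SELECT "time", differential_pressure,
           -- Assumed ordering, see the note above.
           row_number() OVER (PARTITION BY "time" ORDER BY differential_pressure DESC) AS rn
    FROM rt_differential_pressure
) AS d
    ON b."time" = d."time" AND b.rn = d.rn
ORDER BY b."time", b.rn;
```

A more reliable fix would be to store something that identifies each reading, such as a sub-second timestamp or a sequence number, and join on that.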