使用字典的 Clickhouse 查询
Clickhouse query with dictionary
我从这里导入了准时航空公司的数据库https://clickhouse.com/docs/en/getting-started/example-datasets/ontime/
然后我创建了一个字典,将 2 位飞机代码映射到公司名称,如下所示:
id,code,company
1,UA,United Airlines
2,HA,Hawaiian Airlines
3,OO,SkyWest
4,B6,Jetblue Airway
5,QX,Horizon Air
6,YX,Republic Airway
7,G4,Allegiant Air
...
..
我使用这个查询来生成它,它似乎有效:
CREATE DICTIONARY airlinecompany
(
id UInt64,
code String,
company String
)
PRIMARY KEY id
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSVWithNames'))
LAYOUT(FLAT())
LIFETIME(3600)
在主数据库中(ontime)看起来像这样:
SELECT Reporting_Airline AS R_air
FROM ontime
GROUP BY R_air
LIMIT 4
┌─R_air─┐
│ UA │
│ HA │
│ OO │
│ B6 │
└───────┘
我想要做的是有一个使用 R_air 的 2 代码值的 table,然后根据 airlinecompany
字典检查它以创建一个映射,即
R_Air Company
UA | United Airlines
HA | Hawaiian Airlines
00 | SkyWest
...
..
但我似乎无法正确形成此查询:
SELECT
Reporting_Airline AS R_Air,
dictGetString('airlinecompany', 'company', R_Air) AS company
FROM ontime
GROUP BY R_Air
Received exception from server (version 22.3.3):
Code: 6. DB::Exception: Received from localhost:9000. DB::Exception: Cannot parse string 'UA' as UInt64: syntax error at begin of string. Note: there are toUInt64OrZero and toUInt64OrNull functions, which returns zero/NULL instead of throwing exception.: while executing 'FUNCTION dictGetString('airlinecompany' :: 1, 'company' :: 2, Reporting_Airline :: 0) -> dictGetString('airlinecompany', 'company', Reporting_Airline) String : 4'. (CANNOT_PARSE_TEXT)
我错过了什么?我不知道为什么它认为 UA 是一个 UInt64
布局 = COMPLEX_KEY_HASHED
CREATE DICTIONARY airlinecompany
(
id UInt64,
code String,
company String
)
PRIMARY KEY code
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSVWithNames'))
LAYOUT(COMPLEX_KEY_HASHED())
LIFETIME(3600)
SELECT dictGet('airlinecompany', 'company', tuple('UA'))
┌─dictGet('airlinecompany', 'company', tuple('UA'))─┐
│ United Airlines │
└───────────────────────────────────────────────────┘
SELECT Reporting_Airline AS R_air,
dictGetString('airlinecompany', 'company', tuple(R_Air)) AS company
FROM ontime
LIMIT 4;
┌─R_Air─┬─company───────────┐
│ B6 │ Jetblue Airway │
│ G4 │ Allegiant Air │
│ HA │ Hawaiian Airlines │
│ OO │ SkyWest │
└───────┴───────────────────┘
我从这里导入了准时航空公司的数据库https://clickhouse.com/docs/en/getting-started/example-datasets/ontime/
然后我创建了一个字典,将 2 位飞机代码映射到公司名称,如下所示:
id,code,company
1,UA,United Airlines
2,HA,Hawaiian Airlines
3,OO,SkyWest
4,B6,Jetblue Airway
5,QX,Horizon Air
6,YX,Republic Airway
7,G4,Allegiant Air
...
..
我使用这个查询来生成它,它似乎有效:
CREATE DICTIONARY airlinecompany
(
id UInt64,
code String,
company String
)
PRIMARY KEY id
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSVWithNames'))
LAYOUT(FLAT())
LIFETIME(3600)
在主数据库中(ontime)看起来像这样:
SELECT Reporting_Airline AS R_air
FROM ontime
GROUP BY R_air
LIMIT 4
┌─R_air─┐
│ UA │
│ HA │
│ OO │
│ B6 │
└───────┘
我想要做的是有一个使用 R_air 的 2 代码值的 table,然后根据 airlinecompany
字典检查它以创建一个映射,即
R_Air Company
UA | United Airlines
HA | Hawaiian Airlines
00 | SkyWest
...
..
但我似乎无法正确形成此查询:
SELECT
Reporting_Airline AS R_Air,
dictGetString('airlinecompany', 'company', R_Air) AS company
FROM ontime
GROUP BY R_Air
Received exception from server (version 22.3.3):
Code: 6. DB::Exception: Received from localhost:9000. DB::Exception: Cannot parse string 'UA' as UInt64: syntax error at begin of string. Note: there are toUInt64OrZero and toUInt64OrNull functions, which returns zero/NULL instead of throwing exception.: while executing 'FUNCTION dictGetString('airlinecompany' :: 1, 'company' :: 2, Reporting_Airline :: 0) -> dictGetString('airlinecompany', 'company', Reporting_Airline) String : 4'. (CANNOT_PARSE_TEXT)
我错过了什么?我不知道为什么它认为 UA 是一个 UInt64
布局 = COMPLEX_KEY_HASHED
CREATE DICTIONARY airlinecompany
(
id UInt64,
code String,
company String
)
PRIMARY KEY code
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSVWithNames'))
LAYOUT(COMPLEX_KEY_HASHED())
LIFETIME(3600)
SELECT dictGet('airlinecompany', 'company', tuple('UA'))
┌─dictGet('airlinecompany', 'company', tuple('UA'))─┐
│ United Airlines │
└───────────────────────────────────────────────────┘
SELECT Reporting_Airline AS R_air,
dictGetString('airlinecompany', 'company', tuple(R_Air)) AS company
FROM ontime
LIMIT 4;
┌─R_Air─┬─company───────────┐
│ B6 │ Jetblue Airway │
│ G4 │ Allegiant Air │
│ HA │ Hawaiian Airlines │
│ OO │ SkyWest │
└───────┴───────────────────┘