How to send multiple arguments to an executable UDF in ClickHouse?
I have a Python script that echoes its input, prefixed with a row counter:
#!/usr/bin/python3
import sys

if __name__ == '__main__':
    i = 0
    for line in sys.stdin:
        print(i, line, end='')
        sys.stdout.flush()
        i += 1
The script is registered in ClickHouse with this configuration:
<functions>
    <function>
        <type>executable</type>
        <name>test_function_python</name>
        <return_type>String</return_type>
        <argument><type>Int64</type></argument>
        <format>TabSeparated</format>
        <command>test_function.py</command>
        <execute_direct>1</execute_direct>
    </function>
</functions>
Calling the script from ClickHouse:
SELECT test_function_python(number) AS x
FROM numbers(5)
┌─x───┐
│ 0 0 │
│ 1 1 │
│ 2 2 │
│ 3 3 │
│ 4 4 │
└─────┘
So far so good, but I would like to send multiple arguments to the UDF, like this:
SELECT test_function_python(number, number + 3) AS x
FROM numbers(5)
How do I read the two arguments in the Python code?
Use multiple <argument> tags in the function configuration (python_function.xml in /etc/clickhouse-server):
<functions>
    <function>
        <type>executable</type>
        <name>test_function_python</name>
        <return_type>String</return_type>
        <argument><type>String</type></argument>
        <argument><type>String</type></argument>
        <format>TabSeparated</format>
        <command>test_function.py</command>
        <execute_direct>1</execute_direct>
    </function>
</functions>
Then pass two arguments to the Python function:
SELECT test_function_python(number, number + 1) AS x
FROM numbers(10)
Query id: 562676d4-e7fb-4aec-86b4-fef41fec4864
┌─x─────────────────┐
│ 0: arg1=0 arg2=1 │
│ 1: arg1=1 arg2=2 │
│ 2: arg1=2 arg2=3 │
│ 3: arg1=3 arg2=4 │
│ 4: arg1=4 arg2=5 │
│ 5: arg1=5 arg2=6 │
│ 6: arg1=6 arg2=7 │
│ 7: arg1=7 arg2=8 │
│ 8: arg1=8 arg2=9 │
│ 9: arg1=9 arg2=10 │
└───────────────────┘
Python code (test_function.py):
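#!/usr/bin/python3
import sys

if __name__ == '__main__':
    i = 0
    for line in sys.stdin:
        # With TabSeparated, each input row carries the arguments separated by tabs.
        arg1, arg2 = line.split('\t')
        # arg2 still ends with the row's newline, so end='' avoids printing a second one.
        print(f'{i}: arg1={arg1} arg2={arg2}', end='')
        sys.stdout.flush()
        i += 1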
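The script above depends on arg2 keeping the trailing newline of each row. A minimal sketch of a variant that strips the newline first and accepts any number of arguments is shown here. The helper names parse_row, format_row, and main are made up for illustration and are not part of ClickHouse; the sketch assumes the TabSeparated format from the configuration above, where each row is one line and, as far as I know, tabs inside values arrive escaped, so splitting on a literal tab is safe.

```python
#!/usr/bin/python3
import sys


def parse_row(line):
    """Split one TabSeparated input row into its argument fields."""
    # Strip only the row terminator so that empty trailing fields survive.
    return line.rstrip('\n').split('\t')


def format_row(index, args):
    """Build one output row, for example '0: arg1=0 arg2=1'."""
    labelled = ' '.join(f'arg{n}={value}' for n, value in enumerate(args, start=1))
    return f'{index}: {labelled}'


def main(stdin, stdout):
    """Read rows from stdin and write exactly one result row per input row."""
    for i, line in enumerate(stdin):
        stdout.write(format_row(i, parse_row(line)) + '\n')
        # Flush after every row so ClickHouse is not left waiting on a buffer.
        stdout.flush()


# Hypothetical entry point: in test_function.py, end the file with
#     main(sys.stdin, sys.stdout)
```

Passing the streams to main explicitly makes it possible to try the script with in-memory streams before deploying it. The number of <argument> tags in the XML still has to match the number of arguments used in the query.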