如何向 ClickHouse 中的可执行 UDF 发送多个参数?

How to send multiple arguments to Executable UDF in ClickHouse?

我有一个 python 脚本输出输入:

#!/usr/bin/python3

import sys

if __name__ == '__main__':
    i = 0
    for line in sys.stdin:
        print(i, line, end='')
        sys.stdout.flush()
        i += 1

此脚本使用此配置连接到 ClickHouse:

<functions>
    <function>
        <type>executable</type>
        <name>test_function_python</name>
        <return_type>String</return_type>
        <argument><type>Int64</type></argument>
        <format>TabSeparated</format>
        <command>test_function.py</command>
        <execute_direct>1</execute_direct>
    </function>
</functions>

从 ClickHouse 调用脚本:

SELECT test_function_python(number) AS x        
FROM numbers(5)                                 
                                                
                                                
┌─x───┐                                         
│ 0 0 │                                         
│ 1 1 │                                         
│ 2 2 │                                         
│ 3 3 │                                         
│ 4 4 │                                         
└─────┘                                         

到目前为止一切顺利,但我想像这样向 UDF 发送多个参数:

SELECT test_function_python(number, number + 3) AS x        
FROM numbers(5)                                 

那么如何从 Python 代码中获取两个参数??

在函数配置中使用多个 <argument> 标签(python_function.xml in /etc/clickhouse-server:

<functions>                                     
    <function>                                  
        <type>executable</type>                 
        <name>test_function_python</name>       
        <return_type>String</return_type>       
        <argument><type>String</type></argument>
        <argument><type>String</type></argument>
        <format>TabSeparated</format>           
        <command>test_function.py</command>     
        <execute_direct>1</execute_direct>      
    </function>                                 
</functions>                                    

然后将两个参数传递给python函数:

SELECT test_function_python(number, number + 1) AS x   
FROM numbers(10)                                       
                                                       
Query id: 562676d4-e7fb-4aec-86b4-fef41fec4864         
                                                       
┌─x─────────────────┐                                  
│ 0: arg1=0 arg2=1  │                                  
│ 1: arg1=1 arg2=2  │                                  
│ 2: arg1=2 arg2=3  │                                  
│ 3: arg1=3 arg2=4  │                                  
│ 4: arg1=4 arg2=5  │                                  
│ 5: arg1=5 arg2=6  │                                  
│ 6: arg1=6 arg2=7  │                                  
│ 7: arg1=7 arg2=8  │                                  
│ 8: arg1=8 arg2=9  │                                  
│ 9: arg1=9 arg2=10 │                                  
└───────────────────┘                                  

Python代码(test_function.py):

#!/usr/bin/python3

import sys

if __name__ == '__main__':
    i = 0
    for line in sys.stdin:
        arg1, arg2 = line.split('\t')
        print(f'{i}: arg1={arg1} arg2={arg2}', end='')
        sys.stdout.flush()
        i += 1