Read binary file with PostgreSQL without pg_read_file

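I need to solve the following problem using plain PostgreSQL 9.4:

Edit:

The file is outside the cluster path, so the normal functions raise:
SQL Error: ERROR: absolute path not allowed

After a lot of research, I came up with the following function: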

  CREATE OR REPLACE FUNCTION file_read(file text)
  RETURNS bytea AS $$
    DECLARE
      content bytea;  -- bytea, matching the return type (was text)
      tmp text;
    BEGIN
      file := quote_literal(file);
      tmp := quote_ident(md5(random()::text));

      -- create tmp table using random name
      EXECUTE 'CREATE TEMP TABLE ' || tmp || ' (id oid, file_name text, content bytea)';

      -- add given filename
      EXECUTE 'INSERT INTO ' || tmp || ' (file_name) VALUES (' || file || ')';

      -- add the document to large object storage and return the link id
      BEGIN
        EXECUTE 'UPDATE ' || tmp || ' SET id = lo_import(file_name)';
      EXCEPTION WHEN OTHERS THEN
        -- import failed: drop the tmp table so it does not linger in the session
        EXECUTE 'DROP TABLE ' || tmp;
        RETURN NULL;
      END;

      -- pull document from large object storage
      EXECUTE 'UPDATE ' || tmp || ' SET content = lo_get(id)';

      -- delete the file from large object storage
      EXECUTE 'SELECT lo_unlink(id) FROM ' || tmp;

      -- save data to content variable
      EXECUTE 'SELECT content FROM ' || tmp INTO content;

      -- drop tmp table
      EXECUTE 'DROP TABLE ' || tmp;

      -- return the file contents
      RETURN content;
    END;
  $$ LANGUAGE plpgsql VOLATILE;

Example use cases:

Read from a file
select file_read(concat('/tmp/', '28528026bc302546d17ce7e82400ab7e.zip'));

Update a column
update custom_table set content = file_read(filename);

Here is a simple function, get_file_contents(filename text) returns bytea:

create or replace function get_file_contents(filename text) returns bytea as
$fn$
 declare 
    lo_oid oid;
    retval bytea;
 begin
    lo_oid := lo_import(filename);
    retval := lo_get(lo_oid);
    perform lo_unlink(lo_oid);
    return retval;
 end;
$fn$ language plpgsql;
  • Simple usage
-- Read the great work of Sun Tzu
select get_file_contents('/media/data/ForeignData/The Art Of War.pdf');

-- Insert into a table, update a table
insert into mytable (mycolumn[,<others>]) values (get_file_contents(myfilename)[,<others>]);
update mytable set mycolumn = get_file_contents(myfilename) where <whatever there>;
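
The question's function returns NULL when the file cannot be imported. A minimal sketch of the same behavior on top of get_file_contents, assuming that is what you want; the name get_file_contents_or_null is hypothetical. Because large object operations are transactional, an error inside the block also rolls back the lo_import, so nothing should be left behind in large object storage. Note that server-side lo_import itself is, as far as I know, restricted to superusers in 9.4.

```sql
-- Hypothetical variant: returns NULL instead of raising when the file cannot be read.
create or replace function get_file_contents_or_null(filename text) returns bytea as
$fn$
 declare
    lo_oid oid;
    retval bytea;
 begin
    lo_oid := lo_import(filename);   -- load the server-side file as a large object
    retval := lo_get(lo_oid);        -- fetch its full contents as bytea
    perform lo_unlink(lo_oid);       -- remove the temporary large object
    return retval;
 exception when others then
    -- any error undoes the work done in this block, including the import
    return null;
 end;
$fn$ language plpgsql;
```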

Use the built-in function pg_read_binary_file(). It has been available since Postgres 9.1 and does exactly what you need. The manual:

Returns all or part of a file. This function is identical to pg_read_file except that it can read arbitrary binary data, returning the result as bytea not text; accordingly, no encoding checks are performed.

This function is restricted to superusers by default, but other users can be granted EXECUTE to run the function.

So...

Read a zipped file from server into a bytea column

UPDATE custom_table
SET    content = pg_read_binary_file('/tmp/28528026bc302546d17ce7e82400ab7e.zip')
WHERE  id = 123;

Much faster than any workaround.
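
The manual text quoted above mentions "all or part of a file". A minimal sketch of reading only a part, using the optional offset and length arguments; the path 'import/archive.zip' is a hypothetical file relative to the data directory:

```sql
-- Assumption: 'import/archive.zip' is a hypothetical path relative to the data directory.
-- Read only the first 4 bytes (offset 0, length 4) and compare them
-- with the ZIP local file header signature (the bytes "PK" 0x03 0x04).
SELECT pg_read_binary_file('import/archive.zip', 0, 4) = '\x504b0304'::bytea AS looks_like_zip;
```

This can serve as a cheap sanity check before loading a large file into a bytea column.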

Note this restriction, quoting the manual:

Only files within the database cluster directory and the log_directory can be accessed. Use a relative path for files in the cluster directory, and a path matching the log_directory configuration setting for log files.
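
On newer releases the superuser and path restrictions quoted above are handled through privileges. This does not apply to 9.4. A sketch, assuming PostgreSQL 11 or later (to my knowledge the first release with grantable file access functions and the pg_read_server_files role), with file_reader as a hypothetical existing role:

```sql
-- Assumption: PostgreSQL 11 or later; "file_reader" is a hypothetical, already existing role.

-- Allow the role to call the one-argument form of the function.
GRANT EXECUTE ON FUNCTION pg_read_binary_file(text) TO file_reader;

-- Membership in this predefined role lifts the restriction to the
-- cluster directory and log_directory, so absolute paths are accepted.
GRANT pg_read_server_files TO file_reader;
```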

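You can overcome the path restriction with a symlink from the db directory to any other directory. Be wary of possible security implications, though. See: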

  • Read data from a text file inside a trigger
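
A minimal sketch of the symlink approach, assuming a link named ext_files (a hypothetical name) is created inside the data directory and points to /tmp. The shell step is shown as a comment and assumes $PGDATA is set to the data directory:

```sql
-- Assumption: run once in a shell as the OS user that owns the cluster;
-- "ext_files" is a hypothetical link name:
--   ln -s /tmp "$PGDATA/ext_files"

-- Shows the data directory that relative paths are resolved against.
SHOW data_directory;

-- Relative path through the symlink instead of the absolute /tmp/... path.
UPDATE custom_table
SET    content = pg_read_binary_file('ext_files/28528026bc302546d17ce7e82400ab7e.zip')
WHERE  id = 123;
```

Anything reachable through the link becomes readable by whoever may call the function, which is the security implication mentioned above.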

Also, consider upgrading to a current version of Postgres. Postgres 9.4 reached EOL on February 13, 2020.