KUDU for JDBC replication purposes, but not for Off-loaded Analytics

Quoting the official Apache KUDU documentation at https://kudu.apache.org/overview.html:

Kudu isn't designed to be an OLTP system, but if you have some subset of data which fits in memory, it offers competitive random access performance. We've measured 99th percentile latencies of 6ms or below using YCSB with a uniform random access workload over a billion rows. Being able to run low-latency online workloads on the same storage as back-end data analytics can dramatically simplify application architecture.

Does this statement imply that KUDU can be used as a target for replication from a JDBC source, in the simplest possible form?

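Elsewhere, I have used KUDU as the replication target for SAP and other COTS systems, so that reports could run against KUDU tables instead of HANA. That architecture was decided by someone else.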

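For pure replication of data, primarily for subsequent extraction from the data lake, and for data with embellished history of less than 1 TB in size, this is also feasible. Cloudera confirmed this after discussion. Even though KUDU has a columnar format and a row format would be preferable here, it works fine as well.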
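
To make "pure replication" concrete, here is a minimal sketch of an incremental pull with upsert-by-primary-key, which is roughly the pattern such a setup relies on. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for the JDBC source, a plain dict stands in for the KUDU table, and the `orders` table, its columns, and the `replicate` function are all hypothetical names, not anything from KUDU or Cloudera.

```python
import sqlite3


def replicate(source, target, watermark):
    """Copy rows changed since `watermark` into `target`, keyed by primary key.

    `source` is a DB-API connection (stand-in for the JDBC source);
    `target` is a dict (stand-in for a table that supports upserts).
    Returns the new watermark to use for the next run.
    """
    rows = source.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    for row_id, status, updated_at in rows:
        # Upsert: insert the row if the key is new, overwrite it otherwise.
        target[row_id] = {"status": status, "updated_at": updated_at}
        watermark = max(watermark, updated_at)
    return watermark


def demo():
    # Hypothetical source table with a monotonically increasing change marker.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)", [(1, "new", 10), (2, "new", 11)]
    )

    target = {}
    watermark = replicate(source, target, 0)  # initial load

    # A later change in the source is picked up by the next run only.
    source.execute("UPDATE orders SET status = 'shipped', updated_at = 12 WHERE id = 1")
    watermark = replicate(source, target, watermark)
    return target, watermark
```

Calling `demo()` runs an initial load followed by one incremental run. Note that this sketch keeps only the latest version of each row; retaining history would need the change marker to be part of the key, and deletes in the source are not handled at all.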