Is the Hadoop framework really not suitable for real-time operation?
I read in a blog:

Hadoop is batch-processing centric, ideal for the discovery, exploration and analysis of large amounts of multi-structured data that doesn't fit nicely into tables, and not suitable for real-time operations.

So, could anyone help me with a better explanation of this, such as why it is not suitable for real-time operations? Thanks.
Hadoop MapReduce is not suited to real-time processing.

But things are changing now: Storm and Spark, for example, offer near-real-time processing.
Spark uses in-memory computation for faster processing; its in-memory abstraction is the RDD (Resilient Distributed Dataset).
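As a rough illustration of the idea (plain Python, not the actual Spark API — `ToyRDD` is an invented stand-in), an RDD-style pipeline keeps every intermediate dataset in memory and chains transformations, instead of writing each stage back to disk the way MapReduce does:

```python
from functools import reduce as _reduce

# Toy sketch of RDD-style in-memory transformations.
# This is plain Python, NOT the Spark API; ToyRDD is invented here
# purely to illustrate chaining map/filter/reduce over in-memory data.

class ToyRDD:
    def __init__(self, data):
        self.data = list(data)  # dataset held in memory

    def map(self, fn):
        return ToyRDD(fn(x) for x in self.data)

    def filter(self, pred):
        return ToyRDD(x for x in self.data if pred(x))

    def reduce(self, fn):
        return _reduce(fn, self.data)

events = ToyRDD([1, 2, 3, 4, 5])
total = (events.map(lambda x: x * x)       # squares stay in memory
               .filter(lambda x: x > 4)    # no disk round-trip between stages
               .reduce(lambda a, b: a + b))
print(total)  # 9 + 16 + 25 = 50
```

Because intermediate results never hit disk, a chain of transformations like this runs far faster than the equivalent sequence of MapReduce jobs.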
Storm uses a DAG of spouts (sources) and bolts (sinks). This is called a topology, and a topology keeps running: it takes data from spouts and feeds it to bolts. Bolts can write the data to a database or make it available to users. This reduces processing time.
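The spout-to-bolt flow can be sketched roughly like this (plain Python, not the Storm API — `Spout`, `Bolt`, and `run_topology` are invented stand-ins): tuples stream out of a source and are processed one at a time as they arrive, rather than waiting for a batch job.

```python
# Toy sketch of the spout -> bolt flow in a Storm-like topology.
# Plain Python, NOT the Storm API; Spout and Bolt here are invented
# stand-ins showing tuples streaming from a source through stages.

class Spout:
    """Emits tuples one at a time (a source)."""
    def __init__(self, records):
        self.records = records

    def next_tuple(self):
        yield from self.records

class Bolt:
    """Processes each tuple as it arrives (a processing stage / sink)."""
    def __init__(self, fn):
        self.fn = fn

    def execute(self, tup):
        return self.fn(tup)

def run_topology(spout, bolts):
    out = []
    for tup in spout.next_tuple():   # tuples flow through continuously
        for bolt in bolts:           # each bolt transforms the tuple
            tup = bolt.execute(tup)
        out.append(tup)              # e.g. write to a database
    return out

spout = Spout(["click", "view", "click"])
count_chars = Bolt(lambda t: (t, len(t)))
results = run_topology(spout, [count_chars])
print(results)  # [('click', 5), ('view', 4), ('click', 5)]
```

In real Storm the topology never terminates and bolts run in parallel across a cluster; the point here is only that each tuple is processed the moment it arrives.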
For real-time access, you have HBase, which is part of the Hadoop ecosystem:
Apache HBase is the Hadoop database, a distributed, scalable, big data store.

When Would I Use Apache HBase?

Use Apache HBase when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
Features
- Linear and modular scalability.
- Strictly consistent reads and writes.
- Automatic and configurable sharding of tables
- Automatic failover support between RegionServers.
- Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
- Easy to use Java API for client access.
- Block cache and Bloom Filters for real-time queries.
- Query predicate push down via server side Filters
- Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options
- Extensible jruby-based (JIRB) shell
- Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
It also supports atomic counters, one of HBase's strongest points. With careful, planned row-key and schema design, counters can help you reduce the need for large analytics jobs.
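To see why atomic counters reduce the need for batch analytics, here is a toy sketch (plain Python with a lock, not the HBase API — HBase exposes the same idea through its counter-increment operation): totals are maintained atomically at write time, so reading one is a single lookup instead of a scan-and-aggregate job run later.

```python
import threading

# Toy sketch of write-time aggregation with atomic counters.
# Plain Python, NOT the HBase API; CounterStore is an invented stand-in.
# The point: totals are updated as events arrive, so no batch job is
# needed afterwards to compute them.

class CounterStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {}

    def increment(self, row_key, amount=1):
        # Atomic read-modify-write, like an HBase counter column.
        with self._lock:
            self._counters[row_key] = self._counters.get(row_key, 0) + amount
            return self._counters[row_key]

    def get(self, row_key):
        # O(1) lookup -- no scan over raw events required.
        return self._counters.get(row_key, 0)

store = CounterStore()
for event in ["page:home", "page:about", "page:home"]:
    store.increment(event)       # aggregate at write time

print(store.get("page:home"))    # 2
```

The row-key design matters because it determines what you can count cheaply: if the key encodes, say, page and day, per-page-per-day totals are available instantly without any MapReduce pass.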