Lagom service consuming input from Kafka

I'm trying to figure out how to use Lagom to consume data from an external system that communicates over Kafka.

I have run across this section of Lagom documentation, which describes how a Lagom service communicates with another Lagom service by subscribing to its topic:

helloService
  .greetingsTopic()
  .subscribe // <-- you get back a Subscriber instance
  .atLeastOnce(
    Flow.fromFunction(doSomethingWithTheMessage)
  )

However, what is the proper setup when you want to subscribe to a Kafka topic that contains events produced by some random, external system?

Is some kind of adapter needed for this functionality? To clarify, here is what I have at the moment:

object Aggregator {
  val TOPIC_NAME = "my-aggregation"
}

trait Aggregator extends Service {
  def aggregate(correlationId: String): ServiceCall[Data, Done]

  def aggregationTopic(): Topic[DataRecorded]

  override final def descriptor: Descriptor = {
    import Service._

    named("aggregator")
      .withCalls(
        pathCall("/api/aggregate/:correlationId", aggregate _)
      )
      .withTopics(
        topic(Aggregator.TOPIC_NAME, aggregationTopic())
          .addProperty(
            KafkaProperties.partitionKeyStrategy,
            PartitionKeyStrategy[DataRecorded](_.sessionData.correlationId)
          )
      )
      .withAutoAcl(true)
  }
}

I can invoke it via a simple POST request. However, I would like it to be invoked by consuming Data messages from some (external) Kafka topic.
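
For context, a minimal sketch of how the aggregate call might be implemented on the impl side (hypothetical; the Entity class and RecordData command mirror the names that appear later in this question):

class AggregatorImpl(persistentEntityRegistry: PersistentEntityRegistry) extends Aggregator {

  override def aggregate(correlationId: String): ServiceCall[Data, Done] =
    ServiceCall { data =>
      // Forward the payload to the entity identified by the correlation id
      persistentEntityRegistry
        .refFor[Entity](correlationId)
        .ask(RecordData(data))
    }

  override def aggregationTopic(): Topic[DataRecorded] =
    ??? // TopicProducer-based implementation omitted here
}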

I am wondering if there is a way to configure the descriptor in a fashion similar to this mockup:

override final def descriptor: Descriptor = {
  ...
  kafkaTopic("my-input-topic")
    .subscribe(serviceCall(aggregate _))
    .withAtMostOnceDelivery
}

I have also run across this discussion on Google Groups, but in the OP's question I don't see him actually doing anything with the EventMessages coming from some-topic other than routing them to the topic defined by his service.

EDIT #1: Progress update

After looking at the documentation, I decided to try the following approach. I added two more modules, aggregator-kafka-proxy-api and aggregator-kafka-proxy-impl.

In the new api module, I defined a new service with no methods, just a topic representing my Kafka topic:

object DataKafkaPublisher {
  val TOPIC_NAME = "data-in"
}

trait DataKafkaPublisher extends Service {
  def dataInTopic: Topic[DataPublished]

  override final def descriptor: Descriptor = {
    import Service._
    import DataKafkaPublisher._

    named("data-kafka-in")
      .withTopics(
        topic(TOPIC_NAME, dataInTopic)
          .addProperty(
            KafkaProperties.partitionKeyStrategy,
            PartitionKeyStrategy[DataPublished](_.data.correlationId)
          )
      )
      .withAutoAcl(true)
  }
}

In the impl module, I simply wrote the standard implementation:

class DataKafkaPublisherImpl(persistentEntityRegistry: PersistentEntityRegistry) extends DataKafkaPublisher {
  override def dataInTopic: Topic[api.DataPublished] =
    TopicProducer.singleStreamWithOffset {
      fromOffset =>
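        // Stream persisted events tagged KafkaDataEvent.Tag, starting at fromOffset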
        persistentEntityRegistry.eventStream(KafkaDataEvent.Tag, fromOffset)
          .map(ev => (convertEvent(ev), ev.offset))
    }

  private def convertEvent(evt: EventStreamElement[KafkaDataEvent]): api.DataPublished = {
    evt.event match {
      case DataPublished(data) => api.DataPublished(data)
    }
  }
}

Now, to actually consume these events, in my aggregator-impl module I added a "subscriber" service which takes these events and invokes the appropriate commands on the entity:

class DataKafkaSubscriber(persistentEntityRegistry: PersistentEntityRegistry, kafkaPublisher: DataKafkaPublisher) {

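  // At-least-once delivery: the offset is committed only after the entity has
  // handled the command; mapAsync(1) preserves per-partition ordering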
  kafkaPublisher.dataInTopic.subscribe.atLeastOnce(
    Flow[DataPublished].mapAsync(1) { sd =>
      sessionRef(sd.data.correlationId).ask(RecordData(sd.data))
    }
  )

  private def sessionRef(correlationId: String) =
    persistentEntityRegistry.refFor[Entity](correlationId)
}

This effectively allows me to publish a message on the Kafka topic "data-in", which is then proxied and converted to a RecordData command before being sent to the entity for consumption.
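
To make this run, the subscriber has to be instantiated eagerly when the service starts. Here is a minimal wiring sketch, assuming Lagom's standard macwire-based loader and Cassandra persistence (the application class name is made up; swap in whichever persistence components you already use):

import com.lightbend.lagom.scaladsl.broker.kafka.LagomKafkaComponents
import com.lightbend.lagom.scaladsl.persistence.cassandra.CassandraPersistenceComponents
import com.lightbend.lagom.scaladsl.server.{LagomApplication, LagomApplicationContext}
import com.softwaremill.macwire.wire
import play.api.libs.ws.ahc.AhcWSComponents

abstract class AggregatorApplication(context: LagomApplicationContext)
  extends LagomApplication(context)
    with LagomKafkaComponents // Kafka-backed implementation of the Broker API
    with CassandraPersistenceComponents
    with AhcWSComponents {

  // Client for the proxy service whose topic mirrors the external Kafka topic
  lazy val dataKafkaPublisher: DataKafkaPublisher =
    serviceClient.implement[DataKafkaPublisher]

  // A plain val, so the subscription is started as soon as the service boots
  val dataKafkaSubscriber: DataKafkaSubscriber = wire[DataKafkaSubscriber]
}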

However, this seems a bit hacky to me: I am coupled to Kafka through Lagom's internals, and I cannot easily swap out the source of my data. For example, how would I consume external messages from RabbitMQ if I wanted to? And what if I am trying to consume from another Kafka cluster (different from the one Lagom uses)?

EDIT #2: More documentation

I found a few articles in the Lagom documentation, notably this one:

Consuming Topics from 3rd parties

You may want your Lagom service to consume data produced on services not implemented in Lagom. In that case, as described in the Service Clients section, you can create a third-party-service-api module in your Lagom project. That module will contain a Service Descriptor declaring the topic you will consume from. Once you have your ThirdPartyService interface and related classes implemented, you should add third-party-service-api as a dependency on your fancy-service-impl. Finally, you can consume from the topic described in ThirdPartyService as documented in the Subscribe to a topic section.
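
A minimal sketch of what such a third-party-service-api descriptor could look like (ThirdPartyService, ExternalEvent, and the topic name are assumed names; the trait is deliberately never implemented):

import com.lightbend.lagom.scaladsl.api.broker.Topic
import com.lightbend.lagom.scaladsl.api.{Descriptor, Service}
import play.api.libs.json.{Format, Json}

// Message format published by the external system
case class ExternalEvent(correlationId: String, payload: String)

object ExternalEvent {
  implicit val format: Format[ExternalEvent] = Json.format[ExternalEvent]
}

trait ThirdPartyService extends Service {
  def externalEvents: Topic[ExternalEvent]

  override final def descriptor: Descriptor = {
    import Service._
    named("third-party-service")
      .withTopics(
        topic("external-events", externalEvents) // name of the existing Kafka topic
      )
  }
}

Your fancy-service-impl would then inject ThirdPartyService and call externalEvents.subscribe.atLeastOnce(...) exactly as with the proxy service above.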

I don't use Lagom, so this may be just an idea. But since akka-streams is part of Lagom (at least I assume so), getting what you need out of this solution should be easy.

I used akka-stream-kafka and it worked really well (I only built a prototype).

When you consume messages, you do something like this:

Consumer
  .committableSource(
    consumerSettings,                              // Kafka config (sketched below)
    Subscriptions.topics("kafkaWsPathMsgTopic"))   // topic to subscribe to
  .mapAsync(10) { msg =>
    // do something, then pass the offset along for committing
    business(msg.record).map(_ => msg.committableOffset)
  }
  .runWith(Committer.sink(CommitterSettings(system))) // commit offsets after processing

Have a look at the well-written documentation.

You can find my whole example here: PathMsgConsumer

Alan Klikic provided an answer on the Lightbend forums here.

Part 1:

If you are only using an external Kafka cluster in your business service, then you can implement this using only the Lagom Broker API. So you need to:

  1. create an API with a service descriptor containing only the topic definition (this API is not being implemented)
  2. in your business service, configure kafka_native depending on your deployment (as I mentioned in a previous post; see the configuration sketch below)
  3. in your business service, inject the service from the API created in #1 and subscribe to it using the Lagom Broker API subscriber

Offset committing in the Lagom Broker API subscriber is handled out-of-the-box.
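
Step 2 refers to Lagom's Kafka client settings. A sketch of what pointing a service at an external cluster might look like in application.conf, assuming the standard lagom.broker.kafka keys (the broker address is a placeholder):

lagom.broker.kafka {
  # An empty service name disables the service-locator lookup,
  # so the brokers list below is used directly
  service-name = ""
  brokers = "external-kafka-host:9092"
}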

Part 2:

Kafka and AMQP consumer implementations require a persistent akka stream, so you need to handle disconnects. This can be done in two ways:

  1. control the persistent akka stream by wrapping it in an actor. You initialize your stream Flow on actor preStart and pipe the stream's completion to the actor, which will stop it. If the stream completes or fails, the actor will stop. Then wrap the actor in a backoff supervisor with a restart strategy, which will restart the actor in case of completion or failure and reinitialize the Flow
  2. use akka streams' delayed restarts with backoff stage (sketched at the end of this answer)

Personally I use #1 and have not tried #2 yet.

Initializing the backoff actor for #1 or the Flow for #2 can be done in your Lagom components trait (basically in the same place where you currently do your subscribe using the Lagom Broker API).

Be sure to set a consumer group when configuring the consumer to avoid duplicate consumption. You can, like Lagom does, use the service name from the descriptor as the consumer group name.
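
For reference, a minimal sketch of option 2 using Akka's restart-with-backoff stage (the topic name, consumerSettings, and business are placeholders, and the exact RestartSource API varies with your Akka version):

import scala.concurrent.duration._
import akka.kafka.Subscriptions
import akka.kafka.scaladsl.Consumer
import akka.stream.scaladsl.{RestartSource, Sink}

// Restart the consumer stream with exponential backoff whenever it fails
// or completes; offset commit handling is omitted for brevity
RestartSource
  .withBackoff(minBackoff = 3.seconds, maxBackoff = 30.seconds, randomFactor = 0.2) { () =>
    Consumer
      .plainSource(consumerSettings, Subscriptions.topics("data-in"))
      .mapAsync(1)(record => business(record.value()))
  }
  .runWith(Sink.ignore)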