Batching futures in Dart

I want to batch many futures into a single request that fires either when a maximum batch size is reached, or when a maximum time since the earliest future was received has elapsed.

Motivation

In Flutter, I have lots of UI elements that display the result of a future, depending on the data in the UI element.

For example, I have a widget for a place, and a sub-widget that displays how long it will take to walk to that place. To compute the walking time, I issue a request to the Google Maps API to get the travel time to the place.

It would be more efficient and cost-effective to batch all of these API requests into a single batch API request. So if the widgets fire 100 requests instantaneously, the futures could be proxied through a single provider, which batches them into one request to Google and unpacks Google's result back into all the individual requests.

The provider needs to know when to stop waiting for more futures and actually issue the request. This should be controlled either by a maximum "batch" size (i.e. the number of travel-time requests) or by the maximum amount of time you are willing to wait for the batching to happen.

The desired API would be something like:


// Client gives this to tell provider how to compute batch result.
abstract class BatchComputer<K,V> {
  Future<List<V>> compute(List<K> batchedInputs);
}

// Batching library returns an object with this interface
// so that clients can submit inputs to be completed by the batch provider.
abstract class BatchingFutureProvider<K,V> {
  Future<V> submit(K inputValue);
}

// How do you implement this in Dart???
BatchingFutureProvider<K,V> create<K,V>(
   BatchComputer<K,V> computer, 
   int maxBatchSize, 
   Duration maxWaitDuration,
);

Does Dart (or a pub package) already provide this kind of batching functionality, and if not, how would you implement the create function above?

That sounds reasonable, but also fairly specialized. You need a way to represent a query, to combine those queries into a single super-query, and to split the super-result back into individual results afterwards, which is what your BatchComputer does. Then you need a queue that you can flush under certain conditions.

One thing that is clear is that you will need to use Completers for the results, because you always need those when you want to return a future before you have the value or another future to complete it with.
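
For illustration, here is a minimal sketch of that pattern (the delayedValue function is made up just for this example): return the future right away, and complete it later once the value is available.

import "dart:async";

/// Returns a future immediately; the value is supplied later,
/// here by a timer, but it could equally come from a network callback.
Future<String> delayedValue() {
  var completer = Completer<String>();
  Timer(Duration(seconds: 1), () => completer.complete("done"));
  return completer.future;
}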

The approach I would choose would be:

import "dart:async";

/// A batch of requests to be handled together.
///
/// Collects [Request]s until the pending requests are flushed.
/// Requests can be flushed by calling [flush] or by configuring
/// the batch to automatically flush when reaching certain 
/// thresholds.
class BatchRequest<Request, Response> {
  final int _maxRequests;
  final Duration _maxDelay;
  final Future<List<Response>> Function(List<Request>) _compute;
  Timer _timeout;
  List<Request> _pendingRequests;
  List<Completer<Response>> _responseCompleters;

  /// Creates a batcher of [Request]s.
  ///
  /// Batches requests until calling [flush]. At that point, the
  /// [batchCompute] function gets the list of pending requests,
  /// and it should respond with a list of [Response]s.
  /// The response to a request in the argument list
  /// should be at the same index in the response list, 
  /// and as such, the response list must have the same number
  /// of responses as there were requests.
  ///
  /// If [maxRequestsPerBatch] is supplied, requests are automatically
  /// flushed whenever there are that many requests pending.
  ///
  /// If [maxDelay] is supplied, requests are automatically flushed 
  /// when the oldest request has been pending for that long. 
  /// As such, the [maxDelay] is not the maximal time before a request
  /// is answered, just how long sending the request may be delayed.
  BatchRequest(Future<List<Response>> Function(List<Request>) batchCompute,
               {int maxRequestsPerBatch, Duration maxDelay})
    : _compute = batchCompute,
      _maxRequests = maxRequestsPerBatch,
      _maxDelay = maxDelay;

  /// Add a request to the batch.
  ///
  /// The request is stored until the requests are flushed,
  /// then the returned future is completed with the result (or error)
  /// received from handling the requests.
  Future<Response> addRequest(Request request) {
    var completer = Completer<Response>();
    (_pendingRequests ??= []).add(request);
    (_responseCompleters ??= []).add(completer);
    if (_pendingRequests.length == _maxRequests) {
      _flush();
    } else if (_timeout == null && _maxDelay != null) {
      _timeout = Timer(_maxDelay, _flush);
    }
    return completer.future;
  }

  /// Flush any pending requests immediately.
  void flush() {
    _flush();
  }

  void _flush() {
    if (_pendingRequests == null) {
      assert(_timeout == null);
      assert(_responseCompleters == null);
      return;
    }
    if (_timeout != null) {
      _timeout.cancel();
      _timeout = null;
    }
    var requests = _pendingRequests;
    var completers = _responseCompleters;
    _pendingRequests = null;
    _responseCompleters = null;

    _compute(requests).then((List<Response> results) {
      if (results.length != completers.length) {
        throw StateError("Wrong number of results. "
           "Expected ${completers.length}, got ${results.length}");
      }
      for (int i = 0; i < results.length; i++) {
        completers[i].complete(results[i]);
      }
    }).catchError((error, stack) {
      for (var completer in completers) {
        completer.completeError(error, stack);
      }
    });
  }
}

You can use it like:

void main() async {
  var b = BatchRequest<int, int>(_compute, 
      maxRequestsPerBatch: 5, maxDelay: Duration(seconds: 1));
  var sw = Stopwatch()..start();
  for (int i = 0; i < 8; i++) {
    b.addRequest(i).then((r) {
      print("${sw.elapsedMilliseconds.toString().padLeft(4)}: $i -> $r");
    });
  }
}
Future<List<int>> _compute(List<int> args) => 
    Future.value([for (var x in args) x + 1]);
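
With those settings, the first five requests are flushed immediately as a full batch, and the remaining three are flushed roughly one second later when the maxDelay timer fires.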

https://pub.dev/packages/batching_future/versions/0.0.2

This answer is almost identical to @lrn's, but I have made an effort to keep the main line synchronous, and added some documentation.

/// Exposes [createBatcher] which batches computation requests until either
/// a max batch size or max wait duration is reached.
///
import 'dart:async';
import 'dart:collection';

import 'package:quiver/iterables.dart';
import 'package:synchronized/synchronized.dart';

/// Converts input type [K] to output type [V] for every item in
/// [batchedInputs]. There must be exactly one item in output list for every
/// item in input list, and assumes that input[i] => output[i].
abstract class BatchComputer<K, V> {
  const BatchComputer();
  Future<List<V>> compute(List<K> batchedInputs);
}

/// Interface to submit (possible) batched computation requests.
abstract class BatchingFutureProvider<K, V> {
  Future<V> submit(K inputValue);
}

/// Returns a batcher which computes transformations in batch using [computer].
/// The batcher will wait to compute until [maxWaitDuration] is reached since
/// the first item in the current batch is received, or [maxBatchSize] items
/// are in the current batch, whatever happens first.
/// If [maxBatchSize] or [maxWaitDuration] is null, then the triggering
/// condition is ignored, but at least one condition must be supplied.
///
/// Warning: If [maxWaitDuration] is not supplied, then it is possible that
/// a partial batch will never finish computing.
BatchingFutureProvider<K, V> createBatcher<K, V>(BatchComputer<K, V> computer,
    {int maxBatchSize, Duration maxWaitDuration}) {
  if (!((maxBatchSize != null || maxWaitDuration != null) &&
      (maxWaitDuration == null || maxWaitDuration.inMilliseconds > 0) &&
      (maxBatchSize == null || maxBatchSize > 0))) {
    throw ArgumentError(
        "At least one of {maxBatchSize, maxWaitDuration} must be specified and be positive values");
  }
  return _Impl(computer, maxBatchSize, maxWaitDuration);
}

// Holds the input value and the future to complete it.
class _Payload<K, V> {
  final K k;
  final Completer<V> completer;

  _Payload(this.k, this.completer);
}

enum _ExecuteCommand { EXECUTE }

/// Implements [createBatcher].
class _Impl<K, V> implements BatchingFutureProvider<K, V> {
  /// Queues computation requests.
  final controller = StreamController<dynamic>();

  /// Queues the input values with their futures to complete.
  final queue = Queue<_Payload>();

  /// Locks access to [listen] to make queue-processing single-threaded.
  final lock = Lock();

  /// [maxWaitDuration] timer, as a stored reference to cancel early if needed.
  Timer timer;

  /// Performs the input->output batch transformation.
  final BatchComputer computer;

  /// See [createBatcher].
  final int maxBatchSize;

  /// See [createBatcher].
  final Duration maxWaitDuration;
  _Impl(this.computer, this.maxBatchSize, this.maxWaitDuration) {
    controller.stream.listen(listen);
  }

  void dispose() {
    controller.close();
  }

  @override
  Future<V> submit(K inputValue) {
    final completer = Completer<V>();
    controller.add(_Payload(inputValue, completer));
    return completer.future;
  }

  // Synchronous event-processing logic.
  void listen(dynamic event) async {
    await lock.synchronized(() {
      if (event is _ExecuteCommand) {
        if (timer?.isActive ?? true) {
          // The timer got reset, so ignore this old request.
          // The current timer needs to be inactive and non-null
          // for the execution to be legitimate.
          return;
        }
        execute();
      } else {
        addPayload(event as _Payload);
      }
      return;
    });
  }

  void addPayload(_Payload _payload) {
    if (queue.isEmpty && maxWaitDuration != null) {
      // This is the first item of the batch.
      // Trigger the timer so we are guaranteed to start computing
      // this batch before [maxWaitDuration].
      timer = Timer(maxWaitDuration, triggerTimer);
    }
    queue.add(_payload);
    if (maxBatchSize != null && queue.length >= maxBatchSize) {
      execute();
      return;
    }
  }

  void execute() async {
    timer?.cancel();
    if (queue.isEmpty) {
      return;
    }
    // Snapshot and clear the queue before awaiting the computation, so that
    // requests arriving while this batch is in flight are queued for the
    // next batch rather than being cleared without ever completing.
    final batch = List<_Payload>.of(queue);
    queue.clear();
    final results = await computer.compute(List<K>.of(batch.map((p) => p.k)));
    for (var pair in zip<Object>([batch, results])) {
      (pair[0] as _Payload).completer.complete(pair[1] as V);
    }
  }

  void triggerTimer() {
    listen(_ExecuteCommand.EXECUTE);
  }
}
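
For completeness, here is a minimal usage sketch of createBatcher; the _AddOneComputer class and the numbers are made up purely for illustration.

/// Hypothetical computer that adds one to each input.
class _AddOneComputer extends BatchComputer<int, int> {
  const _AddOneComputer();

  @override
  Future<List<int>> compute(List<int> batchedInputs) =>
      Future.value([for (var x in batchedInputs) x + 1]);
}

void main() async {
  final batcher = createBatcher<int, int>(const _AddOneComputer(),
      maxBatchSize: 5, maxWaitDuration: Duration(seconds: 1));
  // The first five submissions flush as a full batch; the remaining
  // three flush when the one-second timer fires.
  final results =
      await Future.wait([for (var i = 0; i < 8; i++) batcher.submit(i)]);
  print(results); // [1, 2, 3, 4, 5, 6, 7, 8]
}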