How to use Consul to perform a synchronized task on one machine at a time?
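I have a system with 10 machines, and I need to run a particular task on each machine, one machine at a time, in a synchronized order. Basically, only one machine should be running that task at any given time. We already use Consul for other purposes, and I was wondering whether we could use Consul for this as well.
I read more about it, and it looks like we can use leader election in Consul: each machine tries to acquire a lock, does the work, and then releases the lock. Once the lock is released, the other machines try to acquire it again and do the same work. That way everything runs synchronized, one machine at a time.
I decided to use this C# PlayFab ConsulDotNet library, which appears to have this capability built in, but I am open to better options. The Action method below, in my code base, is called on every machine at almost the same time through a watcher mechanism.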
private void Action() {
    // Try to acquire lock using Consul.
    // If lock acquired then DoTheWork() otherwise keep waiting for it until lock is acquired.
    // Once work is done, release the lock
    // so that some other machine can acquire the lock and do the same work.
}
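Inside that method I need to do the following:
- Try to acquire the lock. If you cannot acquire it, wait for it, because another machine may have grabbed it before you.
- If the lock is acquired, call DoTheWork().
- Once the work is done, release the lock so that another machine can acquire it and do the same work.
The idea is that all 10 machines should call DoTheWork() one at a time, in a synchronized order. Based on this blog and this blog, I decided to modify their example to fit our needs.
Below is my LeaderElectionService class: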
using System;
using System.Threading;
using System.Threading.Tasks;
using Consul;

public class LeaderElectionService
{
    public LeaderElectionService(string leadershipLockKey)
    {
        this.key = leadershipLockKey;
    }

    public event EventHandler<LeaderChangedEventArgs> LeaderChanged;

    string key;
    CancellationTokenSource cts = new CancellationTokenSource();
    Timer timer;
    bool lastIsHeld = false;
    IDistributedLock distributedLock;

    public void Start()
    {
        timer = new Timer(async (object state) => await TryAcquireLock((CancellationToken)state), cts.Token, 0, Timeout.Infinite);
    }

    private async Task TryAcquireLock(CancellationToken token)
    {
        if (token.IsCancellationRequested)
            return;
        try
        {
            if (distributedLock == null)
            {
                var clientConfig = new ConsulClientConfiguration { Address = new Uri("http://consul.host.domain.com") };
                ConsulClient client = new ConsulClient(clientConfig);
                distributedLock = await client.AcquireLock(new LockOptions(key) { LockTryOnce = true, LockWaitTime = TimeSpan.FromSeconds(3) }, token).ConfigureAwait(false);
            }
            else
            {
                if (!distributedLock.IsHeld)
                {
                    await distributedLock.Acquire(token).ConfigureAwait(false);
                }
            }
        }
        catch (LockMaxAttemptsReachedException ex)
        {
            // This is expected if it couldn't acquire the lock within the first attempt.
            Console.WriteLine(ex.StackTrace);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.StackTrace);
        }
        finally
        {
            bool lockHeld = distributedLock?.IsHeld == true;
            HandleLockStatusChange(lockHeld);
            // Retrigger the timer after a 10 seconds delay (in this example). Delay for 7s if not held as the AcquireLock call will block for ~3s in every failed attempt.
            timer.Change(lockHeld ? 10000 : 7000, Timeout.Infinite);
        }
    }

    protected virtual void HandleLockStatusChange(bool isHeldNew)
    {
        // Is this the right way to check and do the work here?
        // In general I want to call method "DoTheWork" in "Action" method itself
        // And then release and destroy the session once work is done.
        if (isHeldNew)
        {
            // DoTheWork();
            Console.WriteLine("Hello");
            // And then where should I release the lock so that other machine can try to grab it?
            // distributedLock.Release();
            // distributedLock.Destroy();
        }
        if (lastIsHeld == isHeldNew)
            return;
        else
        {
            lastIsHeld = isHeldNew;
        }
        if (LeaderChanged != null)
        {
            LeaderChangedEventArgs args = new LeaderChangedEventArgs(lastIsHeld);
            foreach (EventHandler<LeaderChangedEventArgs> handler in LeaderChanged.GetInvocationList())
            {
                try
                {
                    handler(this, args);
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.StackTrace);
                }
            }
        }
    }
}
Below is my LeaderChangedEventArgs class:
public class LeaderChangedEventArgs : EventArgs
{
    private bool isLeader;

    public LeaderChangedEventArgs(bool isHeld)
    {
        isLeader = isHeld;
    }

    public bool IsLeader { get { return isLeader; } }
}
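Many parts of the code above are probably not needed for my use case, but the idea is the same.
Problem Statement
Now, in my Action method, I would like to use the class above and run the task as soon as the lock is acquired, or otherwise keep waiting for the lock. Once the work is done, the session should be released and destroyed so that another machine can grab the lock and do the work. I am confused about how to use the class above correctly in the method below.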
private void Action() {
    LeaderElectionService electionService = new LeaderElectionService("data/process");
    // electionService.LeaderChanged += (source, arguments) => Console.WriteLine(arguments.IsLeader ? "Leader" : "Slave");
    electionService.Start();
    // now how do I wait for the lock to be acquired here indefinitely
    // And once lock is acquired, do the work and then release and destroy the session
    // so that other machine can grab the lock and do the work
}
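I only recently started working with C#, which is why I am unsure how to make this work efficiently in production with Consul and this library.
Update
I tried the code below as you suggested, and I think I had tried this earlier as well, but for some reason, as soon as execution reaches the line await distributedLock.Acquire(cancellationToken); it automatically returns to the Main method. It never gets as far as printing Doing Some Work!. Does CreateLock actually work? I expected it to create data/lock in Consul (since it does not exist yet), then try to acquire the lock on it, and, once the lock is acquired, do the work and then release it for the other machines.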
private static CancellationTokenSource cts = new CancellationTokenSource();

public static void Main(string[] args)
{
    Action(cts.Token);
    Console.WriteLine("Hello World");
}

private static async Task Action(CancellationToken cancellationToken)
{
    const string keyName = "data/lock";
    var clientConfig = new ConsulClientConfiguration { Address = new Uri("http://consul.test.host.com") };
    ConsulClient client = new ConsulClient(clientConfig);
    var distributedLock = client.CreateLock(keyName);
    while (true)
    {
        try
        {
            // Try to acquire lock
            // As soon as it comes to this line,
            // it just goes back to main method automatically. not sure why
            await distributedLock.Acquire(cancellationToken);
            // Lock is acquired
            // DoTheWork();
            Console.WriteLine("Doing Some Work!");
            // Work is done. Jump out of loop to release the lock
            break;
        }
        catch (LockHeldException)
        {
            // Cannot acquire the lock. Wait a while then retry
            await Task.Delay(TimeSpan.FromSeconds(10), cancellationToken);
        }
        catch (Exception)
        {
            // TODO: Handle exception thrown by DoTheWork method
            // Here we jump out of the loop to release the lock
            // But you can try to acquire the lock again based on your requirements
            break;
        }
    }
    // Release and destroy the lock
    // So that other machine can grab the lock and do the work
    await distributedLock.Release(cancellationToken);
    await distributedLock.Destroy(cancellationToken);
}
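IMO, the LeaderElectionService from those blogs is overkill for you.
Update 1
There is no need for a while loop, because:
- ConsulClient is a local variable
- There is no need to check the IsHeld property
Acquire will block indefinitely unless you do one of the following:
- Set LockTryOnce to true in LockOptions
- Set a timeout on the CancellationToken
Side note: it is not necessary to call the Destroy method after you call Release on the distributed lock (reference).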
private async Task Action(CancellationToken cancellationToken)
{
    const string keyName = "YOUR_KEY";
    var client = new ConsulClient();
    var distributedLock = client.CreateLock(keyName);
    try
    {
        // Try to acquire lock
        // NOTE:
        // Acquire method will block indefinitely unless
        // 1. Set LockTryOnce = true in LockOptions
        // 2. Pass a timeout to cancellation token
        await distributedLock.Acquire(cancellationToken);
        // Lock is acquired
        DoTheWork();
    }
    catch (Exception)
    {
        // TODO: Handle exception thrown by DoTheWork method
    }
    // Release the lock (not necessary to invoke Destroy method),
    // so that other machine can grab the lock and do the work
    await distributedLock.Release(cancellationToken);
}
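To illustrate the second option from the list above (putting a timeout on the CancellationToken), here is a minimal sketch of the acquire, work, release shape with a bounded wait. It deliberately does not call Consul: the SemaphoreSlim named Gate is an in-memory stand-in for the distributed lock, and BoundedWaitSketch and RunExclusiveAsync are hypothetical names made up for this example. With Consul, the lock's Acquire and Release calls would presumably take the place of WaitAsync and Release, though the exception raised on cancellation may differ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedWaitSketch
{
    // In-memory stand-in for the distributed lock (NOT Consul): one holder at a time.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);

    // Waits at most maxWait for the gate, runs work if the gate was acquired,
    // and always releases it afterwards. Returns false if the wait timed out.
    public static async Task<bool> RunExclusiveAsync(Action work, TimeSpan maxWait)
    {
        // The token is cancelled automatically once maxWait has elapsed.
        using (var timeout = new CancellationTokenSource(maxWait))
        {
            try
            {
                // Without the token this wait could block indefinitely.
                await Gate.WaitAsync(timeout.Token);
            }
            catch (OperationCanceledException)
            {
                // Timed out: the gate was never acquired, so there is nothing to release.
                return false;
            }
        }

        try
        {
            work();
            return true;
        }
        finally
        {
            // Release only on the path where the gate was actually acquired.
            Gate.Release();
        }
    }
}
```

A caller could write something like `bool ran = await BoundedWaitSketch.RunExclusiveAsync(DoTheWork, TimeSpan.FromSeconds(30));` and decide whether to retry when `ran` is false. Putting the release in a finally block that is only reached after a successful acquire avoids releasing a lock that was never held.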
Update 2
The reason OP's code returns to the Main method is that the Action method is not awaited. If you are using C# 7.1 or later, you can use async Main and await the Action method.
public static async Task Main(string[] args)
{
    await Action(cts.Token);
    Console.WriteLine("Hello World");
}
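The effect of the missing await can be reproduced without Consul. The sketch below uses Task.Delay as a stand-in for the lock acquisition; AwaitDemo, WorkAsync, CallWithoutAwait and CallAndBlock are hypothetical names chosen for this illustration. It also shows one common alternative for compilers older than C# 7.1, where async Main is not available: blocking on the task with GetAwaiter().GetResult().

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Stand-in for the Action method: control returns to the caller
    // at the first await on a task that has not completed yet.
    public static async Task WorkAsync(List<string> log)
    {
        log.Add("work started");
        await Task.Delay(200); // plays the role of distributedLock.Acquire(...)
        log.Add("work finished");
    }

    // Mirrors the original Main: the task is started but not awaited,
    // so the caller carries on before the work has finished.
    public static Task CallWithoutAwait(List<string> log)
    {
        Task pending = WorkAsync(log);
        log.Add("caller continued");
        return pending;
    }

    // Pre-C# 7.1 alternative to async Main: block until the task completes.
    public static void CallAndBlock(List<string> log)
    {
        WorkAsync(log).GetAwaiter().GetResult();
        log.Add("caller continued");
    }
}
```

In CallWithoutAwait the log reads "work started", "caller continued" at the moment the method returns, which matches the behaviour described in the question: Main prints its message and the process can exit before the awaited call completes. In CallAndBlock, "work finished" is logged before "caller continued". Blocking like this is generally fine in a console application, but it can deadlock in environments that have a synchronization context, such as UI applications.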