JavaScript Workers - why is the worker message handled so late, and can I do something about it?
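I have a Worker that shares a SharedArrayBuffer with the "main thread". To work correctly, I have to make sure that the worker has access to the SAB before the main thread accesses it. (EDIT: the code creating the worker has to be in a separate function (EDIT 2: which returns an array pointing to the SAB).) (Maybe this is not possible at all; you'll tell me.)
The initial code looks like this: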
function init() {
    var code = `onmessage = function(event) {
        console.log('starting');
        var buffer = event.data;
        var arr = new Uint32Array(buffer); // I need to have this done before accessing the buffer again from the main
        //some other code, manipulating the array
    }`
    var buffer = new SharedArrayBuffer(BUFFER_ELEMENT_SIZE);
    var blob = new Blob([code], { "type": 'application/javascript' });
    var url = window.URL || window.webkitURL;
    var blobUrl = url.createObjectURL(blob);
    var counter = new Worker(blobUrl);
    counter.postMessage(buffer);
    let res = new Uint32Array(buffer);
    return res;
}
function test (){
    let array = init();
    console.log('main');
    //accessing the SAB again
};
The worker code is always executed after test(): the console always shows "main", then "starting".
Using timeouts does not help. Consider the following code for test:
function test (){
    let array = [];
    console.log('main');
    setTimeout(function(){
        array = init();
    },0);
    setTimeout(function(){
        console.log('main');
        //accessing the SAB again
    },0);
    console.log('end');
};
The console shows "end" first, then "main", then "starting".
However, assigning the buffer to a global array outside the test() function does the job, even without timeouts.
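My questions are the following:
- Why does the worker not start directly after the message was sent (= received?)? AFAIK, workers have their own event queue, so they should not rely on the main stack becoming empty?
- Is there a specification detailing when a worker starts working after sending a message?
- Is there a way to make sure the worker has started before accessing the SAB again without using global variables? (One could use busy waiting, but I would rather be careful with that...) There is probably no way, but I want to be sure.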
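EDIT
To be more precise:
- In a completely parallel running scenario, the Worker would be able to handle the message immediately after it was posted. This is obviously not the case.
- Most browser APIs (and Worker is such an API) use a callback queue to handle calls to the API. But if this applied, the message would be posted/handled before the timeout callbacks were executed.
- To go even further: if I try busy waiting after postMessage by reading from the SAB until it changes one value, the program blocks forever. To me, this means that the browser does not post the message until the call stack is empty. As far as I know, this behaviour is not documented, and I cannot explain it.
To summarize: I want to know how the browser determines when to post the message, and when the worker handles it, if the call to postMessage is inside a function. I have already found a workaround (global variables), so I am more interested in how it works behind the scenes. But if someone can show me a working example, I'll take it.
EDIT 2:
The code using the global variable (the code that works fine) looks like this: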
function init() {
    //Unchanged
}
var array = init(); //global
function test (){
    console.log('main');
    //accessing the SAB again
};
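It prints "starting", then "main" to the console.
Also worth noting: if I debug the code with Firefox (Chrome not tested), I get the result I want without the global variable ("starting" before "main"). Can someone explain?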
According to MDN:
Data passed between the main page and workers is copied, not shared. Objects are serialized as they're handed to the worker, and subsequently, de-serialized on the other end. The page and worker do not share the same instance, so the end result is that a duplicate is created on each end. Most browsers implement this feature as structured cloning.
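Read more about transferring data to and from workers.
Here is basic code that shares a buffer with a worker. It creates an array of even values (i*2) and sends it to the worker. It uses Atomic operations to change the buffer values.
To make sure the worker has started, you can use different messages.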
var code = document.querySelector('[type="javascript/worker"]').textContent;
var blob = new Blob([code], { "type": 'application/javascript' });
var blobUrl = URL.createObjectURL(blob);
var counter = new Worker(blobUrl);
var sab;
var initBuffer = function (msg) {
    sab = new SharedArrayBuffer(16);
    counter.postMessage({
        init: true,
        msg: msg,
        buffer: sab
    });
};
var editArray = function () {
    var res = new Int32Array(sab);
    for (let i = 0; i < 4; i++) {
        Atomics.store(res, i, i*2);
    }
    console.log('Array edited', res);
};
initBuffer('Init buffer and start worker');
counter.onmessage = function(event) {
    console.log(event.data.msg);
    if (event.data.edit) {
        editArray();
        // share new buffer with worker
        counter.postMessage({buffer: sab});
        // end worker
        counter.postMessage({end: true});
    }
};
<script type="javascript/worker">
    var sab;
    self['onmessage'] = function(event) {
        if (event.data.init) {
            postMessage({msg: event.data.msg, edit: true});
        }
        if (event.data.buffer) {
            sab = event.data.buffer;
            var sharedArray = new Int32Array(sab);
            postMessage({msg: 'Shared Array: '+sharedArray});
        }
        if (event.data.end) {
            postMessage({msg: 'Time to rest'});
        }
    };
</script>
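The "different messages" idea can be reduced to a small helper. The following is only a sketch, not part of the code above: `waitForMessage`, `startWorker` and the `ready` field are hypothetical names, and it assumes the worker script posts `{ ready: true }` as soon as it starts running.

```javascript
// Resolve with the first message from `worker` whose data satisfies `predicate`.
// `worker` can be any object that dispatches "message" events and exposes
// addEventListener/removeEventListener, such as a Worker.
function waitForMessage(worker, predicate) {
  return new Promise((resolve) => {
    const listener = (event) => {
      if (predicate(event.data)) {
        worker.removeEventListener('message', listener);
        resolve(event.data);
      }
    };
    worker.addEventListener('message', listener);
  });
}

// Browser-only usage sketch.
// Assumed worker protocol: the worker posts { ready: true } once it has started.
async function startWorker(scriptUrl, sab) {
  const worker = new Worker(scriptUrl);
  await waitForMessage(worker, (data) => Boolean(data) && data.ready === true);
  // The worker script is known to be running from here on.
  worker.postMessage({ buffer: sab });
  return worker;
}
```

Separating a "ready" message from the later "done" messages keeps start-up and task completion independent, at the cost of one extra round trip.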
why does the worker does not start directly after the message was sen[t] (= received?). AFAIK, workers have their own event queue, so they should not rely on the main stack becoming empty?
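First, even though your Worker object is available synchronously in the main thread, there are a lot of things to do in the actual worker thread before it is able to handle your message:
- It has to perform a network request to retrieve the script content. Even with a blob URI, this is an asynchronous operation.
- It has to initialize the whole JS context, so even if the network request were lightning fast, this would add to the parallel execution time.
- It has to wait for the event loop frame following the main script execution to handle your message. Even if the initialization were lightning fast, it would still have to wait some time.
So under normal circumstances, there is very little chance that your Worker has executed your code by the time you need the data.
Now, you talked about blocking the main thread.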
If I try busy waiting after postMessage by reading from the SAB until it changes one value will block the program infinitely
During the initialization of your Worker, the messages are temporarily kept on the main thread, in what is called the outside port. It is only after the fetching of the script is done that this outside port is entangled with the inside port, and that the messages actually pass to that parallel thread.
So if you block the main thread before the ports have been entangled, it won't be able to pass the message to the worker thread.
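As a minimal sketch of the difference (an illustration, not code from the question): a spin loop never returns to the event loop, whereas re-checking the shared flag from a timer callback hands control back between checks, so the main thread can finish the work it still owes the worker. The helper name `waitForFlag`, the flag index and the polling interval are assumptions.

```javascript
// Blocking version, shown only for contrast. It never yields, so a message
// still waiting on the main thread can never reach the worker:
//   while (Atomics.load(flags, 0) === 0) {}

// Non-blocking version: poll the flag, yielding to the event loop between checks.
// `flags` is an Int32Array view over a SharedArrayBuffer.
function waitForFlag(flags, index, expected, intervalMs = 1) {
  return new Promise((resolve) => {
    const check = () => {
      if (Atomics.load(flags, index) === expected) {
        resolve();
      } else {
        // Returning to the event loop lets pending work (such as message
        // delivery) proceed before the next check.
        setTimeout(check, intervalMs);
      }
    };
    check();
  });
}

// Assumed worker side: once it has the buffer, it runs
//   Atomics.store(new Int32Array(buffer), 0, 1);
```

On the main thread this would be used as `await waitForFlag(flags, 0, 1)` after `postMessage`, inside an async function.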
Is there a specification detailing when a worker starts working after sending a message?
Sure. More specifically, in the HTML Standard's "run a worker" algorithm, the port message queue is enabled at step 26, and the event loop is actually started at step 29.
Is there a way to make sure the worker has started before accessing the SAB again without using global variables? [...]
Sure: make your Worker post a message to the main thread when it is done.
// some precautions because all browsers still haven't reenabled SharedArrayBuffers
const has_shared_array_buffer = window.SharedArrayBuffer;
function init() {
    // since our worker will do only a single operation
    // we can Promisify it
    // if we were to use it for more than a single task,
    // we could promisify each task by using a MessagePort
    return new Promise((resolve, reject) => {
        const code = `
        onmessage = function(event) {
            console.log('hi');
            var buffer = event.data;
            var arr = new Uint32Array(buffer);
            arr.fill(255);
            if(self.SharedArrayBuffer) {
                postMessage("done");
            }
            else {
                postMessage(buffer, [buffer]);
            }
        }`
        let buffer = has_shared_array_buffer ? new SharedArrayBuffer(16) : new ArrayBuffer(16);
        const blob = new Blob([code], { "type": 'application/javascript' });
        const blobUrl = URL.createObjectURL(blob);
        const counter = new Worker(blobUrl);
        counter.onmessage = e => {
            if(!has_shared_array_buffer) {
                buffer = e.data;
            }
            const res = new Uint32Array(buffer);
            resolve(res);
        };
        counter.onerror = reject;
        if(has_shared_array_buffer) {
            counter.postMessage(buffer);
        }
        else {
            counter.postMessage(buffer, [buffer]);
        }
    });
};
async function test (){
    let array = await init();
    //accessing the SAB again
    console.log(array);
};
test().catch(console.error);
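The comment in the code above mentions promisifying each task with a MessagePort when the worker handles more than one task. A minimal sketch of that idea follows. The message shape (`buffer`, `value`, `port`) and the names `workerSource`, `runTask` and `demo` are assumptions made for illustration.

```javascript
// Worker-side source (hypothetical protocol): every request carries its own reply port.
const workerSource = `
  onmessage = (event) => {
    const { buffer, value, port } = event.data;
    new Uint32Array(buffer).fill(value);
    port.postMessage('done'); // reply only on the port that came with this task
  };
`;

// Send one task to `worker` and resolve with the reply to that task only.
function runTask(worker, message) {
  return new Promise((resolve) => {
    const { port1, port2 } = new MessageChannel();
    port1.onmessage = (event) => {
      port1.close(); // one reply per task, so the channel can be released
      resolve(event.data);
    };
    // port2 is transferred to the worker together with the task data.
    worker.postMessage({ ...message, port: port2 }, [port2]);
  });
}

// Browser-only usage sketch: two consecutive tasks on the same worker.
async function demo() {
  const blob = new Blob([workerSource], { type: 'application/javascript' });
  const worker = new Worker(URL.createObjectURL(blob));
  const buffer = new SharedArrayBuffer(16);
  await runTask(worker, { buffer, value: 1 });
  await runTask(worker, { buffer, value: 2 });
  console.log(new Uint32Array(buffer));
  worker.terminate();
}
```

Because each task gets a dedicated channel, replies cannot be confused with each other, and the worker's global `onmessage` handler on the main side stays free for other uses.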