Is the working directory of the dagster main process different from that of the scheduler processes?

I'm having trouble loading a file from dagster code (setup code, not pipelines). Suppose I have the following project structure:

pipelines
-app/
--environments
----schedules.yaml
--repository.py
--repository.yaml

When I run dagit from the project folder ($ cd project && dagit -y app/repository.yaml), that folder becomes the working directory, and in repository.py I can load a file knowing that the root is project:

# repository.py

with open('app/environments/schedules.yaml', 'r') as f:
    schedules = f.read()  # do something with the file

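However, if I set up a schedule, the pipelines in the project don't run. Checking the cron logs, it appears that the open line throws a file-not-found exception. I'm wondering whether this happens because the working directory is different when cron executes the run.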

For context, I'm loading a config file that holds a cron_schedules parameter for each pipeline. Also, this is the tail of the stack trace in my case:

  File "/home/user/.local/share/virtualenvs/pipelines-mfP13m0c/lib/python3.8/site-packages/dagster/core/definitions/handle.py", line 190, in from_yaml
    return LoaderEntrypoint.from_file_target(
  File "/home/user/.local/share/virtualenvs/pipelines-mfP13m0c/lib/python3.8/site-packages/dagster/core/definitions/handle.py", line 161, in from_file_target
    module = import_module_from_path(module_name, os.path.abspath(python_file))
  File "/home/user/.local/share/virtualenvs/pipelines-mfP13m0c/lib/python3.8/site-packages/dagster/seven/__init__.py", line 75, in import_module_from_path
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/user/pipelines/app/repository.py", line 28, in <module>
    schedule_builder = ScheduleBuilder(settings.CRON_PRESET, settings.ENV_DICT)
  File "/home/user/pipelines/app/schedules.py", line 12, in __init__
    self.cron_schedules = self._load_schedules_yaml()
  File "/home/user/pipelines/app/schedules.py", line 16, in _load_schedules_yaml
    with open(path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'app/environments/schedules.yaml'

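A relative path such as 'app/environments/schedules.yaml' is resolved against the current working directory, and the process launched by cron does not necessarily start in your project folder. You can build the path from the location of the module itself, so that the file opens correctly regardless of the working directory: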

from dagster.utils import file_relative_path

with open(file_relative_path(__file__, './environments/schedules.yaml'), 'r') as f:
    schedules = f.read()  # do something with the file

All file_relative_path does is the following, so you can call the os.path methods directly if you prefer:

import os

def file_relative_path(dunderfile, relative_path):
    # Join the directory that contains the calling module with the relative path
    return os.path.join(os.path.dirname(dunderfile), relative_path)
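
If you would rather not depend on dagster.utils, the same idea can be sketched with the standard library alone. The helper names module_relative_path and resolve_against_cwd below are hypothetical and chosen only for illustration; the second one just shows what a bare relative path effectively resolves to, which is why the original open call breaks when the working directory changes.

```python
import os
from pathlib import Path


def module_relative_path(dunderfile, relative_path):
    """Hypothetical helper: resolve relative_path against the folder
    that contains dunderfile (typically __file__), ignoring the cwd."""
    return str(Path(dunderfile).resolve().parent / relative_path)


def resolve_against_cwd(relative_path):
    """Show where a bare relative path points: it is resolved against
    the current working directory of the running process."""
    return os.path.abspath(relative_path)
```

In repository.py (or schedules.py) this would be used as open(module_relative_path(__file__, 'environments/schedules.yaml')). Logging os.getcwd() at import time is also a quick way to confirm that the cron-launched process starts somewhere other than the project folder.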