MLflow tracking server does not start after specifying backend-store-uri
I run mlflow as follows. The Dockerfile contains the following CMD command:
CMD mlflow server \
--host 0.0.0.0 \
--backend-store-uri "${BACKEND_STORE_URI}" \
--default-artifact-root "${DEFAULT_ARTIFACT_ROOT}"
After running
docker run --rm --name mlflow -p 5000:5000 -e BACKEND_STORE_URI=mssql+pymssql://user:pass@mybackendstoreuri/mlflow mlflow
it prints:
INFO [alembic.runtime.migration] Context impl MSSQLImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
INFO [alembic.runtime.migration] Context impl MSSQLImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
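However, the container exits without starting the server.
Without the backend store URI, I can see the log lines about binding to the host, and the container does not exit.
How can I run the mlflow tracking server with a backend store URI?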
The root cause is described in this comment from the MLflow source:
MLflow UI and client code expects a default experiment with ID 0.
This method uses SQL insert statement to create the default experiment as a hack, since
experiment table uses 'experiment_id' column is a PK and is also set to auto increment.
MySQL and other implementation do not allow value '0' for such cases.
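Reference: https://github.com/mlflow/mlflow/blob/v1.2.0/mlflow/store/sqlalchemy_store.py#L171
The migration itself raises no error, so nothing is displayed; the failure is silent when the alembic version is already the latest.
Reference: https://github.com/mlflow/mlflow/blob/v1.2.0/mlflow/store/db_migrations/env.py#L71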
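Because the server exits without printing anything, one way to see the underlying error is to run the failing step directly and capture the traceback. The helper below is only a sketch of that idea: `surface_startup_error` and `make_store` are hypothetical names, and `make_store` stands for any zero-argument callable, for example one that constructs the store against the same URI.

```python
import traceback


def surface_startup_error(make_store):
    """Call make_store() and return (result, None) or (None, traceback text)."""
    try:
        return make_store(), None
    except Exception:
        # Keep the full traceback so the real database error is visible.
        return None, traceback.format_exc()


def failing_factory():
    # Stand-in for a store constructor that fails against the database.
    raise RuntimeError("Cannot insert explicit value for identity column")


result, error_text = surface_startup_error(failing_factory)
print(error_text)
```

With MLflow installed, the factory could be something like `lambda: SqlAlchemyStore(db_uri, artifact_root)`, which is what the test snippet in this answer does inside a unit test.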
Using the same idea as the MySQL test (https://github.com/mlflow/mlflow/blob/v1.2.0/mlflow/store/sqlalchemy_store.py#L171) raises an exception: Cannot insert explicit value for identity column in table 'experiment' when IDENTITY_INSERT is set to OFF.
Test snippet:
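import unittest

import sqlalchemy
from mlflow.store.sqlalchemy_store import SqlAlchemyStore

ARTIFACT_URI = "artifact_folder"  # placeholder artifact root; not given in the original snippet


class TestSqlAlchemyStoreMssqlDb(unittest.TestCase):
    """
    Run tests against a MSSQL database
    """

    def setUp(self):
        # Placeholder credentials; replace with a reachable SQL Server instance.
        db_username = "test"
        db_password = "test"
        host = "test"
        db_name = "TEST_DB"
        db_server_url = "mssql+pymssql://%s:%s@%s" % (db_username, db_password, host)
        self._engine = sqlalchemy.create_engine(db_server_url)
        self._db_url = "%s/%s" % (db_server_url, db_name)
        print("Connect to %s" % self._db_url)

    def test_store(self):
        # Constructing the store is the step that raised the exception quoted above.
        self.store = SqlAlchemyStore(db_uri=self._db_url, default_artifact_root=ARTIFACT_URI)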
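The IDENTITY_INSERT error quoted above points at what SQL Server needs before it accepts an explicit value such as 0 in an identity column: the option has to be switched on for that table in the same session. The sketch below only builds the statements to illustrate this; it is not how MLflow handles it. The function name is hypothetical, the column list is a simplified assumption, and the table name is taken from the error message.

```python
def default_experiment_statements(dialect, table):
    """Return the SQL statements for inserting a row with an explicit ID of 0."""
    # Simplified column list (assumption); a real table may need more columns.
    insert = "INSERT INTO {t} (experiment_id, name) VALUES (0, 'Default')".format(t=table)
    if dialect == "mssql":
        # SQL Server rejects explicit identity values unless IDENTITY_INSERT
        # is ON for the table, so wrap the insert and switch it off afterwards.
        return [
            "SET IDENTITY_INSERT {t} ON".format(t=table),
            insert,
            "SET IDENTITY_INSERT {t} OFF".format(t=table),
        ]
    # Other dialects get the plain insert in this sketch.
    return [insert]


for statement in default_experiment_statements("mssql", "experiment"):
    print(statement)
```

The statements would have to run on a single connection, since IDENTITY_INSERT is a per-session setting.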
As the logs below show, the migration completes when a postgres server is used.
mlflow_1 | 2019/09/24 09:03:55 INFO mlflow.store.sqlalchemy_store: Creating initial MLflow database tables...
mlflow_1 | 2019/09/24 09:03:55 INFO mlflow.store.db.utils: Updating database tables at postgresql://postgres:postgres@postgres:5432/postgres
mlflow_1 | INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
mlflow_1 | INFO [alembic.runtime.migration] Will assume transactional DDL.
mlflow_1 | INFO [alembic.runtime.migration] Running upgrade -> 451aebb31d03, add metric step
mlflow_1 | INFO [alembic.runtime.migration] Running upgrade 451aebb31d03 -> 90e64c465722, migrate user column to tags
mlflow_1 | INFO [alembic.runtime.migration] Running upgrade 90e64c465722 -> 181f10493468, allow nulls for metric values
mlflow_1 | INFO [alembic.runtime.migration] Running upgrade 181f10493468 -> df50e92ffc5e, Add Experiment Tags Table
mlflow_1 | INFO [alembic.runtime.migration] Running upgrade df50e92ffc5e -> 7ac759974ad8, Update run tags with larger limit
mlflow_1 | INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
mlflow_1 | INFO [alembic.runtime.migration] Will assume transactional DDL.
mlflow_1 | [2019-09-24 09:03:55 +0000] [15] [INFO] Starting gunicorn 19.9.0
mlflow_1 | [2019-09-24 09:03:55 +0000] [15] [INFO] Listening at: http://0.0.0.0:5000 (15)
mlflow_1 | [2019-09-24 09:03:55 +0000] [15] [INFO] Using worker: sync
mlflow_1 | [2019-09-24 09:03:55 +0000] [18] [INFO] Booting worker with pid: 18
mlflow_1 | [2019-09-24 09:03:56 +0000] [22] [INFO] Booting worker with pid: 22
mlflow_1 | [2019-09-24 09:03:56 +0000] [26] [INFO] Booting worker with pid: 26
mlflow_1 | [2019-09-24 09:03:56 +0000] [27] [INFO] Booting worker with pid: 27