Unable to update GeoJSON files in an application using APScheduler on Heroku
There are 2 GeoJSON files in my application. I have written a Python job using APScheduler to update the 2 GeoJSON files based on changes in the database. The job is configured to run once every 24 hours. Currently, I get the confirmation message that the new GeoJSON files have been created, but it crashes right after printing that log statement. I am not sure if we can write into the Heroku container — is that why the job crashes?
What alternatives do I have to make this work? One of the things I would try is writing the output of APScheduler to Amazon S3. Any suggestions in this regard would be a great help.
I have another job that updates a few fields in the database, and it works fine.
Also, this works fine locally. It replaces the existing GeoJSON based on changes in the database.
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.schedulers.background import BackgroundScheduler
import psycopg2
import os
from UnoCPI import sqlfiles
import Project_GEOJSON, Partner_GEOJSON

sched = BlockingScheduler()
sched1 = BackgroundScheduler()

# Initializing the sql files
sql = sqlfiles

# Schedules job_function to be run on the third Friday
# of June, July, August, November and December at 00:00, 01:00, 02:00 and 03:00
# sched.add_job(YOURFUNCTIONNAME, 'cron', month='6-8,11-12', day='3rd fri', hour='0-3')


def generateGEOJSON():
    os.system(Partner_GEOJSON)
    os.system(Project_GEOJSON)


@sched.scheduled_job('cron', day_of_week='mon-sun', hour=23)
# @sched.scheduled_job('cron', month='1,6,8', day='1', hour='0')
# @sched.scheduled_job('interval', minutes=5)
def scheduled_job():
    print('This job runs every day at 11pm.')
    # print('This job runs on the 1st day of January, June and August at 12 AM.')
    # print('This job runs every minute.')
    global connection
    global cursor
    try:
        # CAT STAGING
        connection = psycopg2.connect(user="heroku cred",
                                      password="postgres password from heroku",
                                      host="heroku host",
                                      port="5432",
                                      database="heroku db",
                                      sslmode="require")
        if connection:
            print("Postgres SQL Database successful connection")
            cursor = connection.cursor()
            # create a temp table with all projects' start and end dates
            cursor.execute(sql.start_and_end_dates_temp_table_sql)
            # fetch all community partners to be set to inactive
            cursor.execute(sql.comm_partners_to_be_set_to_inactive)
            inactive_comm_partners = cursor.fetchall()
            print("Here is the list of all projects to be set to inactive", "\n")
            for i in inactive_comm_partners:
                print(i)
            # fetch all community partners to be set to active
            cursor.execute(sql.comm_partners_to_be_set_to_active)
            active_comm_partners = cursor.fetchall()
            print("Here is the list of all projects to be set to active", "\n")
            for i in active_comm_partners:
                print(i)
            # UPDATE PROJECT STATUS TO ACTIVE
            cursor.execute(sql.update_project_to_active_sql)
            # UPDATE PROJECT STATUS TO COMPLETED
            cursor.execute(sql.update_project_to_inactive_sql)
            # UPDATE COMMUNITY PARTNER TIED TO INACTIVE PROJECTS ONLY TO FALSE (INACTIVE)
            cursor.execute(sql.update_comm_partner_to_inactive_sql)
            # UPDATE COMMUNITY PARTNER TIED TO BOTH ACTIVE
            # and/or INACTIVE, or JUST ACTIVE PROJECTS, TO TRUE (ACTIVE)
            cursor.execute(sql.update_comm_partner_to_active_sql)
            # drop the all_projects_start_and_end_dates temp table
            cursor.execute(sql.drop_temp_table_all_projects_start_and_end_dates_sql)
    except (Exception, psycopg2.Error) as error:
        print("Error while connecting to Postgres SQL", error)
    finally:
        # closing database connection
        if connection:
            connection.commit()
            cursor.close()
            connection.close()
            print("Postgres SQL connection is closed")


# register the GeoJSON job on the background scheduler and start it before the
# blocking scheduler, since BlockingScheduler.start() never returns
sched1.add_job(generateGEOJSON, 'cron', day_of_week='mon-sun', hour=20)
sched1.start()
sched.start()
I am not sure if we can write into the Heroku container
You can, but your changes will be lost regularly. Heroku's filesystem is dyno-local and ephemeral: every time your dyno restarts, changes made to the filesystem are lost. This happens frequently (at least once per day) and unpredictably.
One of the things that I would be trying is to write the output of APScheduler to Amazon S3
That is exactly what Heroku recommends doing with generated files and user uploads:
AWS Simple Storage Service, e.g. S3, is a “highly durable and available store” and can be used to reliably store application content such as media files, static assets and user uploads. It allows you to offload your entire storage infrastructure and offers better scalability, reliability, and speed than just storing files on the filesystem.
AWS S3, or similar storage services, are important when architecting applications for scale and are a perfect complement to Heroku's ephemeral filesystem.
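Here is a minimal sketch of that approach using boto3, assuming the job first writes the GeoJSON to the dyno's local filesystem as scratch space and that AWS credentials are supplied via config vars; the bucket name, object keys and file paths below are placeholders, not names from your code.

import boto3

S3_BUCKET = "my-geojson-bucket"  # hypothetical bucket name

s3 = boto3.client("s3")  # picks up AWS credentials from the environment

def upload_geojson(local_path, key):
    # The dyno's filesystem is fine as scratch space; only the copy
    # uploaded to S3 survives dyno restarts.
    s3.upload_file(local_path, S3_BUCKET, key,
                   ExtraArgs={"ContentType": "application/geo+json"})

def generateGEOJSON():
    # ... generate the files locally as before ...
    upload_geojson("partner.geojson", "geojson/partner.geojson")
    upload_geojson("project.geojson", "geojson/project.geojson")

Your application would then read the files back from S3 (for example via boto3's download_file or a presigned URL) instead of from the local filesystem.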