How to merge data from object A into object B in Python?

I'd like to know whether there is a programmatic way to merge data from object A into object B without setting each field by hand.

For example, I have the following pydantic model, which represents the result of an API call to a movie database:

class PersonScraperReply(BaseModel):
    """Represents a Person Scraper Reply"""

    scraper_name: str
    """Name of the scraper used to scrape this data"""

    local_person_id: int
    """Id of person in local database"""

    local_person_name: str
    """name of person in local database"""

    aliases: Optional[list[str]] = None
    """list of strings that represent the person's aliases obtained from scraper"""

    description: Optional[str] = None
    """String description of the person obtained from scraper"""

    date_of_birth: Optional[date] = None
    """Date of birth of the person obtained from scraper"""

    date_of_death: Optional[date] = None
    """Date the person passed away obtained from scraper"""

    gender: Optional[GenderEnum] = None
    """Gender of the person obtained from scraper"""

    homepage: Optional[str] = None
    """Person's official homepage obtained from scraper"""

    place_of_birth: Optional[str] = None
    """Location where the person was born obtained from scraper"""

    profile_image_url: Optional[str] = None
    """Url for person's profile image obtained from scraper"""

    additional_images: Optional[list[str]] = None
    """List of urls for additional images for the person obtained from scraper"""

    scrape_status: ScrapeStatus
    """status of scraping. Success or failure"""

I also have this SQLAlchemy class, which represents a person in my database:

class PersonInDatabase(Base):

    id: int
    """Person Id"""

    name: str
    """Person Name"""
    
    description: str = Column(String)
    """Description of the person"""

    gender: GenderEnum = Column(Enum(GenderEnum), nullable=False, default=GenderEnum.unspecified)
    """Person's gender, 0=unspecified, 1=male, 2=female, 3=non-binary"""

    tmdb_id: int = Column(Integer)
    """Tmdb id"""

    imdb_id: str = Column(String)
    """IMDB id, in the format of nn[alphanumeric id]"""

    place_of_birth: str = Column(String)
    """Place of person's birth"""

    # dates
    date_of_birth: DateTime = Column(DateTime)
    """Date the person was born"""

    date_of_death: DateTime = Column(DateTime)
    """Date the person passed away"""

    date_last_person_scrape: DateTime = Column(DateTime)
    """Date last time the person was scraped"""

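My goal is to merge the data received from the API call into the database object. By "merge" I mean assigning the fields that exist in both objects and leaving the rest untouched. Roughly like this: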

person_scrape_reply = PersonScraperReply()
person_in_db = PersonInDatabase()


for field_in_API_name, field_in_API_value in person_scrape_reply.fields: #for field in API response
    if field_in_API_name in person_in_db.field_names and field_in_API_value is not None: #if field exists in PersonInDatabase and the value is not none
        person_in_db.fields[field_in_API_name] = field_in_API_value #assign API response value to field in database class.

Is something like this possible?

Use the attrs package.

from attrs import define, asdict

@define
class PersonScraperReply(BaseModel):
    """Represents a Person Scraper Reply"""

    scraper_name: str
    """Name of the scraper used to scrape this data"""

    local_person_id: int
    """Id of person in local database"""

    local_person_name: str
    """name of person in local database"""

    aliases: Optional[list[str]] = None
    """list of strings that represent the person's aliases obtained from scraper"""

    description: Optional[str] = None
    """String description of the person obtained from scraper"""

    date_of_birth: Optional[date] = None
    """Date of birth of the person obtained from scraper"""

    date_of_death: Optional[date] = None
    """Date the person passed away obtained from scraper"""

    gender: Optional[GenderEnum] = None
    """Gender of the person obtained from scraper"""

    homepage: Optional[str] = None
    """Person's official homepage obtained from scraper"""

    place_of_birth: Optional[str] = None
    """Location where the person was born obtained from scraper"""

    profile_image_url: Optional[str] = None
    """Url for person's profile image obtained from scraper"""

    additional_images: Optional[list[str]] = None
    """List of urls for additional images for the person obtained from scraper"""

    scrape_status: ScrapeStatus
    """status of scraping. Success or failure"""

@define
class PersonInDatabase(Base):

    id: int
    """Person Id"""

    name: str
    """Person Name"""
    
    description: str = Column(String)
    """Description of the person"""

    gender: GenderEnum = Column(Enum(GenderEnum), nullable=False, default=GenderEnum.unspecified)
    """Person's gender, 0=unspecified, 1=male, 2=female, 3=non-binary"""

    tmdb_id: int = Column(Integer)
    """Tmdb id"""

    imdb_id: str = Column(String)
    """IMDB id, in the format of nn[alphanumeric id]"""

    place_of_birth: str = Column(String)
    """Place of person's birth"""

    # dates
    date_of_birth: DateTime = Column(DateTime)
    """Date the person was born"""

    date_of_death: DateTime = Column(DateTime)
    """Date the person passed away"""

    date_last_person_scrape: DateTime = Column(DateTime)
    """Date last time the person was scraped"""


person_scrape_reply = PersonScraperReply()
person_in_db = PersonInDatabase()
scrape_asdict = asdict(person_scrape_reply)
db_asdict = asdict(person_in_db)

for field_in_API_name, field_in_API_value in scrape_asdict.items(): #for field in API response
    if field_in_API_name in db_asdict.keys() and field_in_API_value is not None: #if field exists in PersonInDatabase and the value is not none
        setattr(person_in_db, field_in_API_name, field_in_API_value) #assign API response value to field in database class.
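
For what it's worth, here is a minimal sketch of the same copy loop on plain classes. It swaps attrs for the standard library's dataclasses so that it runs without any third-party package, and it does not involve Pydantic or SQLAlchemy at all. `Reply`, `Record` and `merge_shared` are hypothetical names made up for the illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Reply:
    """Hypothetical stand-in for the scraper reply."""
    scraper_name: str
    description: Optional[str] = None
    place_of_birth: Optional[str] = None


@dataclass
class Record:
    """Hypothetical stand-in for the database row."""
    name: str
    description: Optional[str] = None
    place_of_birth: Optional[str] = None


def merge_shared(source, target):
    """Copy every non-None field of source that target also declares."""
    target_names = {f.name for f in fields(target)}
    for f in fields(source):
        value = getattr(source, f.name)
        if f.name in target_names and value is not None:
            setattr(target, f.name, value)
    return target


reply = Reply(scraper_name="tmdb", description="Actor")
record = merge_shared(reply, Record(name="Jane", place_of_birth="Oslo"))
```

Fields that exist only on the source (`scraper_name` here) are skipped, and a `None` on the source never overwrites a value already on the target.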

The approach @Daniel suggested (using attrs) produced an error for me. I'm sure it works for regular classes, but it causes errors with both SQLAlchemy and Pydantic classes.

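After poking around with the debugger, I saw that both Pydantic and SQLAlchemy provide a way to access their field names as strings. In SQLAlchemy it is the key of each entry in inspect([SQLALCHEMY MAPPED CLASS]).attrs, while Pydantic simply has a built-in dict() method. It was a bit silly of me to forget that, given that one of pydantic's big selling points is that it can serialize data classes to JSON.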

Anyway, combining those two methods worked for me:

from sqlalchemy import inspect


def assign_empty(person_to_assign: Person, scrape_result: PersonScraperReply):
    blacklisted_fields = ["aliases"] #fields to ignore
    person_to_assign_fields = [x.key for x in inspect(person_to_assign).attrs] #SQLAlchemy fields
    scrape_result_fields = [x for x in scrape_result.dict().keys() if x not in blacklisted_fields] #Pydantic fields (.dict() is the Pydantic v1 name; v2 calls it model_dump())

    for field_name in scrape_result_fields:
        if field_name in person_to_assign_fields:
            person_to_assign_value = getattr(person_to_assign, field_name)
            scrape_result_value = getattr(scrape_result, field_name)

            if scrape_result_value is not None and person_to_assign_value is None:
                setattr(person_to_assign, field_name, scrape_result_value)
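
The fill-only-empty logic above can also be pulled out into a library-agnostic helper, which makes it easy to try without a database session. This is only a sketch under stated assumptions: `fill_missing` and its parameters are hypothetical names, and the objects below are plain stand-ins. With the real classes, the arguments would presumably come from scrape_result.dict() and [x.key for x in inspect(person_to_assign).attrs], as in the function above.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Mapping


def fill_missing(target: Any, source_values: Mapping[str, Any],
                 target_fields: Iterable[str], ignore: Iterable[str] = ()) -> list[str]:
    """Set target fields that are currently None from source_values.

    Returns the names of the fields that were actually changed.
    """
    allowed = set(target_fields) - set(ignore)
    changed = []
    for name, value in source_values.items():
        # Skip fields the target lacks, ignored fields, and empty source values.
        if name not in allowed or value is None:
            continue
        # Only fill in values the target does not have yet.
        if getattr(target, name, None) is None:
            setattr(target, name, value)
            changed.append(name)
    return changed


# Plain stand-ins for the database row and the scraper reply (assumptions).
person = SimpleNamespace(description=None, place_of_birth="Oslo", gender=None)
scraped = {"description": "Actor", "place_of_birth": "Bergen",
           "aliases": ["JD"], "gender": None}
changed = fill_missing(person, scraped, vars(person).keys(), ignore=["aliases"])
```

Returning the list of changed names is a design choice of this sketch; it makes it simple to log what a scrape actually updated or to skip a commit when nothing changed.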