How to avoid repeated code when converting files?

Suppose I have a class whose job is to convert data into different file types:

from dataclasses import dataclass
from typing import Union

import pandas as pd

@dataclass
class Converter:
    data: Union[str, pd.DataFrame]

    def to_pickle(self):
        """Write the data to a pickle file if it is a DataFrame.

        Returns:
            None after writing table_data.pkl, or the original string otherwise.
        """
        if isinstance(self.data, pd.DataFrame):
            return pd.to_pickle(self.data, "table_data.pkl")
        else:
            return self.data
        
    def to_csv(self):
        """Write the data to a CSV file if it is a DataFrame.

        Returns:
            None after writing table_data.csv, or the original string otherwise.
        """
        if isinstance(self.data, pd.DataFrame):
            # to_csv is a DataFrame method; pandas has no top-level pd.to_csv
            return self.data.to_csv(
                "./table_data.csv",
                index=False,
                encoding="utf_8_sig",)
        else:
            return self.data

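Both methods first check the type of the data. If the data is a DataFrame, each method writes it out in its own format (CSV or pickle). Otherwise, the string is returned unchanged.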

It seems the only difference between to_csv() and to_pickle() is the conversion call itself (writing CSV versus writing pickle). Is there a way to avoid repeating the code?

You can write a helper function or method that performs the type check and the actual conversion, so the other methods only need to call it:

from dataclasses import dataclass
from typing import Union

import pandas as pd

@dataclass
class Converter:
    data: Union[str, pd.DataFrame]

    def _converter_helper(self, conversion_type: str):
        if isinstance(self.data, pd.DataFrame):
            if conversion_type == 'pickle':
                return pd.to_pickle(self.data, "table_data.pkl")
            elif conversion_type == 'csv':
                # to_csv is a DataFrame method; pandas has no top-level pd.to_csv
                return self.data.to_csv(
                    "./table_data.csv",
                    index=False,
                    encoding="utf_8_sig",)
            else:
                raise ValueError('conversion_type must be either "csv" or "pickle"')
        else:
            return self.data

    def to_pickle(self):
        """Write the data to a pickle file if it is a DataFrame.

        Returns:
            None after writing table_data.pkl, or the original string otherwise.
        """
        return self._converter_helper('pickle')
        
    def to_csv(self):
        """Write the data to a CSV file if it is a DataFrame.

        Returns:
            None after writing table_data.csv, or the original string otherwise.
        """
        return self._converter_helper('csv')
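
A possible variation on the helper above, offered as a sketch and not as part of the original answer: rather than branching on a string such as 'csv' or 'pickle', the helper can take the writer as a callable. The type check then lives in one place and adding a format needs no new elif branch. The names `_convert` and `writer`, and the `path` parameters, are hypothetical choices made for this illustration.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Union

import pandas as pd


@dataclass
class Converter:
    data: Union[str, pd.DataFrame]

    def _convert(self, writer: Callable[[pd.DataFrame], None]):
        """Run writer on the table, or hand back the string unchanged."""
        if isinstance(self.data, pd.DataFrame):
            return writer(self.data)
        return self.data

    # The path parameters are an assumption added for this sketch;
    # the defaults match the file names used in the question.
    def to_pickle(self, path="table_data.pkl"):
        return self._convert(lambda df: df.to_pickle(path))

    def to_csv(self, path="./table_data.csv"):
        return self._convert(
            lambda df: df.to_csv(path, index=False, encoding="utf_8_sig")
        )


# Usage: write a small table into a temporary directory, then pass a string.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "table_data.csv"
    Converter(pd.DataFrame({"a": [1, 2]})).to_csv(csv_path)
    written = csv_path.exists()  # checked while the directory still exists

passthrough = Converter("no table").to_csv()  # a string is returned as-is
```

With this shape, an invalid format name cannot occur at runtime, so the ValueError branch from the string-based helper is no longer needed.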