Pandas DataFrame : ValueError: Length of values (13) does not match length of index (12)

Question

I am trying to run the following code in both PyCharm and Jupyter Notebook. The error does not occur in Jupyter, but it does occur in PyCharm. Can anyone help me resolve this?

Below is a sample of the news_collection.csv dataset:

created_at,text
5/13/2021 3:27:55 PM,"Srilanka team is well prepared for the worldCup 2021"
5/13/2021 3:27:55 PM,"They will be missing Lasith Malinga for sure"

Below is the code that raises the above error:

import pandas as pd

def aggregated():
    tweets = pd.read_csv(r'news_collection.csv')
    df = pd.DataFrame(tweets, columns=['created_at', 'text'])
    df['created_at'] = pd.to_datetime(df['created_at'])
    df['text'] = df['text'].apply(lambda x: str(x))
    pd.set_option('display.max_colwidth', 0)
    df = df.groupby(pd.Grouper(key='created_at', freq='1D')).agg(lambda x: ' '.join(set(x)))
    return df


   
if __name__ == '__main__':
    print(aggregated())
    aggregated().to_csv(r'preprocessed_tweets_aggregated.csv', index=True, header=True)

Answer 1

The error was caused by a difference in the version of the pandas package used by PyCharm. The same code ran fine in Jupyter with pandas 1.1.5, whereas PyCharm was using pandas 1.3.0, where it did not work.
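To confirm that the two environments really do use different versions, a small check like the one below can be run once in PyCharm and once in Jupyter. This is only a sketch: the helper names `installed_version` and `environment_report` are made up for illustration, and it assumes Python 3.8 or newer (for `importlib.metadata`).

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def environment_report():
    """Collect the interpreter path and the pandas version of the current environment."""
    return {
        "python": sys.executable,               # which interpreter is running this code
        "pandas": installed_version("pandas"),  # e.g. '1.1.5' or '1.3.0', None if missing
    }


if __name__ == "__main__":
    print(environment_report())
```

If pandas is already imported, `pd.__version__` gives the same version string. Comparing the two reports shows whether PyCharm and Jupyter point at different interpreters or just different package versions.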
To change the package version in PyCharm, you can follow the steps below (in my case, I had to downgrade pandas to 1.1.5).

Step 01 - Open your project in PyCharm and open the interpreter settings (typically File -> Settings -> Project -> Python Interpreter)

Step 02 - You will then be taken to the "Python Interpreter" tab -> select the package you want to change (pandas in my case) -> double-click the version -> tick the "Specify version" check box -> enter the version you want to upgrade or downgrade to -> select "Install Package"
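As an alternative to clicking through the settings dialog, the same pin can probably be applied with pip from the interpreter that PyCharm uses. The sketch below is an assumption rather than part of the original answer: `pip_install_command` and `install_pinned` are hypothetical helper names, and the script has to be run with the project's interpreter for the change to land in the right environment.

```python
import subprocess
import sys


def pip_install_command(package, version):
    """Build the pip command that pins a package to an exact version."""
    # Using sys.executable ensures pip runs for the interpreter executing this script.
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


def install_pinned(package, version):
    """Run the pip command; raises CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(package, version))


if __name__ == "__main__":
    # Show the command first; the actual install is left commented out on purpose.
    print(" ".join(pip_install_command("pandas", "1.1.5")))
    # install_pinned("pandas", "1.1.5")
```

After installing, restart the run configuration or the Python console so the downgraded version is the one that gets imported.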

python

dataframe

pycharm

pandas

jupyter-notebook