Creating time series plot from large csv file

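I have a large csv file that contains 3 months of temperature data. Column 1 is the month, column 2 is the day, column 3 is the hour, column 4 is the minute, and column 5 is the temperature. I am trying to plot all 3 months of temperature data with only the day/month listed on the x axis. This is what I have so far: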

filename ='TemperatureFile.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)

    month, day, hour, temperature = [], [], [], []
    for row in reader:
        mon=datetime.strptime(row[0],"%m")
        month.append(mon)
        date=datetime.strptime(row[1],"%d")
        day.append(date)
        hourx=datetime.strptime(row[2],"%H:%M")
        hour.append(hourx)
        temp = float(row[3])
        temperature.append(temp)

        datefinal=date.isoformat(month, day)


    #date.isoformat(months, days).isoformat()==    

    fig = plt.figure(dpi=128, figsize=(10,6))
    plt.plot(datefinal,temperature, c='red')

    plt.title("Temperatures", fontsize=20)
    plt.xlabel('Date', fontsize=16)
    plt.ylabel('Temperatures(F)', fontsize=16)

    plt.show()

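I can't figure out how to combine all of the month/day/hour/minute information so that all of the data gets plotted, and I can't figure out how to put just the month/day on the x axis.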

Well, if you really need to build the datetime from its parts, you can do it like this:

dtms = []
temps = []

for row in reader:
    # split row into its 4 parts
    month, day, hour, temperature = row

    # create base datetime object with the current date/time
    dt = datetime.now()

    # replace month and day in a single call; replacing them one at a time
    # can raise ValueError on an invalid intermediate date
    # (e.g. today is the 31st and the new month only has 30 days)
    dt = dt.replace(month=int(month), day=int(day))

    # get date object from datetime
    dt = dt.date()

    # get time
    tm = datetime.strptime(hour, "%H:%M")

    # get time object from datetime
    tm = tm.time()

    # combine date and time
    dtm = datetime.combine(dt, tm)

    # add datetime to list
    dtms.append(dtm)

    # add temperature to list
    temps.append(float(temperature))

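However, that doesn't seem necessary in your case. It is much easier to create the datetime object in one step than to create a base datetime object and then replace its parts: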

dtms = []
temps = []

for row in reader:
    # split row into its 4 parts
    month, day, hour, temperature = row

    # concatenate columns into string
    # assuming the current year is the correct year, you may need to add additional logic to get the correct year
    # strptime accepts month/day/hour values with or without zero-padding
    dtm = "{}-{}-{} {}".format(datetime.now().year, month, day, hour)

    # convert created string into datetime object
    dtm = datetime.strptime(dtm, "%Y-%m-%d %H:%M")

    # add datetime to list
    dtms.append(dtm)

    # add temperature to list
    temps.append(float(temperature))
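
The "additional logic to get the correct year" mentioned in the comments could look something like the sketch below. It assumes the rows are in chronological order and that the year of the first row is known; `read_rows`, `start_year`, and the sample data are made up for illustration and are not part of the original file. Whenever the month number drops (for example from 12 back to 1), the year is incremented.

```python
import csv
import io
from datetime import datetime


def read_rows(lines, start_year):
    """Yield (datetime, temperature) pairs from month, day, HH:MM, temperature rows.

    start_year is the year of the first data row (an assumption: the file
    itself does not store a year). The year is bumped whenever the month
    number drops, e.g. from 12 back to 1.
    """
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    year = start_year
    prev_month = None
    for month, day, hhmm, temperature in reader:
        month = int(month)
        if prev_month is not None and month < prev_month:
            year += 1  # the data rolled over into a new year
        prev_month = month
        hour, minute = (int(part) for part in hhmm.split(":"))
        yield datetime(year, month, int(day), hour, minute), float(temperature)


# Made-up sample data, standing in for TemperatureFile.csv
sample = io.StringIO(
    "month,day,time,temperature\n"
    "12,31,23:30,28.4\n"
    "1,1,00:00,27.9\n"
)
dtms, temps = zip(*read_rows(sample, start_year=2018))
print(dtms[0].isoformat(), temps[0])
print(dtms[1].isoformat(), temps[1])
```

With a real file you would pass the open file object instead of the `io.StringIO` sample. If the data never crosses a year boundary, the rollover check simply never fires.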

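It took me quite a while to get the code you provided to work.

After that I was able to create a date from the 3 columns you have.

Finally I focused on the formatting of the plot.

This was really 2 questions.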

import csv
import datetime

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

filename = 'TemperatureFile.csv'

with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)

    dates_list, temperature = [], []
    for row in reader:
        # Convert the first 3 columns of strings into a single date string.
        datestring = ("{0:}/{1:} {2:}".format(*row))
        # Then convert that into a complete date object
        date_obj = datetime.datetime.strptime(datestring, "%m/%d %H:%M")
        # Get the temperature as before
        temp = float(row[3])
        # Add new date and temp to appropriate lists
        temperature.append(temp)
        dates_list.append(date_obj)

# Now have complete lists of date and temp.
# Next focus on the formatting of the plot.
myFmt = mdates.DateFormatter('%m/%d') # set the format to print the dates in.
months = mdates.MonthLocator()  # every month
days = mdates.DayLocator()   # every Day

fig = plt.figure(dpi=128, figsize=(10,6))
ax = fig.add_axes([0.1, 0.2, 0.85, 0.75])
plt.plot(dates_list,temperature, c='red')

# format the ticks
ax.xaxis.set_major_formatter(myFmt)    
ax.xaxis.set_major_locator(months)
ax.xaxis.set_minor_locator(days)

plt.title("Temperatures", fontsize=20)
plt.xlabel('Date', fontsize=16)
plt.ylabel('Temperatures(F)', fontsize=16)

plt.show()

For more on how to format the plot, google or search SO for what you want to do plus the word matplotlib. That is where my example came from.
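
One thing to watch with the "%m/%d %H:%M" format used above: no year is parsed, so strptime falls back to a default year (1900), which is not a leap year, so a row for February 29 fails to parse, and recent Python versions may warn about year-less formats. A minimal sketch of one way around this is to put the year into the string before parsing. `YEAR`, `to_datetime`, and the sample rows are hypothetical names and values chosen for illustration.

```python
from datetime import datetime

YEAR = 2020  # hypothetical: the year the measurements were taken


def to_datetime(row, year=YEAR):
    """Build a datetime from a [month, day, "HH:MM", temperature] row."""
    datestring = "{}/{}/{} {}".format(year, row[0], row[1], row[2])
    return datetime.strptime(datestring, "%Y/%m/%d %H:%M")


# Made-up rows in the same layout as the CSV
rows = [["2", "29", "06:00", "41.5"], ["3", "1", "06:00", "43.0"]]
dates_list = [to_datetime(row) for row in rows]
print(dates_list[0].isoformat())  # 2020-02-29T06:00:00
```

The tick formatting is unaffected, since `DateFormatter('%m/%d')` only prints the month and day.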