Python

Question

I am working with a text file, "creatures.txt". A sample of its contents is shown below:

Special Type A Sunflower
2016-10-12 18:10:40
Asteraceae
Ingredient in Sunflower Oil
Brought to North America by Europeans
Requires fertile and moist soil
Full sun

Pine Tree
2018-12-15 13:30:45
Pinaceae
Evergreen
Tall and long-lived
Temperate climate

Tropical Sealion
2019-01-20 12:10:05
Otariidae
Found in zoos
Likes fish
Likes balls
Likes zookeepers

Big Honey Badger
2020-06-06 10:10:25
Mustelidae
Eats anything
King of the desert

Converting its contents into a list of dictionaries works fine.
Input

def TextFileToDictionary():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
        return dataset                          
TextFileToDictionary()

Output

[{'Name': 'Special Type A Sunflower',
  'Date': '2016-10-12 18:10:40',
  'Information': ['Asteraceae',
   'Ingredient in Sunflower Oil',
   'Brought to North America by Europeans',
   'Requires fertile and moist soil',
   'Full sun']},
 {'Name': 'Pine Tree',
  'Date': '2018-12-15 13:30:45',
  'Information': ['Pinaceae',
   'Evergreen',
   'Tall and long-lived',
   'Temperate climate']},
 {'Name': 'Tropical Sealion',
  'Date': '2019-01-20 12:10:05',
  'Information': ['Otariidae',
   'Found in zoos',
   'Likes fish',
   'Likes balls',
   'Likes zookeepers']},
 {'Name': 'Big Honey Badger',
  'Date': '2020-06-06 10:10:25',
  'Information': ['Mustelidae', 'Eats anything', 'King of the desert']}]

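As you can see, the output is a list of several dictionaries that have no names of their own.

Currently, I am trying to write functions that sort by
1) the value of the first key, in alphabetical order, and
2) the value of the second key, newest date first.

My progress so far: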

import itertools
import os

MyFilePath = os.getcwd() 
ActualFile = "creatures.txt"
FinalFilePath = os.path.join(MyFilePath, ActualFile) 

def TextFileToDictionaryName():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
            dataset.sort(key=lambda x: x[0]['Name'], reverse=False)
        return dataset                          
TextFileToDictionaryName()

def TextFileToDictionaryDate():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
            dataset.sort(key=lambda x: x[1]['Date'], reverse=True)
        return dataset                          
TextFileToDictionaryDate()

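However, I get the error "KeyError: 0", and I am not sure how to resolve it.
I am also unsure how to convert the dictionary output back into string format, matching the original contents of "creatures.txt".

Does anyone know how to fix the code?

Many thanks!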

Answer 1

Don't use dictionaries. Your data appears to have a model behind it.

Instead, create a proper Python class, a Creature:

class Creature:
    def __init__(self, name, date, habitat=None):
        self.name = name
        self.date = date
        self.habitat = habitat  # optional, so Creature(name, date) also works
        # etc.

As you read the input file, create a new Creature instance for each group of data. Add each Creature to some kind of collection:

creatures = list()
with open(FinalFilePath, "r") as textfile:  
    sections = textfile.read().split("\n\n")
    for section in sections:                 
        lines = section.split("\n")      
        creatures.append(Creature(lines[0], lines[1])) # add more params?

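Next, add some boilerplate methods (__lt__, etc.) to your Creature class so that it can be sorted.

Finally, just use sorted(creatures), and your collection of creatures will be sorted according to your __lt__ logic.

An implementation of __lt__ looks like this: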

def __lt__(self, other):
    # Compare by name first; fall back to date only when the names are equal.
    if self.name < other.name:
        return True
    elif self.name > other.name:
        return False
    elif self.date < other.date:
        return True
    elif self.date > other.date:
        return False
    else:
        # Name and date are the same, so neither object is "less than" the other.
        return False

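Alternatively, you can use creatures = SortedList() (for example, the SortedList class from the third-party sortedcontainers package). Each call to creatures.add(Creature(...)) then inserts the item at the correct position, and you don't need to call sorted(creatures) at the end.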
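To make the class-based approach concrete, here is a minimal sketch, not a definitive implementation. It assumes the date is kept as the "YYYY-MM-DD HH:MM:SS" string from the file, and it uses functools.total_ordering so that only __eq__ and __lt__ have to be written by hand. The information parameter and the _sort_key helper are hypothetical names introduced for illustration.

```python
from functools import total_ordering


@total_ordering
class Creature:
    """One record from creatures.txt (field names are assumptions)."""

    def __init__(self, name, date, information=None):
        self.name = name
        self.date = date  # kept as the "YYYY-MM-DD HH:MM:SS" string from the file
        self.information = information or []

    def _sort_key(self):
        # Hypothetical helper: order by name first, then by date.
        return (self.name, self.date)

    def __eq__(self, other):
        if not isinstance(other, Creature):
            return NotImplemented
        return self._sort_key() == other._sort_key()

    def __lt__(self, other):
        if not isinstance(other, Creature):
            return NotImplemented
        return self._sort_key() < other._sort_key()

    def __repr__(self):
        return f"Creature({self.name!r}, {self.date!r})"


creatures = [
    Creature("Pine Tree", "2018-12-15 13:30:45"),
    Creature("Big Honey Badger", "2020-06-06 10:10:25"),
    Creature("Tropical Sealion", "2019-01-20 12:10:05"),
]

# sorted() relies on __lt__, so the result is ordered by name, then date.
sorted_names = [creature.name for creature in sorted(creatures)]
print(sorted_names)
```

Comparing tuples gives the same name-then-date logic as the if/elif chain, and returning NotImplemented lets Python raise a TypeError if a Creature is compared with an unrelated type.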

Answer 2

You're almost there; just don't use x[0] or x[1]. Also, I don't think you should sort the list on every iteration of the loop; sort it only once, at the end.

def TextFileToDictionaryName():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
        dataset.sort(key=lambda x: x['Name'], reverse=False)
        return dataset                          
TextFileToDictionaryName()

def TextFileToDictionaryDate():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
        dataset.sort(key=lambda x: x['Date'], reverse=True)
        return dataset                          
TextFileToDictionaryDate()
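The two functions above repeat the same parsing code, and the question also asks how to turn the result back into the original text layout. The following is a rough sketch of one way to do both, assuming every block has at least a name line and a date line. The names SAMPLE_TEXT, parse_records and format_records are hypothetical, and an inline string stands in for the file so the example runs on its own. Sorting the date strings directly works here only because the timestamps are zero-padded and fixed-width, so alphabetical order matches chronological order.

```python
from operator import itemgetter

# Inline sample standing in for the contents of creatures.txt.
SAMPLE_TEXT = """Pine Tree
2018-12-15 13:30:45
Pinaceae
Evergreen

Big Honey Badger
2020-06-06 10:10:25
Mustelidae
Eats anything
"""


def parse_records(text):
    """Split blank-line-separated blocks into dictionaries."""
    records = []
    # strip() drops the trailing newline so no empty block is produced.
    for section in text.strip().split("\n\n"):
        lines = section.split("\n")
        records.append(
            {"Name": lines[0], "Date": lines[1], "Information": lines[2:]}
        )
    return records


def format_records(records):
    """Rebuild the text layout: one block per record, blank line between."""
    blocks = []
    for record in records:
        lines = [record["Name"], record["Date"], *record["Information"]]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


records = parse_records(SAMPLE_TEXT)

# Parse once, then sort as often as needed with different keys.
by_name = sorted(records, key=itemgetter("Name"))
by_newest = sorted(records, key=itemgetter("Date"), reverse=True)

print(format_records(by_name))
```

The string returned by format_records can be written to a file with an ordinary open(..., "w") and write() call.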

Answer 3

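You don't need to sort the list by name and then by date separately. You can do both at once.

Why you get the KeyError: the key parameter specifies a function that is called on each list element before comparisons are made. Each element x is a dictionary, not a list, so I suspect you wrote x[0] because you assumed x was a list, which it is not.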
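The error can be reproduced in isolation with a small sketch. The record below is a made-up stand-in for one element of the list; indexing it with 0 looks for a dictionary key equal to 0, which does not exist.

```python
record = {"Name": "Pine Tree", "Date": "2018-12-15 13:30:45"}

try:
    record[0]  # looks up the key 0, not "the first item"
except KeyError as exc:
    missing_key = exc.args[0]  # the key that was not found

# Indexing by the actual key works as expected.
name = record["Name"]

print("missing key:", missing_key)
print("name:", name)
```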

from datetime import datetime

sample = [
    {
        "Name": "Special Type A Sunflower",
        "Date": "2016-10-12 18:10:40",
        "Information": [...],
    },
    {
        "Name": "Pine Tree",
        "Date": "2018-12-15 13:30:45",
        "Information": [...],
    },
    {
        "Name": "Tropical Sealion",
        "Date": "2019-01-20 12:10:05",
        "Information": [...],
    },
    {
        "Name": "Big Honey Badger",
        "Date": "2020-06-06 10:10:25",
        "Information": [...],
    },
]

sample.sort(
    key=lambda x: (x["Name"], datetime.strptime(x["Date"], "%Y-%m-%d %H:%M:%S"))
)
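Note that with a tuple key the date only acts as a tie-breaker between records that share the same name, and both parts sort in ascending order. If names should be ascending but dates newest first, one possible approach (a sketch, not the only option) relies on Python's sort being stable: sort by the secondary key first, then by the primary key. The second "Pine Tree" record and the parse_date helper below are made up for illustration.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

sample = [
    {"Name": "Pine Tree", "Date": "2018-12-15 13:30:45"},
    {"Name": "Big Honey Badger", "Date": "2020-06-06 10:10:25"},
    # Made-up record with a repeated name, to show the tie-breaking.
    {"Name": "Pine Tree", "Date": "2021-03-01 08:00:00"},
]


def parse_date(record):
    """Hypothetical helper: turn the Date string into a datetime."""
    return datetime.strptime(record["Date"], DATE_FORMAT)


# Pass 1: secondary key, newest date first.
ordered = sorted(sample, key=parse_date, reverse=True)
# Pass 2: primary key, names A to Z. Stability keeps the date order
# from pass 1 among records with the same name.
ordered = sorted(ordered, key=lambda record: record["Name"])

for record in ordered:
    print(record["Name"], record["Date"])
```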

Python - How to sort dictionaries by first and second key values?
