Populating a dictionary from a text file?
So I have a text file that looks like this:
Monstera Deliciosa
2018-11-03 18:21:26
Tropical/sub-Tropical plant
Leathery leaves, mid to dark green
Moist and well-draining soil
Semi-shade/full shade light requirements
Water only when top 2 inches of soil is dry
Intolerant to root rot
Propagate by cuttings in water

Strelitzia Nicolai (White Birds of Paradise)
2018-11-05 10:12:15
Semi-shade, full sun
Dark green leathery leaves
Like lots of water,but soil cannot be water-logged
Like to be root bound in pot

Alocasia Macrorrhizos
2019-01-03 15:29:10
Tropical asia
Moist and well-draining soil
Leaves and stem toxic upon ingestion
Semi-shade, full sun
Like lots of water, less susceptible to root rot
Susceptible to spider mites
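I want to create a dictionary from this file, with each plant's name as the key and the rest of its information stored in a list as the value. So far I have managed to get each plant and its information as a single item in a list, but I'm not sure how to convert that into a dictionary.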
with open('myplants.txt', 'r') as f:
    contents = f.read()
    contents = contents.rstrip().split('\n\n')
    contents = [x.replace('\n', ', ') for x in contents]
print(contents)
Expected output:
plants = {'Monstera Deliciosa':['2018-11-03 18:21:26', 'Tropical/sub-Tropical plant', 'Leathery leaves, mid to dark green', 'Moist and well-draining soil', 'Semi-shade/full shade light requirements', 'Water only when top 2 inches of soil is dry', 'Intolerant to root rot', 'Propagate by cuttings in water'], 'Strelitzia Nicolai (White Birds of Paradise)': ... }
I'm open to suggestions for a better dictionary format.
Using a dict comprehension:
text = """Monstera Deliciosa
2018-11-03 18:21:26
Tropical/sub-Tropical plant
Leathery leaves, mid to dark green
Moist and well-draining soil
Semi-shade/full shade light requirements
Water only when top 2 inches of soil is dry
Intolerant to root rot
Propagate by cuttings in water

Strelitzia Nicolai (White Birds of Paradise)
2018-11-05 10:12:15
Semi-shade, full sun
Dark green leathery leaves
Like lots of water,but soil cannot be water-logged
Like to be root bound in pot

Alocasia Macrorrhizos
2019-01-03 15:29:10
Tropical asia
Moist and well-draining soil
Leaves and stem toxic upon ingestion
Semi-shade, full sun
Like lots of water, less susceptible to root rot
Susceptible to spider mites
"""
contents = text.rstrip().split('\n\n')
contents = [x.replace('\n', ', ') for x in contents]
plants = {c.split(',')[0]: c.split(',')[1:]
          for c in contents}
print(plants)
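This returns the following. Note that because each block is joined with ', ' and then split on ',', any line that itself contains a comma is broken into separate items, and most items keep a leading space: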
{'Monstera Deliciosa': [' 2018-11-03 18:21:26', ' Tropical/sub-Tropical plant', ' Leathery leaves', ' mid to dark green', ' Moist and well-draining soil', ' Semi-shade/full shade light requirements', ' Water only when top 2 inches of soil is dry', ' Intolerant to root rot', ' Propagate by cuttings in water'], 'Strelitzia Nicolai (White Birds of Paradise)': [' 2018-11-05 10:12:15', ' Semi-shade', ' full sun', ' Dark green leathery leaves', ' Like lots of water', 'but soil cannot be water-logged', ' Like to be root bound in pot'], 'Alocasia Macrorrhizos': [' 2019-01-03 15:29:10', ' Tropical asia', ' Moist and well-draining soil', ' Leaves and stem toxic upon ingestion', ' Semi-shade', ' full sun', ' Like lots of water', ' less susceptible to root rot', ' Susceptible to spider mites']}
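If the split-up lines and leading spaces in that output are unwanted, one possible variation is to split each block on newlines rather than commas. This is only a sketch: the shortened two-record `text` below stands in for the real file contents, and the `blocks` name is introduced here for illustration.

```python
# Two-record sample standing in for the contents of myplants.txt.
text = """Monstera Deliciosa
2018-11-03 18:21:26
Leathery leaves, mid to dark green

Strelitzia Nicolai (White Birds of Paradise)
2018-11-05 10:12:15
Semi-shade, full sun
"""

# Split into blank-line-separated blocks, then split each block into lines.
blocks = [block.splitlines() for block in text.strip().split('\n\n')]

# The first line of each block is the key; the remaining lines are the value.
plants = {name: details for name, *details in blocks}
print(plants)
```

Because no comma is ever used as a delimiter, a line such as 'Semi-shade, full sun' stays in one piece.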
Would something like this work?
plants = {}
with open('myplants.txt', 'r') as f:
    contents = f.read()
    contents = contents.rstrip().split('\n\n')
    for content in contents:
        parts = content.split('\n')  # Convert the lines to a list of strings
        plants[parts[0]] = parts[1:]  # First line becomes the key, the rest become the value
print(plants)
Here is a way to parse the data using a small state machine:
def parse(lines):
    items = []
    state = "name"
    for line in lines:
        line = line.rstrip("\n")
        if line == "":
            # A blank line ends the current record; the next line is a name.
            state = "name"
            continue
        if state == "name":
            item = {"name": line, "date": None, "data": []}
            items.append(item)
            state = "date"
            continue
        if state == "date":
            item["date"] = line
            state = "data"
            continue
        if state == "data":
            item["data"].append(line)
            continue
    return items
This results in:
[{'name': 'Monstera Deliciosa',
  'date': '2018-11-03 18:21:26',
  'data': ['Tropical/sub-Tropical plant',
           'Leathery leaves, mid to dark green',
           'Moist and well-draining soil',
           'Semi-shade/full shade light requirements',
           'Water only when top 2 inches of soil is dry',
           'Intolerant to root rot',
           'Propagate by cuttings in water']},
 {'name': 'Strelitzia Nicolai (White Birds of Paradise)',
  'date': '2018-11-05 10:12:15',
  'data': ['Semi-shade, full sun',
           'Dark green leathery leaves',
           'Like lots of water,but soil cannot be water-logged',
           'Like to be root bound in pot']},
 {'name': 'Alocasia Macrorrhizos',
  'date': '2019-01-03 15:29:10',
  'data': ['Tropical asia',
           'Moist and well-draining soil',
           'Leaves and stem toxic upon ingestion',
           'Semi-shade, full sun',
           'Like lots of water, less susceptible to root rot',
           'Susceptible to spider mites']}]
I think this alternative representation is easier to work with.
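As a rough illustration of working with that representation, the sketch below indexes the records by name and parses one date. The shortened `items` list is a hand-written stand-in for what `parse()` returns, the names `by_name`, `added` and `newest` are made up for this example, and the timestamp format is inferred from the sample file.

```python
from datetime import datetime

# Shortened sample in the same shape as the list returned by parse().
items = [
    {"name": "Monstera Deliciosa", "date": "2018-11-03 18:21:26",
     "data": ["Tropical/sub-Tropical plant", "Intolerant to root rot"]},
    {"name": "Alocasia Macrorrhizos", "date": "2019-01-03 15:29:10",
     "data": ["Tropical asia", "Susceptible to spider mites"]},
]

# Index the records by plant name for direct lookup.
by_name = {item["name"]: item for item in items}

# Assumed format: 'YYYY-MM-DD HH:MM:SS', as seen in the sample data.
added = datetime.strptime(by_name["Alocasia Macrorrhizos"]["date"],
                          "%Y-%m-%d %H:%M:%S")
print(added.year)

# Timestamps in this format sort chronologically as plain strings.
newest = max(items, key=lambda item: item["date"])
print(newest["name"])
```

Keeping the date in its own field means it can be parsed or sorted without first separating it from the descriptive lines.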
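Here is a scalable solution that avoids reading the whole file into memory.
It relies on the fact that a text file object can be used as an iterator that yields one line at a time: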
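import itertools as it

plants = {}
with open('myplants.txt') as f:
    while True:
        try:
            # The first line of each record is the plant name.
            p = next(f).rstrip()
            # Consume the following lines up to (and including) the blank separator.
            plants[p] = list(l.rstrip() for l in it.takewhile(lambda line: line != '\n', f))
        except StopIteration:
            break
print(plants)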
This produces:
{
'Monstera Deliciosa': ['2018-11-03 18:21:26', 'Tropical/sub-Tropical plant', 'Leathery leaves, mid to dark green', 'Moist and well-draining soil', 'Semi-shade/full shade light requirements', 'Water only when top 2 inches of soil is dry', 'Intolerant to root rot', 'Propagate by cuttings in water'],
'Strelitzia Nicolai (White Birds of Paradise)': ['2018-11-05 10:12:15', 'Semi-shade, full sun', 'Dark green leathery leaves', 'Like lots of water,but soil cannot be water-logged', 'Like to be root bound in pot'],
'Alocasia Macrorrhizos': ['2019-01-03 15:29:10', 'Tropical asia', 'Moist and well-draining soil', 'Leaves and stem toxic upon ingestion', 'Semi-shade, full sun', 'Like lots of water, less susceptible to root rot', 'Susceptible to spider mites']
}
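A different way to stream the file, shown here only as a sketch, is to let `itertools.groupby` separate runs of non-blank lines from the blank separators. This is a swapped-in technique rather than the approach above; `io.StringIO` stands in for the opened file so the example runs on its own, and the sample data is shortened.

```python
import io
from itertools import groupby

# io.StringIO stands in for open('myplants.txt') in this sketch.
f = io.StringIO(
    "Monstera Deliciosa\n"
    "2018-11-03 18:21:26\n"
    "Intolerant to root rot\n"
    "\n"
    "Alocasia Macrorrhizos\n"
    "2019-01-03 15:29:10\n"
    "Susceptible to spider mites\n"
)

plants = {}
# Strip each line lazily; blank lines become empty strings.
stripped = (line.strip() for line in f)
# bool('') is False, so groupby alternates between runs of text and runs of blanks.
for has_text, group in groupby(stripped, key=bool):
    if has_text:
        name, *details = group  # first line is the key, the rest are values
        plants[name] = details
print(plants)
```

Since consecutive blank lines fall into a single group, this variant should also cope with records separated by more than one empty line.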