Convert everything in a dictionary to lower case, then filter on it?
import pandas as pd
import nltk
import os
directory = os.listdir(r"C:\...")
x = []
num = 0
for i in directory:
    x.append(pd.read_fwf(r"C:\..." + i))
    x[num] = x[num].to_string()
    num += 1  # advance the index so each file's entry is converted
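So, once I have a dictionary x = [ ] populated by read_fwf for each file in my directory:
I want to know how to make every character lowercase. I can't get my head around the syntax and how it applies to a dictionary.
I want to define a filter that I can use to count a list of words in this newly defined dictionary, e.g.,
list = [bus, car, train, aeroplane, tram, ...]
Edit: quick unrelated question:
Is pd.read_fwf the best way to read .txt files? If not, what else could I use?
Any help is much appreciated. Thanks.
Edit 2: sample data and my desired output:
Sample: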
The Horncastle boar's head is an early seventh-century Anglo-Saxon
ornament depicting a boar that probably was once part of the crest of
a helmet. It was discovered in 2002 by a metal detectorist searching
in the town of Horncastle, Lincolnshire. It was reported as found
treasure and acquired for £15,000 by the City and County Museum, where
it is on permanent display.
Desired output - everything that was uppercase changed to lowercase:
the horncastle boar's head is an early seventh-century anglo-saxon
ornament depicting a boar that probably was once part of the crest of
a helmet. it was discovered in 2002 by a metal detectorist searching
in the town of horncastle, lincolnshire. it was reported as found
treasure and acquired for £15,000 by the city and county museum, where
it is on permanent display.
I think what you're looking for is a dictionary comprehension:
# Python 3
new_dict = {key: val.lower() for key, val in old_dict.items()}
# Python 2
new_dict = {key: val.lower() for key, val in old_dict.iteritems()}
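items()/iteritems() gives you the (key, value) pairs of the dictionary as tuples (e.g. [('somekey', 'SomeValue'), ('somekey2', 'SomeValue2')]).
The comprehension iterates over each pair, building a new dictionary as it goes. In the key: val.lower() part you can apply whatever operation you want when creating the new dictionary.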
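As a minimal sketch of the comprehension above, assuming the values are plain strings: the file names and text in old_dict are made up purely for illustration.

```python
# Hypothetical data: file names mapped to their raw text.
old_dict = {
    "a.txt": "The Horncastle Boar's Head",
    "b.txt": "City and County Museum",
}

# Build a new dictionary with the same keys and lowercased values.
# old_dict itself is left unchanged.
new_dict = {key: val.lower() for key, val in old_dict.items()}

print(new_dict)
```

Note that this only works if every value has a lower() method; a DataFrame returned by read_fwf would first need to be turned into a string.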
You don't need to use pandas or dictionaries at all. Just use Python's built-in open() function:
# Open a file in read mode with a context manager
with open(r'C:\path\to\you\file.txt', 'r') as file:
    # Read the file into a string
    text = file.read()
    # Use the string's lower() method to make everything lowercase
    text = text.lower()
    print(text)
    # Split text by whitespace into list of words
    word_list = text.split()
    # Get the number of elements in the list (the word count)
    word_count = len(word_list)
    print(word_count)
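The question also asks about counting a specific list of words. One possible way to do that on top of the lowercased text is sketched below, using collections.Counter from the standard library. The function name count_target_words and the sample sentence are hypothetical, and stripping punctuation with string.punctuation is an assumption about what should count as a match.

```python
import string
from collections import Counter

def count_target_words(text, targets):
    """Count how often each word in `targets` occurs in `text`, ignoring case."""
    wanted = {word.lower() for word in targets}
    # Lowercase, split on whitespace, and strip surrounding punctuation
    words = (word.strip(string.punctuation) for word in text.lower().split())
    counts = Counter(word for word in words if word in wanted)
    # Report every target, including those that never appear
    return {word: counts[word] for word in sorted(wanted)}

sample = "The Bus left. A car followed the bus, then a TRAIN passed."
print(count_target_words(sample, ["bus", "car", "train", "plane", "tram"]))
# {'bus': 2, 'car': 1, 'plane': 0, 'train': 1, 'tram': 0}
```

This matches whole words only, so "buses" would not be counted as "bus"; handling plurals would need stemming or a similar step.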
If you prefer, you can do it in the reverse order:
# Open a file in read mode with a context manager
with open(r'C:\path\to\you\file.txt', 'r') as file:
    # Read the file into a string
    text = file.read()
    # Split text by whitespace into list of words
    word_list = text.split()
    # Use list comprehension to create a new list with the lower() method applied to each word.
    lowercase_word_list = [word.lower() for word in word_list]
    print(lowercase_word_list)
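To apply the same idea to every file in a directory, as the question's loop does, a rough sketch with pathlib might look like the following. The function name read_lowercased is hypothetical, and both the "*.txt" pattern and the UTF-8 encoding are assumptions about the files involved.

```python
from pathlib import Path

def read_lowercased(directory):
    """Return {file name: lowercased text} for every .txt file in `directory`."""
    texts = {}
    for path in sorted(Path(directory).glob("*.txt")):
        # encoding="utf-8" is an assumption; change it to match your files
        texts[path.name] = path.read_text(encoding="utf-8").lower()
    return texts
```

You would call it with the directory from the question, for example read_lowercased(r"C:\..."), and get back one lowercased string per file.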
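Using a context manager for this is nice because it automatically closes the file for you when execution leaves the with block (that is, when the code is un-indented from the with statement). Otherwise you would have to call open() and file.close() yourself.
I think there are some other benefits to using context managers, but someone please correct me if I'm wrong.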
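For comparison, here is a small sketch of what the with statement saves you from writing. It creates a temporary file so that it can run anywhere; the temporary file and the variable names are only for illustration.

```python
import os
import tempfile

# Create a throwaway file so the sketch can run anywhere
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("Some Sample Text")

# Manual version: you must remember to close the file yourself
file = open(path, "r")
try:
    manual_text = file.read()
finally:
    file.close()  # runs even if read() raises an exception

# Context-manager version: the file is closed when the block ends
with open(path, "r") as file:
    managed_text = file.read()

print(manual_text == managed_text, file.closed)
os.remove(path)
```

One concrete benefit is visible in the manual version: without try/finally, an exception raised while reading would leave the file open, whereas the with statement closes it either way.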