Parsing a CSV file in Python Django
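I am trying to read data from an uploaded CSV file.
First I get each row, then I try to read the data from each row by splitting on commas. That works in the ideal case, but if a field contains "," (an address field, for example), the data is parsed in the wrong format.
I want a more reliable solution than val = v.split(',')
My code is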
    upload_file = request.FILES['upload_file']
    data = [row for row in csv.reader(upload_file.read().splitlines())]
    for v in data:
        # v is every row
        val = v.split(',')  # splitting the value of every row to get each record of every row
If you want to split a row and access the words in it, re.split would be a good choice:
    re.split(r'\W+', 'Words, words, words.')
    ['Words', 'words', 'words', '']
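If you read the file in with a simple read statement, for example:
    import re

    # Assumes the upload is UTF-8 text; read() returns bytes in Python 3.
    data = upload_file.read().decode('utf-8')
    rows = re.split('\n', data)  # split along newlines
    for index, row in enumerate(rows):
        # note: a plain split still cuts on commas inside quoted fields
        cells = row.split(',')
        # do whatever you want with the cells in each row
        # this also keeps an index if you want the cell's row index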
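Or you can use csv.reader from the csv module:
    import csv

    # csv.reader expects lines of text; in Python 3 a bytes upload
    # has to be decoded before it is passed in.
    file_reader = csv.reader(upload_file, delimiter=',')
    for row in file_reader:
        # do something with row data.
        print(row)
        # would print the rows like
        # ['I', 'like', 'to', 'ride', 'my', 'bicycle']
        # ['I', 'like', 'to', 'ride', 'my', 'bike']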
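In Python 3 an uploaded file yields bytes, while csv.reader works on text, so the upload usually has to be decoded first. The following is a minimal sketch, assuming the upload is UTF-8 encoded. `parse_uploaded_csv` is a hypothetical helper name, and `io.BytesIO` stands in for `request.FILES['upload_file']` so the example runs outside Django:

```python
import csv
import io


def parse_uploaded_csv(upload_file, encoding="utf-8"):
    """Return the rows of an uploaded CSV file as lists of strings.

    `upload_file` is any object whose read() method returns bytes,
    such as the value of request.FILES['upload_file'] in a Django view.
    The UTF-8 default is an assumption about the upload.
    """
    text = upload_file.read().decode(encoding)
    # newline="" leaves line endings untouched so the csv module can
    # handle newlines embedded in quoted fields itself.
    return list(csv.reader(io.StringIO(text, newline="")))


# Stand-in for an uploaded file; the address field contains a comma.
fake_upload = io.BytesIO(b'name,address\r\nAnn,"12 Main St, Springfield"\r\n')
rows = parse_uploaded_csv(fake_upload)
print(rows)
# [['name', 'address'], ['Ann', '12 Main St, Springfield']]
```

Because the whole decoded text goes to csv.reader, the quoted address stays a single field and no manual split(',') is needed.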
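CSV stands for comma-separated values. If a string in a CSV file needs to contain a comma, you usually enclose it in quotes. Otherwise the file cannot be parsed correctly:
    $ cat sample.csv
    "131, v17",foo
    bar,qux
    >>> import csv
    >>> with open('sample.csv', newline='') as f:
    ...     r = csv.reader(f)
    ...     for row in r:
    ...         print(row)
    ...
    ['131, v17', 'foo']
    ['bar', 'qux']
Of course, if you omit the quotes, the first line will be parsed as 3 fields.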
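The difference is easy to check directly. Here is a small sketch that compares str.split with csv.reader on the quoted line from the sample, and shows that csv.writer adds the quotes on its own when a field contains the delimiter. The variable names are illustrative only:

```python
import csv
import io

line = '"131, v17",foo'

# Naive splitting ignores the quotes and cuts the first field in two.
naive = line.split(",")

# csv.reader honours the quotes and keeps the field whole.
parsed = next(csv.reader([line]))

# csv.writer quotes a field automatically when it contains the delimiter.
buffer = io.StringIO()
csv.writer(buffer).writerow(["131, v17", "foo"])
written = buffer.getvalue()

print(naive)          # ['"131', ' v17"', 'foo']
print(parsed)         # ['131, v17', 'foo']
print(repr(written))  # '"131, v17",foo\r\n'
```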
You can use pandas. Here's an example based on this question:
    >>> from io import StringIO
    >>> import pandas
    >>> # I assume your input is something like this:
    >>> string = "a,b,c\n1,2,3\n4,5,6\n7,8,9\n"
    >>> stringIO = StringIO(string)
    >>> # read_csv replaces DataFrame.from_csv, which was removed in pandas 1.0
    >>> df = pandas.read_csv(stringIO, sep=',', index_col=False)
    >>> print(df)
       a  b  c
    0  1  2  3
    1  4  5  6
    2  7  8  9
    >>> print(list(df.columns))
    ['a', 'b', 'c']
    >>> # access elements (rows are indexed from 0, so the last row is 2)
    >>> print(df['a'][2])
    7
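If adding pandas as a dependency is not an option, csv.DictReader from the standard library gives similar access by column name. This is a minimal sketch using the same sample input, not a replacement for the pandas approach. The values stay strings, so the conversion to int is explicit:

```python
import csv
from io import StringIO

string = "a,b,c\n1,2,3\n4,5,6\n7,8,9\n"

# DictReader takes the first row as the header and yields one dict per row.
reader = csv.DictReader(StringIO(string))
records = list(reader)

print(reader.fieldnames)     # ['a', 'b', 'c']
print(int(records[2]["a"]))  # 7
```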