Python 2.7 code for replacing all values of a column in a csv file
I have data imported from SQL Server in a csv file with headers.
I want to write code in Python 2.7 that reads a csv file and rewrites it to a new csv file in which certain columns are masked with the string 'SECRET VALUE'.
Sample CSV input:
ID,Name,city,SSN,CreditCardNo
1,Joy,London,123-465-456,123456789087645
2,Sam,NewYork,765-465-457,98765434567345
3,Jhon,Paris,678-365-654,765654542345677
4,Eric,Delhi,456-888-999,123456789087645
Expected sample output:
ID,Name,city,SSN,CreditCardNo
1,Joy,London,SECRET VALUE,SECRET VALUE
2,Sam,NewYork,SECRET VALUE,SECRET VALUE
3,Jhon,Paris,SECRET VALUE,SECRET VALUE
4,Eric,Delhi,SECRET VALUE,SECRET VALUE
My attempt:
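import csv
# Raw strings keep backslashes literal; otherwise '\t' in '\test1.csv' becomes a tab.
# The csv module in Python 2 expects files opened in binary mode.
r = csv.reader(open(r'C:\Users\Praveen\workspace\sampleFiles\test1.csv', 'rb'))
lines = [l for l in r]
# Overwrites one cell only (row index 2, column index 2).
lines[2][2] = '30'
writer = csv.writer(open(r'C:\Users\Praveen\workspace\sampleFiles\test4.csv', 'wb'))
writer.writerows(lines)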
This only changes a single element; I want to mask entire columns.
I think you need read_csv first, then replace the values with iloc, and finally write to a file with DataFrame.to_csv:
import pandas as pd
from pandas.compat import StringIO
temp=u"""ID,Name,city,SSN,CreditCardNo
1,Joy,London,123-465-456,123456789087645
2,Sam,NewYork,765-465-457,98765434567345
3,Jhon,Paris,678-365-654,765654542345677
4,Eric,Delhi,456-888-999,123456789087645"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp))
print df
   ID  Name     city          SSN     CreditCardNo
0   1   Joy   London  123-465-456  123456789087645
1   2   Sam  NewYork  765-465-457   98765434567345
2   3  Jhon    Paris  678-365-654  765654542345677
3   4  Eric    Delhi  456-888-999  123456789087645
df.iloc[:, -2:] = 'SECRET VALUE'
print df
   ID  Name     city           SSN  CreditCardNo
0   1   Joy   London  SECRET VALUE  SECRET VALUE
1   2   Sam  NewYork  SECRET VALUE  SECRET VALUE
2   3  Jhon    Paris  SECRET VALUE  SECRET VALUE
3   4  Eric    Delhi  SECRET VALUE  SECRET VALUE
df.to_csv('file.csv', index=False)
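If pandas is not available, the same masking can probably be done with the csv module alone, building on the attempt in the question. The following is only a minimal sketch: the helper name masked_rows and its parameters are hypothetical (not from the question or from any library), and it assumes the first row is the header and that columns are selected by header name rather than by position. It is written so that it should run on both Python 2.7 and Python 3.

```python
import csv

def masked_rows(rows, columns, mask='SECRET VALUE'):
    """Yield the header unchanged, then each data row with `columns` masked.

    `masked_rows` is a hypothetical helper; `rows` is any iterable of
    row lists, such as a csv.reader.
    """
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return  # empty input: nothing to yield
    # Look the columns up by header name; raises ValueError if one is missing.
    indexes = [header.index(name) for name in columns]
    yield header
    for row in rows:
        row = list(row)  # copy so the caller's row is not modified
        for i in indexes:
            row[i] = mask
        yield row

# In-memory demo using part of the sample data from the question.
sample = [
    "ID,Name,city,SSN,CreditCardNo",
    "1,Joy,London,123-465-456,123456789087645",
    "2,Sam,NewYork,765-465-457,98765434567345",
]
for row in masked_rows(csv.reader(sample), ['SSN', 'CreditCardNo']):
    print(','.join(row))
```

To work with real files, the generator can be passed straight to csv.writer(...).writerows(...). In Python 2.7 both files should be opened in binary mode ('rb' for input, 'wb' for output); in Python 3 they should be opened in text mode with newline=''.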