Pulling French characters from a csv file and updating a featureclass with them (ArcGIS 10.4 & Python 2.7.10)
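I'll post my code below. I've been trying to create an automatic update script that builds a city address feature class in a file geodatabase, and the script runs as expected except for the last step: I'm trying to add a field that contains the street name and the street type (Street, Road, etc.), ordered according to a "before or after" flag held in a different field (1 for before the street name, 2 for after), but I seem to be running into a Unicode error. I'm relatively new to Python, so I'm not very familiar with the different Unicode settings. I've tried including: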
# -*- coding: utf-8 -*-
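as the first line of the code, but to no avail. The error I receive is as follows:
Traceback (most recent call last):
  File "P:\AFT\Sept2018\CADB_update_working\CADB_update\CADB_updateScript_test_complete_a_test_a.py", line 252, in <module>
    for row in cursor:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 3: unexpected end of data
I fully expect this to be some simple typo or syntax error that I haven't spotted, or perhaps some flaw in the csv I'm generating, which is created from a txt file. The section of code in question is below: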
# uses txt file to write a csv file
txtFile = inFolder + "\street_types.txt"
csvFile = inFolder + "\street_types.csv"
with open(txtFile, 'rb') as inFile, open(csvFile, 'wb') as outFile:
    in_txt = csv.reader(inFile, delimiter = '\t')
    out_csv = csv.writer(outFile)
    out_csv.writerows(in_txt)
print "CSV created"
# writes two columns of the csv into 2 lists and then combines them into a dictionary
with open(csvFile,'r') as csvFile:
    reader = csv.reader(csvFile, delimiter=',')
    next(reader, None)
    listA = [] # CD
    listB = [] # Display Before Flag
    listC = [] # NAME
    for row in reader:
        listA.append(row[0])
        listB.append(row[3])
        listC.append(row[1])
# print listA
# print listB
keys = map(int, listA)
values = map(int, listB)
dictionary = dict(zip(keys,values))
print dictionary
keysB = map(int, listA)
valuesB = listC
dictionaryB = dict(zip(keysB,valuesB))
print dictionaryB
# uses that dictionary to update the field just added to the feature class with the corresponding boolean value
print "Dictionaries made successfully"
update_fields = ["ST_TYPE_CD","ST_NAME_AFTER_TYPE"]
with arcpy.da.UpdateCursor(fc, update_fields) as cursor:
    for row in cursor:
        if row[0] in dictionary:
            row[1] = dictionary[row[0]]
        cursor.updateRow(row)
# Adding more fields to hold the concatenated ST_TYPE_CD and STREET_NAME based on ST_NAME_AFTER_TYPE
field_name = "ST_NAME_COMPLETE"
if arcpy.ListFields(fc, field_name):
    print "Field to be added already exists"
else:
    arcpy.AddField_management(fc, "ST_NAME_COMPLETE", "TEXT")
    print "Field added"
field_name = "ST_TYPE"
if arcpy.ListFields(fc, field_name):
    print "Field to be added already exists"
else:
    arcpy.AddField_management(fc, "ST_TYPE", "TEXT")
    print "Field added"
# Populating those added fields
fields = ["ST_TYPE_CD","ST_TYPE","STREET_NAME"]
where = "STREET_NAME IS NOT NULL"
with arcpy.da.UpdateCursor(fc, fields, where) as cursor:
    for row in cursor:
        if row[0] in dictionaryB:
            row[1] = dictionaryB[row[0]]
        cursor.updateRow(row)
print "One of two field transcriptions complete"
fields = ["ST_TYPE","STREET_NAME","ST_NAME_COMPLETE","ST_NAME_AFTER_TYPE"]
with arcpy.da.UpdateCursor(fc, fields, where) as cursor:
    for row in cursor:
        if row[3] == 1:
            row[2] = row[0] + " " + row[1]
        elif row[3] == 2:
            row[2] = row[1] + " " + row[0]
        cursor.updateRow(row)
print "Two of two field transcriptions complete"
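If the csv file might be the problem, I can try to upload it or show a snippet of the data it contains.
I've been stuck on this for a while, so any help or advice would be greatly appreciated.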
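As noted in the comments above, the fix was to change
listC.append(row[1])
to
listC.append(row[1].decode('cp1252'))
This converts the list values from byte strings (str in Python 2) to Unicode strings (e.g. u'string'), which lets the later steps interpret the accented characters correctly.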
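For anyone who wants to see why this works, here is a minimal sketch that seems to reproduce the same kind of error without ArcGIS. The sample value `Cit\xe9` ("Cité" as cp1252 bytes) and the `decode_row` helper are my own illustrations, not taken from the original csv or script, and it assumes the file really is cp1252-encoded. The byte 0xe9 is "é" in cp1252, but in UTF-8 it marks the start of a three-byte sequence, so a value that ends right after it produces "unexpected end of data".

```python
# Hypothetical sample value: a street type stored as cp1252 bytes (not from the real csv).
raw_name = b'Cit\xe9'

# Decoding as UTF-8 fails: 0xe9 opens a multi-byte sequence, but the data ends there.
try:
    raw_name.decode('utf-8')
    utf8_error = None
except UnicodeDecodeError as err:
    utf8_error = err

# Decoding with the encoding the file was actually written in (assumed cp1252) succeeds.
name = raw_name.decode('cp1252')


def decode_row(row, encoding='cp1252'):
    """Hypothetical helper: decode every byte-string cell of a csv row to Unicode."""
    return [cell.decode(encoding) if isinstance(cell, bytes) else cell
            for cell in row]


decoded = decode_row([b'12', b'Cit\xe9', b'x', b'1'])
print(utf8_error)
print(repr(name))
```

Decoding whole rows as they come out of csv.reader, rather than a single column, might be a safer habit if other columns can also contain accented characters, but that goes beyond what was needed here.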