Why do I receive an error when I parse through a dataframe but not when it is a single row?
New to Python. I am using the pygeocodio library in Python:
API_KEY = "myapikey"
from geocodio import GeocodioClient
client = GeocodioClient(API_KEY)
addresses = client.geocode("21236 Birchwood Loop, 99567, AK")
addresses.best_match.get("accuracy")
Out[61]: 1
addresses.best_match.get("accuracy_type")
Out[62]: 'rooftop'
However, if I want to iterate through a dataframe (example.csv):
import pandas as pd
customers = pd.read_csv("example.csv")
for row in customers.iterrows():
    addresses = client.geocode(row)
    addresses.best_match.get("accuracy")
I get an error:
File "C:\Users\jtharian\AppData\Local\Continuum\anaconda3\lib\site-packages\geocodio\client.py", line 58, in error_response
raise exceptions.GeocodioDataError(response.json()["error"])
GeocodioDataError: Could not geocode address. Postal code or city required.
A sample of example.csv:
21236 Birchwood Loop, 99567, AK
1731 Bragaw St, 99508, AK
300 E Fireweed Ln, 99503, AK
4360 Snider Dr, 99654, AK
1921 W Dimond Blvd 108, 99515, AK
2702 Peger Rd, 99709, AK
1651 College Rd, 99709, AK
898 Ballaine Rd, 99709, AK
23819 Immelman Circle, 99567, AK
9750 W Parks Hwy, 99652, AK
7205 Shorewood Dr, 99645, AK
Why am I getting this error message?
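I would use apply and specific exceptions etc., but since you are new, for now I would just focus on an approach that works and on handling errors. Definitely look into those topics once you are more comfortable with pandas and Python, though.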
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html
https://geek-university.com/python/catch-specific-exceptions/
errors, address_list, accuracy_list, accuracy_type_list = [], [], [], []
for index, row in customers.iterrows():
    try:
        # row.values[0] is the first cell of the row; this assumes that
        # cell holds the full address string
        addresses = client.geocode(row.values[0])
        accuracy = addresses.best_match.get("accuracy")
        accuracy_type = addresses.best_match.get("accuracy_type")
        address_list.append(addresses)
        accuracy_list.append(accuracy)
        accuracy_type_list.append(accuracy_type)
    except Exception as e:
        # keep the lists the same length as the dataframe
        address_list.append(None)
        accuracy_list.append(None)
        accuracy_type_list.append(None)
        errors.append(f"failure {e.args[0]} at index {index}")
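What am I doing? iterrows yields tuples of index and row, so I unpack them and geocode the first item of each row. If that works, I append the result to address_list, and the same goes for the accuracy values. When it fails, I append a message to the errors list that says where in the dataframe the error happened, i.e. the index. I still need a placeholder in address_list, so I just append None. So now I can do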
customers['addresses'] = address_list
customers['accuracy'] = accuracy_list
customers['accuracy_type'] = accuracy_type_list
and save my dataframe if I need to. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
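To make the tuple point concrete, here is a small sketch that runs without an API key. It reads two of the sample rows from an in-memory string instead of example.csv, and the options header=None and skipinitialspace=True are my own assumptions about how the file should be read, not something the question used. It shows that each item from iterrows is a tuple, which is not an address string, and that reading the file without header=None silently turns the first address into column names.

```python
import io
import pandas as pd

# Two sample rows in the same layout as example.csv (no header line).
csv_text = "21236 Birchwood Loop, 99567, AK\n1731 Bragaw St, 99508, AK\n"

# Without header=None, pandas uses the first address as the column names.
with_header = pd.read_csv(io.StringIO(csv_text))

# header=None keeps every line as data; skipinitialspace=True drops the
# blank that follows each comma (both options are assumptions here).
customers = pd.read_csv(io.StringIO(csv_text), header=None, skipinitialspace=True)

first_item = next(customers.iterrows())  # a tuple, not an address string
index, row = first_item                  # unpack into index label and Series

# Join the cells of the row into one string that a geocoder could accept.
address = ", ".join(str(value) for value in row)

print(type(first_item).__name__, len(with_header), len(customers))
print(address)
```

Passing a joined string like address to client.geocode, rather than the tuple itself, is the part that matters for the error in the question.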
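Looking at the API docs, you need a single string that represents the address, built from the individual address component columns, like this: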
location = client.geocode("1109 N Highland St, Arlington VA")
So, to get such a column in your df, you can map each column to strings, use simple string concatenation to produce one string per row, and then insert that into your df:
import pandas as pd
customers = pd.read_csv("example.csv", header=None)
customers['address_string'] = customers[0].map(str) + ' ' + customers[1].map(str) + customers[2].map(str)
Producing:
# >>> customers['address_string']
# 0 21236 Birchwood Loop 99567 AK
# 1 1731 Bragaw St 99508 AK
# 2 300 E Fireweed Ln 99503 AK
# 3 4360 Snider Dr 99654 AK
# 4 1921 W Dimond Blvd 108 99515 AK
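Note that the concatenation above only puts an explicit space between the first two columns; the gap before the state comes from the blank that follows each comma in the file. A possible variant that makes the separator explicit is sketched below. It assumes the file is read with skipinitialspace=True and uses an in-memory string with two of the sample rows so it runs on its own.

```python
import io
import pandas as pd

# Two sample rows in the same layout as example.csv.
csv_text = "21236 Birchwood Loop, 99567, AK\n9750 W Parks Hwy, 99652, AK\n"

# skipinitialspace=True strips the blank that follows each comma in the file.
customers = pd.read_csv(io.StringIO(csv_text), header=None, skipinitialspace=True)

# Cast every cell to str, then join the cells of each row with one space.
customers["address_string"] = (
    customers[[0, 1, 2]].astype(str).apply(" ".join, axis=1)
)

print(customers["address_string"].tolist())
```

The row-wise join does not depend on how many address columns there are, so it should keep working if a city column is added later.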
Then you can iterate over the values of the address string series and store the accuracy values in lists that can be inserted into your df:
geocoded_acuracy = []
geocoded_acuracy_type = []
for address in customers['address_string'].values:
    geocoded_address = client.geocode(address)
    accuracy = geocoded_address.best_match.get("accuracy")
    accuracy_type = geocoded_address.best_match.get("accuracy_type")
    geocoded_acuracy.append(accuracy)
    geocoded_acuracy_type.append(accuracy_type)
customers['accuracy'] = geocoded_acuracy
customers['accuracy_type'] = geocoded_acuracy_type
results = customers[['address_string', 'accuracy', 'accuracy_type']]
The results df will look like this:
# >>> results
# address_string accuracy accuracy_type
# 0 21236 Birchwood Loop 99567 AK 1.00 rooftop
# 1 1731 Bragaw St 99508 AK 1.00 rooftop
# 2 300 E Fireweed Ln 99503 AK 1.00 rooftop
# 3 4360 Snider Dr 99654 AK 1.00 range_interpolation
# 4 1921 W Dimond Blvd 108 99515 AK 1.00 rooftop
# 5 2702 Peger Rd 99709 AK 1.00 rooftop
# 6 1651 College Rd 99709 AK 1.00 rooftop
# 7 898 Ballaine Rd 99709 AK 1.00 rooftop
# 8 23819 Immelman Circle 99567 AK 1.00 rooftop
# 9 9750 W Parks Hwy 99652 AK 0.33 place
# 10 7205 Shorewood Dr 99645 AK 1.00 range_interpolation
Then write the results df to a .csv:
results.to_csv('results.csv')
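One detail worth knowing: by default to_csv also writes the row index as an unnamed first column. If you do not want that in results.csv, index=False leaves it out. A minimal sketch, writing to an in-memory buffer instead of a file and using one row taken from the results shown above:

```python
import io
import pandas as pd

# One row with the same columns as the results df.
results = pd.DataFrame({
    "address_string": ["21236 Birchwood Loop 99567 AK"],
    "accuracy": [1.0],
    "accuracy_type": ["rooftop"],
})

# Write to an in-memory buffer; index=False omits the row index column.
buffer = io.StringIO()
results.to_csv(buffer, index=False)
csv_output = buffer.getvalue()

print(csv_output)
```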
Putting all of this together yields the following code:
import pandas as pd
from geocodio import GeocodioClient
API_KEY = 'insert_your_key_here'
client = GeocodioClient(API_KEY)
customers = pd.read_csv("example.csv", header=None)
customers['address_string'] = customers[0].map(str) + ' ' + customers[1].map(str) + customers[2].map(str)
geocoded_acuracy = []
geocoded_acuracy_type = []
for address in customers['address_string'].values:
    geocoded_address = client.geocode(address)
    accuracy = geocoded_address.best_match.get("accuracy")
    accuracy_type = geocoded_address.best_match.get("accuracy_type")
    geocoded_acuracy.append(accuracy)
    geocoded_acuracy_type.append(accuracy_type)
customers['accuracy'] = geocoded_acuracy
customers['accuracy_type'] = geocoded_acuracy_type
results = customers[['address_string', 'accuracy', 'accuracy_type']]
results.to_csv('results.csv')
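The loop above stops at the first address that cannot be geocoded. As a rough sketch of the apply plus specific exception idea mentioned earlier in the thread, the version below catches only one exception type and records the message per row. To keep it runnable without an API key, fake_geocode and FakeGeocodeError are hypothetical stand-ins that I made up; with the real client you would call client.geocode(address).best_match and catch the GeocodioDataError seen in the traceback instead.

```python
import pandas as pd


class FakeGeocodeError(Exception):
    """Hypothetical stand-in for the GeocodioDataError in the traceback."""


def fake_geocode(address):
    """Hypothetical stand-in for client.geocode(address).best_match."""
    # Pretend any 5-digit token is a postal code; fail when none is present.
    has_postal_code = any(
        part.isdigit() and len(part) == 5 for part in address.split()
    )
    if not has_postal_code:
        raise FakeGeocodeError(
            "Could not geocode address. Postal code or city required."
        )
    return {"accuracy": 1, "accuracy_type": "rooftop"}


def accuracy_fields(address):
    """Return accuracy, accuracy_type and an error message for one address."""
    try:
        match = fake_geocode(address)
    except FakeGeocodeError as error:
        # Only the expected failure is caught; anything else still raises.
        return {"accuracy": None, "accuracy_type": None, "error": str(error)}
    return {
        "accuracy": match.get("accuracy"),
        "accuracy_type": match.get("accuracy_type"),
        "error": None,
    }


customers = pd.DataFrame(
    {"address_string": ["1731 Bragaw St 99508 AK", "1731 Bragaw St"]}
)

# apply calls accuracy_fields once per address and returns a Series of dicts.
records = customers["address_string"].apply(accuracy_fields).tolist()
fields = pd.DataFrame(records, index=customers.index)
results = pd.concat([customers, fields], axis=1)

print(results)
```

Rows that fail keep their place in results with empty accuracy fields and the error message next to them, so nothing goes out of alignment with the original dataframe.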