如何用Idealista API获取房产数据?
How to get real estate data with Idealista API?
我一直在尝试使用 Idealista (https://www.idealista.com/) 网站的 API 来检索房地产数据信息。
由于我对OAuth2 不熟悉,所以到目前为止我一直无法获得令牌。我刚刚获得了 api 密钥、秘密和一些关于如何挂载 http 请求的基本信息。
我希望能举一个例子(最好在 Python 中)来说明这个 API 的功能,或者一些关于处理 OAuth2 和 Python.[=12 的更通用的信息=]
经过几天的研究,我想出了一个基本的 python 代码来从 Idealista API 检索房地产数据。
def get_oauth_token():
http_obj = Http()
url = "https://api.idealista.com/oauth/token"
apikey= urllib.parse.quote_plus('Provided_API_key')
secret= urllib.parse.quote_plus('Provided_API_secret')
auth = base64.encode(apikey + ':' + secret)
body = {'grant_type':'client_credentials'}
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8','Authorization' : 'Basic ' + auth}
resp, content = http_obj.request(url,method='POST',headers=headers, body=urllib.parse.urlencode(body))
return content
此函数将 return 一个带有 OAuth2 令牌和以秒为单位的会话时间的 JSON。之后查询API,就这么简单:
def search_api(token):
http_obj = Http()
url = "http://api.idealista.com/3.5/es/search?center=40.42938099999995,-3.7097526269835726&country=es&maxItems=50&numPage=1&distance=452&propertyType=bedrooms&operation=rent"
headers = {'Authorization' : 'Bearer ' + token}
resp, content = http_obj.request(url,method='POST',headers=headers)
return content
这次我们会在内容变量中找到我们正在寻找的数据,同样是 JSON。
这是我的代码,正在改进 #3...这个 运行 好的!为了我!!!!
只输入您的 apikey 和您的密码(秘密)...
import pandas as pd
import json
import urllib
import requests as rq
import base64
def get_oauth_token():
url = "https://api.idealista.com/oauth/token"
apikey= 'your_api_key' #sent by idealista
secret= 'your_password' #sent by idealista
auth = base64.b64encode(apikey + ':' + secret)
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8' ,'Authorization' : 'Basic ' + auth}
params = urllib.urlencode({'grant_type':'client_credentials'})
content = rq.post(url,headers = headers, params=params)
bearer_token = json.loads(content.text)['access_token']
return bearer_token
def search_api(token, url):
headers = {'Content-Type': 'Content-Type: multipart/form-data;', 'Authorization' : 'Bearer ' + token}
content = rq.post(url, headers = headers)
result = json.loads(content.text)['access_token']
return result
country = 'es' #values: es, it, pt
locale = 'es' #values: es, it, pt, en, ca
language = 'es' #
max_items = '50'
operation = 'sale'
property_type = 'homes'
order = 'priceDown'
center = '40.4167,-3.70325'
distance = '60000'
sort = 'desc'
bankOffer = 'false'
df_tot = pd.DataFrame()
limit = 10
for i in range(1,limit):
url = ('https://api.idealista.com/3.5/'+country+'/search?operation='+operation+#"&locale="+locale+
'&maxItems='+max_items+
'&order='+order+
'¢er='+center+
'&distance='+distance+
'&propertyType='+property_type+
'&sort='+sort+
'&numPage=%s'+
'&language='+language) %(i)
a = search_api(get_oauth_token(), url)
df = pd.DataFrame.from_dict(a['elementList'])
df_tot = pd.concat([df_tot,df])
df_tot = df_tot.reset_index()
我发现了一些错误。至少,我不能运行。
我相信,我改进了这个:
import pandas as pd
import json
import urllib
import requests as rq
import base64
def get_oauth_token():
url = "https://api.idealista.com/oauth/token"
apikey= 'your_api_key' #sent by idealist
secret= 'your_password' #sent by idealista
apikey_secret = apikey + ':' + secret
auth = str(base64.b64encode(bytes(apikey_secret, 'utf-8')))[2:][:-1]
headers = {'Authorization' : 'Basic ' + auth,'Content-Type': 'application/x-www-form-
urlencoded;charset=UTF-8'}
params = urllib.parse.urlencode({'grant_type':'client_credentials'}) #,'scope':'read'
content = rq.post(url,headers = headers, params=params)
bearer_token = json.loads(content.text)['access_token']
return bearer_token
def search_api(token, URL):
headers = {'Content-Type': 'Content-Type: multipart/form-data;', 'Authorization' : 'Bearer ' + token}
content = rq.post(url, headers = headers)
result = json.loads(content.text)
return result
无法标记为正确答案,因为
auth = base64.encode(apikey + ':' + secret)
body = {'grant_type':'client_credentials'}
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8','Authorization' : 'Basic ' + auth}
会给你类型错误:
can only concatenate str (not "bytes") to str
由于 base64encode returns 字节类型对象...
确实 Idealista API 在文档方面非常有限,但我认为这是一种更好的方法,因为我不使用不必要的库(仅限本机):
#first request
message = API_KEY + ":" + SECRET
auth = "Basic " + base64.b64encode(message.encode("ascii")).decode("ascii")
headers_dic = {"Authorization" : auth,
"Content-Type" : "application/x-www-form-urlencoded;charset=UTF-8"}
params_dic = {"grant_type" : "client_credentials",
"scope" : "read"}
r = requests.post("https://api.idealista.com/oauth/token",
headers = headers_dic,
params = params_dic)
只有 python 个请求和 base64 模块才能完美运行...
问候
我一直在尝试使用 Idealista (https://www.idealista.com/) 网站的 API 来检索房地产数据信息。
由于我对OAuth2 不熟悉,所以到目前为止我一直无法获得令牌。我刚刚获得了 api 密钥、秘密和一些关于如何挂载 http 请求的基本信息。
我希望能举一个例子(最好在 Python 中)来说明这个 API 的功能,或者一些关于处理 OAuth2 和 Python.[=12 的更通用的信息=]
经过几天的研究,我想出了一个基本的 python 代码来从 Idealista API 检索房地产数据。
def get_oauth_token():
http_obj = Http()
url = "https://api.idealista.com/oauth/token"
apikey= urllib.parse.quote_plus('Provided_API_key')
secret= urllib.parse.quote_plus('Provided_API_secret')
auth = base64.encode(apikey + ':' + secret)
body = {'grant_type':'client_credentials'}
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8','Authorization' : 'Basic ' + auth}
resp, content = http_obj.request(url,method='POST',headers=headers, body=urllib.parse.urlencode(body))
return content
此函数将 return 一个带有 OAuth2 令牌和以秒为单位的会话时间的 JSON。之后查询API,就这么简单:
def search_api(token):
http_obj = Http()
url = "http://api.idealista.com/3.5/es/search?center=40.42938099999995,-3.7097526269835726&country=es&maxItems=50&numPage=1&distance=452&propertyType=bedrooms&operation=rent"
headers = {'Authorization' : 'Bearer ' + token}
resp, content = http_obj.request(url,method='POST',headers=headers)
return content
这次我们会在内容变量中找到我们正在寻找的数据,同样是 JSON。
这是我的代码,正在改进 #3...这个 运行 好的!为了我!!!! 只输入您的 apikey 和您的密码(秘密)...
import pandas as pd
import json
import urllib
import requests as rq
import base64
def get_oauth_token():
url = "https://api.idealista.com/oauth/token"
apikey= 'your_api_key' #sent by idealista
secret= 'your_password' #sent by idealista
auth = base64.b64encode(apikey + ':' + secret)
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8' ,'Authorization' : 'Basic ' + auth}
params = urllib.urlencode({'grant_type':'client_credentials'})
content = rq.post(url,headers = headers, params=params)
bearer_token = json.loads(content.text)['access_token']
return bearer_token
def search_api(token, url):
headers = {'Content-Type': 'Content-Type: multipart/form-data;', 'Authorization' : 'Bearer ' + token}
content = rq.post(url, headers = headers)
result = json.loads(content.text)['access_token']
return result
country = 'es' #values: es, it, pt
locale = 'es' #values: es, it, pt, en, ca
language = 'es' #
max_items = '50'
operation = 'sale'
property_type = 'homes'
order = 'priceDown'
center = '40.4167,-3.70325'
distance = '60000'
sort = 'desc'
bankOffer = 'false'
df_tot = pd.DataFrame()
limit = 10
for i in range(1,limit):
url = ('https://api.idealista.com/3.5/'+country+'/search?operation='+operation+#"&locale="+locale+
'&maxItems='+max_items+
'&order='+order+
'¢er='+center+
'&distance='+distance+
'&propertyType='+property_type+
'&sort='+sort+
'&numPage=%s'+
'&language='+language) %(i)
a = search_api(get_oauth_token(), url)
df = pd.DataFrame.from_dict(a['elementList'])
df_tot = pd.concat([df_tot,df])
df_tot = df_tot.reset_index()
我发现了一些错误。至少,我不能运行。 我相信,我改进了这个:
import pandas as pd
import json
import urllib
import requests as rq
import base64
def get_oauth_token():
url = "https://api.idealista.com/oauth/token"
apikey= 'your_api_key' #sent by idealist
secret= 'your_password' #sent by idealista
apikey_secret = apikey + ':' + secret
auth = str(base64.b64encode(bytes(apikey_secret, 'utf-8')))[2:][:-1]
headers = {'Authorization' : 'Basic ' + auth,'Content-Type': 'application/x-www-form-
urlencoded;charset=UTF-8'}
params = urllib.parse.urlencode({'grant_type':'client_credentials'}) #,'scope':'read'
content = rq.post(url,headers = headers, params=params)
bearer_token = json.loads(content.text)['access_token']
return bearer_token
def search_api(token, URL):
headers = {'Content-Type': 'Content-Type: multipart/form-data;', 'Authorization' : 'Bearer ' + token}
content = rq.post(url, headers = headers)
result = json.loads(content.text)
return result
无法标记为正确答案,因为
auth = base64.encode(apikey + ':' + secret)
body = {'grant_type':'client_credentials'}
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8','Authorization' : 'Basic ' + auth}
会给你类型错误:
can only concatenate str (not "bytes") to str
由于 base64encode returns 字节类型对象...
确实 Idealista API 在文档方面非常有限,但我认为这是一种更好的方法,因为我不使用不必要的库(仅限本机):
#first request
message = API_KEY + ":" + SECRET
auth = "Basic " + base64.b64encode(message.encode("ascii")).decode("ascii")
headers_dic = {"Authorization" : auth,
"Content-Type" : "application/x-www-form-urlencoded;charset=UTF-8"}
params_dic = {"grant_type" : "client_credentials",
"scope" : "read"}
r = requests.post("https://api.idealista.com/oauth/token",
headers = headers_dic,
params = params_dic)
只有 python 个请求和 base64 模块才能完美运行...
问候