Pandas: how to index a list from JSON and put it into a dataframe?
How do I index a list in a dataframe?
I have this code that gets data from JSON and inserts it into a dataframe.
This is what the JSON looks like:
{"text_sentiment": "positive", "text_probability": [0.33917574607174916, 0.26495590980799744, 0.3958683441202534]}
This is my code:
import pandas as pd
import requests
import streamlit as st

# input_df is defined elsewhere and holds one comment per row
input_c = pd.DataFrame(columns=['Comments','Result'])
for i in range(input_df.shape[0]):
    url = 'http://classify/?text='+str(input_df.iloc[i])
    r = requests.get(url)
    result = r.json()["text_sentiment"]
    proba = r.json()["text_probability"]
    input_c = input_c.append({'Comments': input_df.loc[i].to_string(index=False),'Result': result, 'Probability': proba}, ignore_index = True)
st.write(input_c)
This is the result:
Comments Result Probability
0 This movie is good in my eyes. neutral [0.26361889609129974, 0.4879752378104797, 0.2484058660982205]
1 This is a bad movie it's not good. negative [0.5210904912792065, 0.22073131008688818, 0.25817819863390534]
2 One of the best performance in this year. positive [0.14644707145500369, 0.3581522311734714, 0.49540069737152503]
3 The best movie i've ever seen. positive [0.1772046003747405, 0.026468108571479156, 0.7963272910537804]
4 The movie is meh. neutral [0.24349393167653663, 0.6820982528652574, 0.07440781545820596]
5 One of the best selling artist in the world. positive [0.07738688706903311, 0.3329095061233371, 0.5897036068076298]
The data in the Probability column is what I want to index.
For example: if the value in Result is "positive", I want index 2 of proba; if the result is "neutral", index 1.
Like this:
Comments Result Probability
0 This movie is good in my eyes. neutral [0.4879752378104797]
1 This is a bad movie it's not good. negative [0.5210904912792065]
2 One of the best performance in this year. positive [0.49540069737152503]
3 The best movie i've ever seen. positive [0.7963272910537804]
4 The movie is meh. neutral [0.6820982528652574]
5 One of the best selling artist in the world. positive [0.5897036068076298]
Is there any way to do this?
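In your code, Result already tells you whether the comment is negative, neutral, or positive, and in your sample output that label always belongs to the largest value in the probability list. So you only need to store the maximum of the probability list in the dataframe input_c.
That is, change 'Probability': proba to 'Probability': max(proba), so modify: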
input_c = input_c.append({'Comments': input_df.loc[i].to_string(index=False),'Result': result, 'Probability': proba}, ignore_index = True)
to
input_c = input_c.append({'Comments': input_df.loc[i].to_string(index=False),'Result': result, 'Probability': max(proba)}, ignore_index = True)
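If you would rather pick the value by label than rely on max, here is a minimal sketch. It assumes the list is ordered negative, neutral, positive, which is only inferred from the example rows in the question; LABEL_TO_INDEX and pick_probability are hypothetical names, not part of any library.

```python
# Assumed label order, inferred from the example rows in the question.
LABEL_TO_INDEX = {"negative": 0, "neutral": 1, "positive": 2}


def pick_probability(result, proba):
    """Return the probability at the position that matches the label."""
    return proba[LABEL_TO_INDEX[result]]


# The sample JSON from the question, already parsed into a dict.
response = {
    "text_sentiment": "positive",
    "text_probability": [0.33917574607174916, 0.26495590980799744, 0.3958683441202534],
}

picked = pick_probability(response["text_sentiment"], response["text_probability"])
print(picked)  # 0.3958683441202534
print(picked == max(response["text_probability"]))  # True
```

Inside the loop this would be 'Probability': pick_probability(result, proba). Wrap it in a list, [pick_probability(result, proba)], if you want the one-element lists shown in the desired output.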
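Then set the Probability column as the index of input_c. set_index returns a new dataframe rather than changing input_c in place, so assign the result back:
input_c = input_c.set_index('Probability')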
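Note that DataFrame.append was removed in pandas 2.0, so the loop above fails on current versions. A possible alternative, sketched here under assumptions, is to collect the rows in a plain list, build the dataframe once, and then reduce each probability list by label. The rows list stands in for the API responses (values copied from the question), and the negative, neutral, positive order in LABEL_TO_INDEX is assumed from the example output.

```python
import pandas as pd

# Assumed label order, inferred from the example rows in the question.
LABEL_TO_INDEX = {"negative": 0, "neutral": 1, "positive": 2}

# Stand-in for the API responses; values copied from the question.
rows = [
    {"Comments": "This movie is good in my eyes.", "Result": "neutral",
     "Probability": [0.26361889609129974, 0.4879752378104797, 0.2484058660982205]},
    {"Comments": "This is a bad movie it's not good.", "Result": "negative",
     "Probability": [0.5210904912792065, 0.22073131008688818, 0.25817819863390534]},
    {"Comments": "The best movie i've ever seen.", "Result": "positive",
     "Probability": [0.1772046003747405, 0.026468108571479156, 0.7963272910537804]},
]

# Build the frame once from a list of dicts instead of appending row by row.
input_c = pd.DataFrame(rows)

# Replace each list with the single entry that matches the row's label.
input_c["Probability"] = [
    proba[LABEL_TO_INDEX[result]]
    for result, proba in zip(input_c["Result"], input_c["Probability"])
]

print(input_c)
```

In the real loop you would append one dict per request to rows and call pd.DataFrame(rows) after the loop finishes.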