Get affiliation information for multiple authors in a loop
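I am currently working with pybliometrics (Scopus), and I want to write a loop that retrieves affiliation information for multiple authors.
Basically, this is the idea for my loop, shown here for a single author. How do I do this for many authors?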
from pybliometrics.scopus import AuthorRetrieval
import pandas as pd
import numpy as np
au = AuthorRetrieval(authorid)
au.affiliation_history
au.identifier
x = au.identifier
refs2 = au.affiliation_history
len(refs2)
refs2
df = pd.DataFrame(refs2)
df.columns
a_history = df
df['authorid'] = x
# Move the authorid column to the first position
cols = list(df)
cols.insert(0, cols.pop(cols.index('authorid')))
df = df.loc[:, cols]
df.to_excel("af_historyfinal.xlsx")
Turning your code into a loop over multiple author IDs? Nothing easier than that. Assume AUTHOR_IDS
equals 7004212771 and 57209617104:
import pandas as pd
from pybliometrics.scopus import AuthorRetrieval

def retrieve_affiliations(auth_id):
    """Author's affiliation history from Scopus as DataFrame."""
    au = AuthorRetrieval(auth_id)
    df = pd.DataFrame(au.affiliation_history)
    df["auth_id"] = au.identifier
    return df

AUTHOR_IDS = [7004212771, 57209617104]

# Option 1, for few IDs
df = pd.concat([retrieve_affiliations(a) for a in AUTHOR_IDS])

# Option 2, for many IDs
# (DataFrame.append was removed in pandas 2.0, so pd.concat is used here)
df = pd.DataFrame()
for a in AUTHOR_IDS:
    df = pd.concat([df, retrieve_affiliations(a)])

# Have author ID as first column
df = df.set_index("auth_id").reset_index()
df.to_excel("af_historyfinal.xlsx", index=False)
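With many IDs, a single failed lookup would abort the whole loop. The following is a minimal sketch of one way to guard against that, not part of the original answer: `collect_affiliations` and `fake_fetch` are hypothetical names, and `fake_fetch` is a stand-in that makes no Scopus request. In real use you would pass `retrieve_affiliations` as the `fetch` argument.

```python
import pandas as pd

def collect_affiliations(author_ids, fetch):
    """Concatenate the per-author frames returned by `fetch`, skipping IDs that fail."""
    frames, failed = [], []
    for auth_id in author_ids:
        try:
            frames.append(fetch(auth_id))
        except Exception as exc:  # broad on purpose: this is only a sketch
            failed.append((auth_id, str(exc)))
    # Concatenate once at the end instead of growing a DataFrame in the loop
    combined = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return combined, failed

def fake_fetch(auth_id):
    """Hypothetical stand-in for retrieve_affiliations(); makes no Scopus request."""
    if auth_id < 0:
        raise ValueError("unknown author ID")
    return pd.DataFrame({"auth_id": [auth_id, auth_id], "affiliation": ["A", "B"]})

combined, failed = collect_affiliations([1, -1, 2], fake_fetch)
```

Here `combined` holds the rows for IDs 1 and 2, and `failed` records the ID that raised an error together with its message, so those IDs can be retried later.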
If your IDs are, say, in a comma-separated file named "input.csv" with a column named "authors", then you start with
AUTHOR_IDS = pd.read_csv("input.csv")["authors"].unique()
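To see what that line produces without creating a file, here is a small sketch that reads the same layout from an in-memory string. The sample content of `csv_text` is an assumption made for illustration; it reuses the two IDs from above and repeats one of them.

```python
import io

import pandas as pd

# Assumed sample content: a header row named "authors" and one ID per line
csv_text = "authors\n7004212771\n57209617104\n7004212771\n"

# unique() drops the repeated ID and keeps the order of first appearance
author_ids = pd.read_csv(io.StringIO(csv_text))["authors"].unique().tolist()
```

The repeated ID appears only once in `author_ids`, so each author is requested from Scopus a single time.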