Why does the output return NaN when I translate one language (Malay) to English in Python?
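First, I use fastText for language detection. Then, based on the detected language, I want to translate one particular language (Malay in this case) to English. For the translation I use the Google Translate API from Python. The problem is that the rows in the other languages (English and Thai in this case) come back as NaN. I only want the Malay text to be translated, with the other rows left unchanged.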
from googletrans import Translator
import pandas as pd
import numpy as np
translator = Translator()
df = pd.DataFrame({
'text': ["how are you", "suka makan nasi ayam", "สวัสด","hai, apa khabar"],
'lang': ["english", "malay", "thai","malay"]
})
df
This creates the DataFrame df. My attempt at translating only the Malay rows:
df1=df[df["lang"] == "malay"]
df['text'] = df1['text'].apply(translator.translate, dest='en').apply(getattr, args=('text',))
df
Output produced: the text column is NaN for the English and Thai rows.
Expected output:
text | lang
-----------------------------------
how are you | english
like to eat chicken rice | malay
สวัสด | thai
Hello how are you | malay
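You need to use a boolean mask. df1 only contains the Malay rows, so assigning the translated result back to df['text'] aligns on the index and leaves every other row as NaN. Assign only to the masked rows instead: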
translate_en = lambda x: translator.translate(x, dest='en').text
m = df['lang'] == 'malay'
df.loc[m, 'text'] = df.loc[m, 'text'].apply(translate_en)
print(df)
# Output
                       text     lang
0               how are you  english
1  like to eat chicken rice    malay
2                     สวัสด     thai
3         Hello how are you    malay
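To see why the original attempt produced NaN, here is a minimal sketch that runs without network access. It assumes a hypothetical fake_translate function and lookup table standing in for translator.translate(x, dest='en').text; neither is part of googletrans. It compares whole-column assignment with the masked assignment.

```python
import pandas as pd

# Hypothetical stand-in for translator.translate(x, dest='en').text,
# so the example does not need the Google Translate service.
FAKE_TRANSLATIONS = {
    "suka makan nasi ayam": "like to eat chicken rice",
    "hai, apa khabar": "Hello how are you",
}

def fake_translate(text):
    return FAKE_TRANSLATIONS.get(text, text)

df = pd.DataFrame({
    "text": ["how are you", "suka makan nasi ayam", "สวัสด", "hai, apa khabar"],
    "lang": ["english", "malay", "thai", "malay"],
})

# The filtered Series only carries index labels 1 and 3.
translated = df.loc[df["lang"] == "malay", "text"].apply(fake_translate)

# Whole-column assignment aligns on the index: labels 0 and 2
# have no counterpart in `translated`, so they become NaN.
broken = df.copy()
broken["text"] = translated

# Masked assignment only touches the Malay rows.
fixed = df.copy()
m = fixed["lang"] == "malay"
fixed.loc[m, "text"] = fixed.loc[m, "text"].apply(fake_translate)

print(broken)
print(fixed)
```

The first printed frame has NaN in the English and Thai rows, while the second keeps their original text.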
The same with update:
df.update(df.loc[m, 'text'].apply(translate_en))
print(df)
# Output
                       text     lang
0               how are you  english
1  like to eat chicken rice    malay
2                     สวัสด     thai
3         Hello how are you    malay
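One detail about the update variant: DataFrame.update modifies the frame in place and returns None, and when given a Series it uses the Series name ('text' here) to pick the column. The sketch below illustrates this, again with a hypothetical fake_translate function in place of the real translator call.

```python
import pandas as pd

# Hypothetical stand-in for translator.translate(x, dest='en').text.
FAKE_TRANSLATIONS = {
    "suka makan nasi ayam": "like to eat chicken rice",
    "hai, apa khabar": "Hello how are you",
}

def fake_translate(text):
    return FAKE_TRANSLATIONS.get(text, text)

df = pd.DataFrame({
    "text": ["how are you", "suka makan nasi ayam", "สวัสด", "hai, apa khabar"],
    "lang": ["english", "malay", "thai", "malay"],
})

m = df["lang"] == "malay"
translated = df.loc[m, "text"].apply(fake_translate)

# The Series keeps the name 'text', which update() uses as the target column.
# Only the index labels present in `translated` (1 and 3) are overwritten.
result = df.update(translated)

print(result)  # update() works in place, so there is no return value
print(df)
```

Because update works in place, do not write df = df.update(...), which would replace df with None.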