How to solve the error "string indices must be integers" in text expansion
When I run this on my DataFrame, it raises the error "string indices must be integers". I don't know how to fix it.
This is the code I have tried so far:
import re

# Dictionary of English contractions
contractions_dict = {"ain't": "are not","'s":" is","aren't": "are not",
"can't": "cannot","can't've": "cannot have",
"'cause": "because","could've": "could have","couldn't": "could not",
"couldn't've": "could not have", "didn't": "did not","doesn't": "does not",
"don't": "do not","hadn't": "had not","hadn't've": "had not have",
"hasn't": "has not","haven't": "have not","he'd": "he would",
"he'd've": "he would have","he'll": "he will", "he'll've": "he will have",
"how'd": "how did","how'd'y": "how do you","how'll": "how will",
"I'd": "I would", "I'd've": "I would have","I'll": "I will",
"I'll've": "I will have","I'm": "I am","I've": "I have", "isn't": "is not",
"it'd": "it would","it'd've": "it would have","it'll": "it will",
"it'll've": "it will have", "let's": "let us","ma'am": "madam",
"mayn't": "may not","might've": "might have","mightn't": "might not",
"mightn't've": "might not have","must've": "must have","mustn't": "must not",
"mustn't've": "must not have", "needn't": "need not",
"needn't've": "need not have","o'clock": "of the clock","oughtn't": "ought not",
"oughtn't've": "ought not have","shan't": "shall not","sha'n't": "shall not",
"shan't've": "shall not have","she'd": "she would","she'd've": "she would have",
"she'll": "she will", "she'll've": "she will have","should've": "should have",
"shouldn't": "should not", "shouldn't've": "should not have","so've": "so have",
"that'd": "that would","that'd've": "that would have", "there'd": "there would",
"there'd've": "there would have", "they'd": "they would"}
# Regular expression for finding contractions
contractions_re = re.compile('(%s)' % '|'.join(contractions_dict.keys()))

# Function for expanding contractions
def expand_contractions(text, contractions_dict=contractions_dict):
    def replace(match):
        # Look up the matched contraction in the dictionary
        return contractions_dict[match.group(0)]
    # Substitute every match in the text with its expansion
    return contractions_re.sub(replace, text)

# Expanding contractions in the reviews
dataset['entitas bernama'] = dataset['entitas bernama'].apply(lambda x: expand_contractions(x))
This is the error:
dataset['entitas bernama']=dataset['entitas bernama'].apply(lambda x:expand_contractions(x))
error : string indices must be integers
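For context, Python raises this message when a str is indexed with something other than an integer or a slice. The following is a minimal sketch of that situation, not necessarily what happens in the code above; the variable name is made up for illustration.

```python
# Hypothetical illustration: a string ends up where a dict was expected
not_a_dict = "I'll"

try:
    not_a_dict["I'll"]  # indexing a str with a str key
except TypeError as exc:
    message = str(exc)

print(message)
```

If the same message appears in your own code, checking the type of the object being indexed (for example with type(...)) is usually a quick way to find the culprit.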
This is how you replace values in a pandas Series:
pandas.Series.replace(to_replace=contractions_dict, inplace=True, value=None, regex=True)
From https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.replace.html:
Dicts can be used to specify different replacement values for different existing values. For example, {'a': 'b', 'y': 'z'} replaces the value 'a' with 'b' and 'y' with 'z'. To use a dict in this way the value parameter should be None.
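The dict form described in that quote can be tried on a tiny Series. This is only a sketch: the sample sentences and the two-entry mapping are invented for illustration, and the value argument is simply left out rather than passed explicitly.

```python
import pandas as pd

# Invented sample data and a small subset of the contractions mapping
s = pd.Series(["I'll be there", "they can't come", "no contractions here"])
mapping = {"I'll": "I will", "can't": "cannot"}

# With regex=True the dict keys are treated as patterns,
# so they are replaced wherever they occur inside each string
expanded = s.replace(to_replace=mapping, regex=True)
print(expanded.tolist())
```

Without regex=True, the keys would only match cells whose entire value equals the key, so substrings inside a sentence would be left alone.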
Example
contractions_dict = {...} # redacted
In []: twt = pd.read_csv('twitter4000.csv')
Out[]:
tweets sentiment
0 is bored and wants to watch a movie any sugge... 0
1. back in miami. waiting to unboard ship 0
2 @misskpey awwww dnt dis brng bak memoriessss, ... 0
3 ughhh i am so tired blahhhhhhhhh 0
4 @mandagoforth me bad! It's funny though. Zacha... 0
... ... ...
3995 i just graduated 1
3996 Templating works; it all has to be done 1
3997 mommy just brought me starbucks 1
3998 @omarepps watching you on a House re-run...lov... 1
3999 Thanks for trying to make me smile I'll make y... 1
4000 rows × 2 columns
# At a glance, in the 5 head and 5 tail rows shown, the last row clearly contains a contraction
In []: # check which rows have contractions
twt[twt.tweets.str.contains('|'.join(contractions_dict.keys()), regex=True)]
Out[]:
tweets sentiment
2 @misskpey awwww dnt dis brng bak memoriessss, ... 0
4 @mandagoforth me bad! It's funny though. Zacha... 0
5 brr, i'm so cold. at the moment doing my assig... 0
6 @kevinmarquis haha yep but i really need to sl... 0
7 eating some ice-cream while I try to see @pete... 0
... ... ...
3961 gonna cousin's b.day. 1
3968 @kat_n Got to agree it's a risk to put her thr... 1
3983 About to watch the Lakers win game duece. I'm ... 1
3986 @countroshculla yeah..needed to get up early..... 1
3999 Thanks for trying to make me smile I'll make y... 1
937 rows × 2 columns
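The same check can be sketched on a few invented rows, which makes it easier to see what the boolean mask looks like. The Series contents and the short mapping here are assumptions for illustration, not the twitter4000.csv data.

```python
import pandas as pd

# Invented rows standing in for the tweets column
contractions = {"I'll": "I will", "it's": "it is"}
tweets = pd.Series(["i just graduated", "I'll make you smile", "it's a risk"])

# Join the keys into one alternation pattern and test each row against it
mask = tweets.str.contains("|".join(contractions.keys()), regex=True)
matching = tweets[mask]
print(matching)
```

Note that the match is case-sensitive, so a key such as "I'm" would not match "i'm" unless the pattern or the text is normalised first.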
In []: twt.tail(5).tweets.replace(to_replace=contractions_dict, value=None, regex=True)
Out[]:
3995 i just graduated
3996 Templating works; it all has to be done
3997 mommy just brought me starbucks
3998 @omarepps watching you on a House re-run...lov...
3999 Thanks for trying to make me smile I will make...
Name: tweets, dtype: object
Pass inplace=True to Series.replace to avoid assigning the result back to the df, i.e. twt.tweets = twt.tweets.replace(...)
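If you would rather keep the function-based approach from the question, a sketch along these lines may work. It is an assumption-laden variant, not the original author's code: the sample Series, the three-entry mapping, the longest-first key ordering, and the non-string guard are all additions for illustration.

```python
import re
import pandas as pd

# Small illustrative subset of the contractions dictionary
contractions = {"can't": "cannot", "can't've": "cannot have", "I'll": "I will"}

# Try longer keys first so "can't've" is not cut short by "can't"
keys = sorted(contractions, key=len, reverse=True)
pattern = re.compile("|".join(re.escape(k) for k in keys))

def expand_contractions(text):
    # Leave non-string cells (e.g. missing values) untouched
    if not isinstance(text, str):
        return text
    return pattern.sub(lambda m: contractions[m.group(0)], text)

reviews = pd.Series(["I'll go", "I can't've known", None])
expanded = reviews.apply(expand_contractions)
print(expanded.tolist())
```

Sorting the keys by length matters because a regex alternation takes the first alternative that matches at a given position; with "can't" listed before "can't've", the longer contraction would only be partly expanded.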