Pandas rsplit with if contains
In plain Python I can separate the values by checking whether the string contains a / or a \ and then calling rsplit(sep, 1), but I can't get the same approach to work in pandas.
How can I get this result using pandas?
"PATH_IN","PATH_OUT"
"C:\USER\ARON\TESTE.TXT","C:\OUT\TESTE.TXT"
"SOUP.TXT","SOUP.TXT"
"/OPT/IN/TESTE.TXT","TESTE.TXT"
Desired result:
"PATH_IN","NAME_IN","PATH_OUT","NAME_OUT"
"C:\USER\ARON","TESTE.TXT","C:\OUT","TESTE.TXT"
"","SOUP.TXT","","SOUP.TXT"
"/OPT/IN/","TESTE.TXT","","TESTE.TXT"
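Try this (the backslash has to be escaped inside the character class, otherwise the regex does not compile):
df['NAME_IN'] = df['PATH_IN'].str.split(r'[/\\]').str[-1]
df['NAME_OUT'] = df['PATH_OUT'].str.split(r'[/\\]').str[-1]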
Output:
>>> df
                  PATH_IN          PATH_OUT    NAME_IN   NAME_OUT
0  C:\USER\ARON\TESTE.TXT  C:\OUT\TESTE.TXT  TESTE.TXT  TESTE.TXT
1                SOUP.TXT          SOUP.TXT   SOUP.TXT   SOUP.TXT
2       /OPT/IN/TESTE.TXT         TESTE.TXT  TESTE.TXT  TESTE.TXT
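The split above only produces the file names, while the question also asks for the directory part. A minimal sketch of one way to get both with str.extract follows; the sample frame is rebuilt from the question's data, and the pattern and loop are my own illustration rather than part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "PATH_IN": [r"C:\USER\ARON\TESTE.TXT", "SOUP.TXT", "/OPT/IN/TESTE.TXT"],
    "PATH_OUT": [r"C:\OUT\TESTE.TXT", "SOUP.TXT", "TESTE.TXT"],
})

# Group 1: optional directory, greedy up to the last / or \.
# Group 2: everything after that last separator (the file name).
pattern = r"^(?:(.*)[/\\])?([^/\\]*)$"

for path_col, name_col in [("PATH_IN", "NAME_IN"), ("PATH_OUT", "NAME_OUT")]:
    parts = df[path_col].str.extract(pattern)
    df[name_col] = parts[1]
    # Rows without any separator have no directory, so fill with "".
    df[path_col] = parts[0].fillna("")

# Reorder the columns to match the layout requested in the question.
df = df[["PATH_IN", "NAME_IN", "PATH_OUT", "NAME_OUT"]]
print(df)
```

Note that the directory comes back without its trailing separator ("/OPT/IN" rather than "/OPT/IN/").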
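Another approach is to avoid regular expressions and use os.path.split, since it can handle more than one path separator. Note that it only treats both / and \ as separators on Windows; on other platforms it splits on / only.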
import os
import pandas as pd
df[["PATH_IN", "NAME_IN"]] = df["PATH_IN"].apply(lambda x: pd.Series(os.path.split(x)))
df[["PATH_OUT", "NAME_OUT"]] = df["PATH_OUT"].apply(lambda x: pd.Series(os.path.split(x)))
Output:
        PATH_IN PATH_OUT    NAME_IN   NAME_OUT
0  C:\USER\ARON   C:\OUT  TESTE.TXT  TESTE.TXT
1                          SOUP.TXT   SOUP.TXT
2       /OPT/IN           TESTE.TXT  TESTE.TXT
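If the code may run on Linux or macOS, where os.path.split does not split on backslashes, a possible variant is to call ntpath.split directly. ntpath is the standard-library module behind os.path on Windows and can be imported on any platform. This is only a sketch under that assumption; the sample frame is rebuilt from the question's data.

```python
import ntpath

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "PATH_IN": [r"C:\USER\ARON\TESTE.TXT", "SOUP.TXT", "/OPT/IN/TESTE.TXT"],
    "PATH_OUT": [r"C:\OUT\TESTE.TXT", "SOUP.TXT", "TESTE.TXT"],
})

for path_col, name_col in [("PATH_IN", "NAME_IN"), ("PATH_OUT", "NAME_OUT")]:
    # ntpath.split returns a (head, tail) tuple and accepts both / and \
    # as separators regardless of the operating system.
    pairs = [ntpath.split(p) for p in df[path_col]]
    df[name_col] = [tail for _, tail in pairs]
    df[path_col] = [head for head, _ in pairs]

print(df)
```

A plain list comprehension is used here instead of apply with pd.Series, which avoids building one Series per row.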