Extract strings from a patterned string list and convert them into a DataFrame in Python

I have a list containing patterned strings like this:

['"Bandcamp" (2014)\t\t\t\t\ttv-mini-series',
'"ByMySide" (2012){The Happening (#1.3)}\t\t\t\t\ttwitter-hashtag-in-title',
'"Elmira" (2014)\t\t\t\t\telmira-new-york',
'"Elmira" (2014){The Happening (#1.3)}\t\t\tfriend',
...]

Now I am trying to extract substrings from each line and build a DataFrame from them, for example:

Movie    Year Keyword
Bandcamp 2014 tv-mini-series
ByMySide 2012 twitter-hashtag-in-title
Elmira   2014 elmira-new-york
Elmira   2014 friend
...

Here you go:

>>> a
['"Bandcamp" (2014)\t\t\t\t\ttv-mini-series', '"ByMySide" (2012){The Happening (#1.3)}\t\t\t\t\ttwitter-hashtag-in-title', '"Elmira" (2014)\t\t\t\t\telmira-new-york', '"Elmira" (2014){The Happening (#1.3)}\t\t\tfriend']
>>> import re
>>> data = []
>>> for x in a:
...     # raw string avoids invalid-escape warnings; groups: title, year, keyword
...     data.append(re.findall(r'"(\w+)" \((\d+)\).*\t{2,5}(\S+)', x)[0])
... 
>>> import pandas as pd
>>> pd.DataFrame(data, columns=['Movie', 'Year', 'Keyword'])
      Movie  Year                   Keyword
0  Bandcamp  2014            tv-mini-series
1  ByMySide  2012  twitter-hashtag-in-title
2    Elmira  2014           elmira-new-york
3    Elmira  2014                    friend
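
Note that the pattern above assumes one-word titles (`\w+`) and raises an `IndexError` on any line that does not match. A slightly more defensive variant might look like the sketch below. The names `LINE_RE` and `parse_lines` are made up for illustration, and the pattern is only a guess at the line format based on the four sample lines (quoted title, four-digit year in parentheses, optional episode info, one or more tabs, keyword); real data may need adjustments.

```python
import re

# Assumed line format (inferred from the sample lines only):
#   "Title" (YYYY){optional episode info}<one or more tabs>keyword
LINE_RE = re.compile(
    r'"(?P<movie>[^"]+)"\s+\((?P<year>\d{4})\)[^\t]*\t+(?P<keyword>\S+)'
)

def parse_lines(lines):
    """Return (movie, year, keyword) tuples, skipping lines that do not match."""
    rows = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # ignore malformed lines instead of raising
        rows.append((m.group("movie"), int(m.group("year")), m.group("keyword")))
    return rows

a = [
    '"Bandcamp" (2014)\t\t\t\t\ttv-mini-series',
    '"ByMySide" (2012){The Happening (#1.3)}\t\t\t\t\ttwitter-hashtag-in-title',
    '"Elmira" (2014)\t\t\t\t\telmira-new-york',
    '"Elmira" (2014){The Happening (#1.3)}\t\t\tfriend',
]
rows = parse_lines(a)
```

The resulting `rows` can be passed to `pd.DataFrame(rows, columns=['Movie', 'Year', 'Keyword'])` exactly as before. Here `[^"]+` allows titles containing spaces, and the year is converted to `int` so the `Year` column is numeric rather than a string.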