Pandas: how to select a row based on a string in the previous row - should be a simple solution
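I have a csv file. How do I print the row that follows a row containing a specific string? I need to print every row that contains "ixation", and then the row after each of them.
Here is my current code: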
import pandas as pd
df = pd.read_csv('locationof.csv')
# keep only the columns of interest
df = df[['Trial', 'Code', 'Time', 'Duration']]
# substrings to look for in the Code column
list1 = ['100_1to3_start', 'fixation', 'Fixation', '66_1to3_start']
contain_values = df[df['Code'].str.contains('|'.join(list1), na=False)]
Here is my current output...
2 1.0 fixation_dummy 50637.0 25086.0
4 2.0 fixation_dummy 75889.0 25086.0
7 3.0 fixation_dummy 101141.0 25086.0
9 4.0 fixation_dummy 126393.0 25086.0
13 6.0 100_1to3_start_2034_1_0_1060 151811.0 20268.0
23 9.0 100_1to3_start_2456_4_0_2054 216104.0 24587.0
33 12.0 100_1to3_start_1507_7_0_2446 283885.0 15118.0
43 15.0 Fixation 332229.0 130081.0
55 17.0 66_1to3_start_2369_2_0_2352 484904.0 23590.0
76 23.0 66_1to3_start_1539_8_0_2518 615150.0 15285.0
82 25.0 Fixation 654357.0 130081.0
123 35.0 Fixation 996089.0 130081.0
164 45.0 Fixation 1343635.0 130081.0
174 46.0 66_1to3_start_1884_1_0_2537 1473882.0 18773.0
197 53.0 66_1to3_start_1541_8_0_2545 1621074.0 15284.0
204 55.0 Fixation 1662939.0 130080.0
213 56.0 100_1to3_start_2115_1_0_2528 1793186.0 21098.0
223 59.0 100_1to3_start_1892_4_0_2544 1859638.0 18939.0
233 62.0 100_1to3_start_2315_7_0_2537 1918282.0 23259.0
But I want...
2 1.0 fixation_dummy 50637.0 25086.0
4 2.0 fixation_dummy 75889.0 25086.0
7 3.0 fixation_dummy 101141.0 25086.0
9 4.0 fixation_dummy 126393.0 25086.0
13 6.0 100_1to3_start_2034_1_0_1060 151811.0 20268.0
43 15.0 Fixation 332229.0 130081.0
55 17.0 66_1to3_start_2369_2_0_2352 484904.0 23590.0
82 25.0 Fixation 654357.0 130081.0
123 35.0 Fixation 996089.0 130081.0
164 45.0 Fixation 1343635.0 130081.0
174 46.0 66_1to3_start_1884_1_0_2537 1473882.0 18773.0
204 55.0 Fixation 1662939.0 130080.0
213 56.0 100_1to3_start_2115_1_0_2528 1793186.0 21098.0
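How can I print only the rows (66_1to3..., 100_1to3...) that come directly after a row containing "ixation"? This code will run over a series of csv files, and the exact rows I need vary from one csv file to the next.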
Try boolean indexing with shift, since we only care about the row that comes right after "ixation":
list1 = ['100_1to3_start', '66_1to3_start']
df[df[2].str.contains('|'.join(list1), na=False) & df[2].shift().str.contains('ixation', na=False)]
0 1 2 3 4
4 13 6.0 100_1to3_start_2034_1_0_1060 151811.0 20268.0
8 55 17.0 66_1to3_start_2369_2_0_2352 484904.0 23590.0
13 174 46.0 66_1to3_start_1884_1_0_2537 1473882.0 18773.0
16 213 56.0 100_1to3_start_2115_1_0_2528 1793186.0 21098.0
Note that, based on your example, df[2] would be df['Code'].
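For anyone who wants to try the shift approach without the original csv, here is a minimal, self-contained sketch. It assumes a named Code column rather than the positional df[2], and the small frame is just a handful of rows copied from the question's output. The variable names (targets, is_target, prev_is_fixation, after_fixation) are made up for illustration.

```python
import pandas as pd

# A few rows taken from the question's filtered output (assumed layout).
df = pd.DataFrame({
    "Trial": [4.0, 6.0, 9.0, 15.0, 17.0, 23.0],
    "Code": [
        "fixation_dummy",
        "100_1to3_start_2034_1_0_1060",
        "100_1to3_start_2456_4_0_2054",
        "Fixation",
        "66_1to3_start_2369_2_0_2352",
        "66_1to3_start_1539_8_0_2518",
    ],
})

targets = ["100_1to3_start", "66_1to3_start"]

# True where the row itself is one of the "start" codes.
is_target = df["Code"].str.contains("|".join(targets), na=False)

# shift() moves every value down one row, so this is True where the
# *previous* row contains "ixation". The first row has no predecessor,
# hence na=False.
prev_is_fixation = df["Code"].shift().str.contains("ixation", na=False)

after_fixation = df[is_target & prev_is_fixation]
print(after_fixation)
```

Only the "start" rows at positions 1 and 4 survive, because those are the ones whose preceding row is a fixation row.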
To answer the description as written ("I need to print every row that contains "ixation", and then the row after each of them"), the solution is:
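# identify rows with "ixation" (na=False treats missing codes as non-matches)
mask = df['Code'].str.contains('ixation', na=False)
# select them and one row below (fill_value=False keeps the shifted mask boolean)
out = df[mask | mask.shift(fill_value=False)]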
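As a rough sketch of how this could be reused across several csv files, the same mask logic can be wrapped in a small helper. The function name rows_with_following and its parameters are hypothetical, and the sample rows are again copied from the question.

```python
import pandas as pd


def rows_with_following(df, column, pattern):
    """Return rows whose `column` contains `pattern`, plus the row right after each.

    Hypothetical helper for illustration, not part of pandas.
    """
    mask = df[column].str.contains(pattern, na=False)
    # Shift the mask down one row; the first row has no predecessor,
    # so it is filled with False instead of NaN.
    return df[mask | mask.shift(fill_value=False)]


sample = pd.DataFrame({
    "Code": [
        "fixation_dummy",
        "100_1to3_start_2034_1_0_1060",
        "100_1to3_start_2456_4_0_2054",
        "Fixation",
        "66_1to3_start_2369_2_0_2352",
        "66_1to3_start_1539_8_0_2518",
    ],
})

out = rows_with_following(sample, "Code", "ixation")
print(out)
```

In a loop over files, each frame returned by pd.read_csv could be passed to the helper in the same way. Note that this keeps whatever row follows a fixation row, regardless of its code.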