Python grep and cut
In Linux, it is easy to extract a particular string with tools such as grep, cut, and awk.
wolf@linux:~$ cat shver
cisco C9300-48P (X86) processor with 818597K/6147K bytes of memory.
Processor board ID FCW2049G03S
2048K bytes of non-volatile configuration memory.
8388608K bytes of physical memory.
1638400K bytes of Crash Files at crashinfo:.
11264000K bytes of Flash at flash:.
0K bytes of WebUI ODM Files at webui:.
Model Number : C9300-48P
Base Ethernet MAC Address : 04:6c:9d:01:3b:80
Motherboard Assembly Number : 73-17956-04
Motherboard Serial Number : FOC20465ABU
Model Revision Number : P4B
Motherboard Revision Number : 04
Model Number : C9300-48P
System Serial Number : FCW2049G03S
wolf@linux:~$
grep and cut
wolf@linux:~$ grep 'Model Number' shver | cut -d : -f 2
C9300-48P
C9300-48P
wolf@linux:~$
Remove the extra space (if there is a better solution, please let me know)
wolf@linux:~$ grep 'Model Number' shver | cut -d : -f 2 | cut -d ' ' -f 2
C9300-48P
C9300-48P
wolf@linux:~$
Select the first output
wolf@linux:~$ grep 'Model Number' shver | cut -d : -f 2 | cut -d ' ' -f 2 | head -1
C9300-48P
wolf@linux:~$
That was in Linux. I would like to write similar code in Python.
My attempts have not succeeded so far.
Define the shver string
>>> shver = '''cisco C9300-48P (X86) processor with 818597K/6147K bytes of memory.
... Processor board ID FCW2049G03S
... 2048K bytes of non-volatile configuration memory.
... 8388608K bytes of physical memory.
... 1638400K bytes of Crash Files at crashinfo:.
... 11264000K bytes of Flash at flash:.
... 0K bytes of WebUI ODM Files at webui:.
... Model Number : C9300-48P
...
... Base Ethernet MAC Address : 04:6c:9d:01:3b:80
... Motherboard Assembly Number : 73-17956-04
... Motherboard Serial Number : FOC20465ABU
... Model Revision Number : P4B
... Motherboard Revision Number : 04
... Model Number : C9300-48P
... System Serial Number : FCW2049G03S
... '''
>>>
Verify it
>>> shver
'cisco C9300-48P (X86) processor with 818597K/6147K bytes of memory.\nProcessor board ID FCW2049G03S\n2048K bytes of non-volatile configuration memory.\n8388608K bytes of physical memory.\n1638400K bytes of Crash Files at crashinfo:.\n11264000K bytes of Flash at flash:.\n0K bytes of WebUI ODM Files at webui:.\nModel Number : C9300-48P\n\nBase Ethernet MAC Address : 04:6c:9d:01:3b:80\nMotherboard Assembly Number : 73-17956-04\nMotherboard Serial Number : FOC20465ABU\nModel Revision Number : P4B\nMotherboard Revision Number : 04\nModel Number : C9300-48P\nSystem Serial Number : FCW2049G03S\n'
>>>
Create a list
>>> shver_list = shver.splitlines()
>>> shver_list
['cisco C9300-48P (X86) processor with 818597K/6147K bytes of memory.', 'Processor board ID FCW2049G03S', '2048K bytes of non-volatile configuration memory.', '8388608K bytes of physical memory.', '1638400K bytes of Crash Files at crashinfo:.', '11264000K bytes of Flash at flash:.', '0K bytes of WebUI ODM Files at webui:.', 'Model Number : C9300-48P', '', 'Base Ethernet MAC Address : 04:6c:9d:01:3b:80', 'Motherboard Assembly Number : 73-17956-04', 'Motherboard Serial Number : FOC20465ABU', 'Model Revision Number : P4B', 'Motherboard Revision Number : 04', 'Model Number : C9300-48P', 'System Serial Number : FCW2049G03S']
>>>
The next step is to check whether the string 'Model Number' is present and print that line
>>> if 'Model Number' in shver_list:
...     'yes'
... else:
...     'no'
...
'no'
>>>
How can I print the lines that contain 'Model Number'?
>>> for i in shver_list:
...     if 'Model Number' in shver_list:
...         i
...
>>>
Desired output
C9300-48P
You can parse the data into a dictionary, which makes the relevant information easy to access:
shver = '''cisco C9300-48P (X86) processor with 818597K/6147K bytes of memory.
Processor board ID FCW2049G03S
2048K bytes of non-volatile configuration memory.
8388608K bytes of physical memory.
1638400K bytes of Crash Files at crashinfo:.
11264000K bytes of Flash at flash:.
0K bytes of WebUI ODM Files at webui:.
Model Number : C9300-48P
Base Ethernet MAC Address : 04:6c:9d:01:3b:80
Motherboard Assembly Number : 73-17956-04
Motherboard Serial Number : FOC20465ABU
Model Revision Number : P4B
Motherboard Revision Number : 04
Model Number : C9300-48P
System Serial Number : FCW2049G03S
'''
attributes = {}
for line in shver.splitlines():
    if ':' in line:  # only take lines that contain a colon
        item, value = line.strip().split(':', 1)  # split at the first colon only
        attributes[item.strip()] = value.strip()  # strip surrounding whitespace
print(attributes['Model Number'])
print(attributes['System Serial Number'])
Output:
C9300-48P
FCW2049G03S
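One caveat with the dictionary approach: "Model Number" appears twice in the sample, and a plain dict keeps only the last value stored under a repeated key. If you want every occurrence, a minimal sketch along these lines may help. The name parse_attributes and the shortened sample text are only illustrative, not part of the original answer.

```python
from collections import defaultdict

# Shortened sample of the text from the question (illustrative only).
sample = """Model Number : C9300-48P
Base Ethernet MAC Address : 04:6c:9d:01:3b:80
Model Number : C9300-48P
System Serial Number : FCW2049G03S
"""

def parse_attributes(text):
    """Map each key to the list of every value seen for that key."""
    attributes = defaultdict(list)
    for line in text.splitlines():
        if ':' in line:
            key, value = line.split(':', 1)  # split at the first colon only
            attributes[key.strip()].append(value.strip())
    return dict(attributes)

attrs = parse_attributes(sample)
print(attrs['Model Number'])               # ['C9300-48P', 'C9300-48P']
print(attrs['Base Ethernet MAC Address'])  # ['04:6c:9d:01:3b:80']
```

Splitting with a maximum of one split keeps values such as the MAC address intact even though they contain colons themselves.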
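You have to filter the entries in the list. If you test the list directly with in, you are looking for a line that is literally "Model Number".
This prints all lines that contain the substring "Model Number":
modelnumbers = [line for line in shver_list if 'Model Number' in line]
print(modelnumbers)
To get the desired output, take the first result and strip away everything you do not need.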
print(modelnumbers[0].split(":")[1].strip())
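Note that modelnumbers[0] raises IndexError when no line matches. A possible variant, sketched here with a hypothetical helper named first_value, uses a generator with next and a default so that it stops at the first hit (much like head -1) and returns None when nothing is found:

```python
def first_value(lines, key, default=None):
    """Return the text after the first colon on the first line containing key."""
    # Generator expression: lines are examined lazily, one at a time.
    matches = (line for line in lines if key in line and ':' in line)
    line = next(matches, None)  # None when no line matches
    if line is None:
        return default
    return line.split(':', 1)[1].strip()

lines = [
    'Processor board ID FCW2049G03S',
    'Model Number : C9300-48P',
    'Model Revision Number : P4B',
    'Model Number : C9300-48P',
]
print(first_value(lines, 'Model Number'))  # C9300-48P
print(first_value(lines, 'Uptime'))        # None
```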
Starting from your shver_list step, you can do this:
for item in shver_list:
    if "Model Number" in item:
        break
As soon as a match is found, the loop breaks, so let's look at what item holds:
>>> item
'Model Number : C9300-48P'
Now we can split it on " : " (note the surrounding spaces) and get:
>>> item.split(" : ")
['Model Number', 'C9300-48P']
So the desired element is at index 1 of this list.
Putting it all together:
for item in shver_list:
    if "Model Number" in item:
        break
desired = item.split(" : ")[1]
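If no line contains "Model Number", the loop above finishes without breaking and item is simply the last line of the list. A for/else clause is one way to catch that case. The sketch below wraps the loop in a hypothetical function named find_model and splits on the first colon, so it does not depend on the exact spacing around it:

```python
def find_model(lines):
    """Return the model number from the first matching line, or None."""
    for item in lines:
        if "Model Number" in item:
            break
    else:
        # The else branch runs only when the loop ended without a break,
        # i.e. no line matched (or the list was empty).
        return None
    return item.split(":", 1)[1].strip()

print(find_model(['Processor board ID FCW2049G03S', 'Model Number : C9300-48P']))  # C9300-48P
print(find_model(['Processor board ID FCW2049G03S']))                              # None
```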
Another way is to use a regular expression. This time we work from the shver string:
import re
matches_gen = re.finditer(r"Model Number\s+:\s*(.+)", shver)
desired = next(matches_gen).group(1)
We use finditer for lazy evaluation: since we only want the first occurrence, we call next on it and take the captured group to get the desired result, C9300-48P.
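When only the first occurrence is needed, re.search is a possible alternative to finditer plus next, because it returns None instead of raising StopIteration when there is no match. The following is a sketch under the assumption that the key starts at the beginning of a line. The names MODEL_RE and model_number are illustrative.

```python
import re

# ^ with re.MULTILINE anchors the pattern to the start of each line.
MODEL_RE = re.compile(r"^Model Number\s*:\s*(\S+)", re.MULTILINE)

def model_number(text):
    """Return the first model number found in text, or None."""
    match = MODEL_RE.search(text)
    return match.group(1) if match else None

text = (
    "Processor board ID FCW2049G03S\n"
    "Model Revision Number : P4B\n"
    "Model Number : C9300-48P\n"
)
print(model_number(text))            # C9300-48P
print(model_number("no such line"))  # None
```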