Data manipulation\deduplication in PowerShell
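Hey, I want to deduplicate some data and merge columns in a CSV, but I can't figure out how to do it. Here is a sample of the data I'm working with: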
cmmc,stig,descr
AC.1.001,SV-205663r569188_rule,The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
AC.1.001,SV-205667r569188_rule,Inappropriate granting of user rights can provide system administrative and other high-level capabilities.
AC.1.002,SV-205663r569188_rule,The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
AC.1.002,SV-205665r569188_rule,Enterprise Domain Controllers groups on domain controllers.
I'm very close to the data I'm looking for, but I'm struggling to append |<value of 'descr'> after each item in the second column.
Here is my script:
Import-CSV '.\input.csv' | Group-Object 'cmmc' |
ForEach-Object {
[PsCustomObject]@{
cmmc = $_.name
stig = $_.group.stig -Join '
'
}
} | Export-Csv '.\output.csv' -NoTypeInformation
The output looks like this (formatted for readability, column names omitted):
AC1.001 SV-205663r569188_rule
SV-205665r569188_rule
AC1.002 SV-205663r569188_rule
SV-205665r569188_rule
But I'm looking for this:
AC.1.001 SV-205663r569188_rule|The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
SV-205667r569188_rule|Inappropriate granting of user rights can provide system administrative and other high-level capabilities.
AC.1.002 SV-205663r569188_rule|The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
SV-205665r569188_rule|Enterprise Domain Controllers groups on domain controllers.
Use the following, which uses calculated properties in combination with the Select-Object cmdlet, applied to the results of your Group-Object call:
Import-Csv .\input.csv |
  Group-Object cmmc |
  Select-Object @{ Name = 'cmmc'; e = 'Name' },
                @{ Name = 'stig_descr'; e = {
                    # [array] keeps indexing correct when a group holds only one record.
                    [array] $stigs, [array] $descrs, $i = $_.Group.stig, $_.Group.descr, 0
                    # Pair each stig with its descr as 'stig|descr', one pair per line.
                    $stigs.ForEach( { $stigs[$i], $descrs[$i++] -join '|' }) -join "`n"
                  }
                } | Export-Csv -NoTypeInformation -Encoding utf8 .\output.csv
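Notes:
• The [array] type constraints on $stigs and $descrs are needed to handle groups that contain only one record. In that case, due to the behavior of member-access enumeration, $_.Group.stig and $_.Group.descr return just a single string rather than a single-element array of strings; without the [array] constraint, indexing (e.g. [$i]) would be performed on the [string] instance, which returns the single character at that position in the string.
• In the Export-Csv call, adjust -Encoding as needed. BOM-less UTF-8 is now the default in PowerShell (Core) 7+, where -NoTypeInformation is also no longer needed.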
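The single-record pitfall described in the first note can be reproduced in isolation. The following is a minimal sketch, assuming hypothetical sample objects ($oneRecord, $twoRecords, with placeholder values) that stand in for the .Group collection of a Group-Object result:

```powershell
# Hypothetical sample objects standing in for the .Group of a Group-Object result.
$oneRecord  = @([pscustomobject]@{ stig = 'SV-1'; descr = 'first' })
$twoRecords = @(
    [pscustomobject]@{ stig = 'SV-1'; descr = 'first' }
    [pscustomobject]@{ stig = 'SV-2'; descr = 'second' }
)

# Member-access enumeration: a single-element collection yields a scalar,
# a multi-element collection yields an array.
$oneRecord.stig.GetType().Name     # String
$twoRecords.stig.GetType().Name    # Object[]

# Without the constraint, [0] indexes into the string itself.
$plain = $oneRecord.stig
$plain[0]                          # the character 'S'

# With the [array] constraint, [0] returns the whole string.
[array] $constrained = $oneRecord.stig
$constrained[0]                    # SV-1
```

This is why the constraint matters only for groups of size one: with two or more records, member-access enumeration already returns an array.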
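The resulting file has the following content, which shows the use of column-internal newlines (protected by enclosing the entire value in "..."):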
"cmmc","stig_descr"
"AC.1.001","SV-205663r569188_rule|The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
SV-205667r569188_rule|Inappropriate granting of user rights can provide system administrative and other high-level capabilities."
"AC.1.002","SV-205663r569188_rule|The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
SV-205665r569188_rule|Enterprise Domain Controllers groups on domain controllers."
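If you want to convince yourself that a quoted, multi-line field survives a round trip, here is a small sketch. The record and its values are hypothetical placeholders, and a temporary file is used instead of output.csv so nothing in the working directory is touched:

```powershell
# Hypothetical sample record; the values are placeholders, not real STIG data.
$row = [pscustomobject]@{
    cmmc       = 'AC.1.001'
    stig_descr = "SV-1|first`nSV-2|second"   # one field, two lines
}

# Write to a temporary file so the demo leaves the working directory untouched.
$tmp = [System.IO.Path]::GetTempFileName()
$row | Export-Csv -NoTypeInformation -Path $tmp

# The file has three physical lines (header plus a two-line quoted field) ...
$physicalLines = @(Get-Content -Path $tmp)

# ... but re-importing yields a single record whose field still spans two lines.
$back       = @(Import-Csv -Path $tmp)
$fieldLines = @($back[0].stig_descr -split '\r?\n')

"{0} physical lines, {1} record(s), {2} lines in the field" -f `
    $physicalLines.Count, $back.Count, $fieldLines.Count

Remove-Item -Path $tmp
```

Tools that read the file line by line (Get-Content, for instance) see the physical lines, whereas a CSV parser such as Import-Csv sees the logical records.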
To verify that this yields the desired data, you can re-import the resulting file and pipe it to Format-Table with the -Wrap switch:
PS> Import-Csv .\output.csv | Format-Table -Wrap
cmmc stig_descr
---- ---------
AC.1.001 SV-205663r569188_rule|The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
SV-205667r569188_rule|Inappropriate granting of user rights can provide system administrative and other high-level capabilities.
AC.1.002 SV-205663r569188_rule|The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
SV-205665r569188_rule|Enterprise Domain Controllers groups on domain controllers.
Note that -Wrap respects property-internal newlines, but it also breaks individual lines into multiple ones if they are too wide for the console window.
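As a possible alternative, you could build the combined column by iterating over each group's records directly, which avoids parallel indexing and therefore the single-record concern altogether. This is only a sketch under stated assumptions: the inline sample data and the $rows / $result variable names are hypothetical, and the real input would come from Import-Csv .\input.csv instead:

```powershell
# Hypothetical inline sample mirroring the structure of input.csv.
$rows = @'
cmmc,stig,descr
AC.1.001,SV-1,first description
AC.1.001,SV-2,second description
AC.1.002,SV-3,third description
'@ | ConvertFrom-Csv

$result = $rows |
    Group-Object cmmc |
    ForEach-Object {
        [pscustomobject]@{
            cmmc       = $_.Name
            # Build one 'stig|descr' line per record, then join the lines.
            stig_descr = ($_.Group | ForEach-Object { $_.stig, $_.descr -join '|' }) -join "`n"
        }
    }

$result | Format-Table -Wrap
```

Because each record is processed on its own, stig and descr are always read from the same object, and no [array] constraint is needed. Piping $result to Export-Csv would produce a file of the same shape as the one shown earlier.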