xmlstarlet: delete all elements except one from an XML data feed

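On my Debian VPS, I want to keep only the elements whose CategoryName is Mobile Phones and delete all other elements that have a different category name, such as Mobile Accessories, Laptops, etc. There are 20 different category names in total. The XML file is large, about 800 MB.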

xmlstarlet el -u sd.xml
Products
Products/Product
Products/Product/Brand
Products/Product/CategoryName
Products/Product/CategoryPathAsString

Sample XML:

<Products>
<Product>
   <ProductID>92545172</ProductID>
   <ProductSKU>630348288360</ProductSKU>
   <ProductName>Self Snap Aux Connected Selfie Stick</ProductName>
   <ProductDescription>This product is charge free </ProductDescription>
   <ProductPrice>353.00</ProductPrice>
   <ProductPriceCurrency>INR</ProductPriceCurrency>
   <WasPrice>649.00</WasPrice>
   <DiscountedPrice>0.00</DiscountedPrice>
   <ProductURL>http://clk</ProductURL>
   <PID>8053</PID>
   <MID>159526</MID>
   <ProductImageLargeURL>http://</ProductImageLargeURL>
   <StockAvailability>in stock</StockAvailability>
   <Brand>Self Snap</Brand>
   <CategoryName>Camera Accessories</CategoryName>
   <CategoryPathAsString>Root|Cameras &amp; Accessories|Camera Accessories|</CategoryPathAsString>
</Product>
<Product>
   <ProductID>29911116</ProductID>
   <ProductSKU>647266238</ProductSKU>
   <ProductName>Philips 40PFL5059/V7 40 inches Full HD LED Television</ProductName>
   <ProductDescription>LED Display Resolution : 1920 x 1080</ProductDescription>
   <ProductPrice>30196.00</ProductPrice>
   <ProductPriceCurrency>INR</ProductPriceCurrency>
   <WasPrice>39800.00</WasPrice>
   <DiscountedPrice>0.00</DiscountedPrice>
   <ProductURL>http://clk</ProductURL>
   <PID>8053</PID>
   <MID>159526</MID>
   <ProductImageLargeURL>http://n1</ProductImageLargeURL>
   <StockAvailability>in stock</StockAvailability>
   <Brand>Philips</Brand>
   <CategoryName>Televisions</CategoryName>
   <CategoryPathAsString>Root|TVs, Audio &amp; Video|Televisions|</CategoryPathAsString>
</Product>
<Product>
   <ProductID>93959216</ProductID>
   <ProductSKU>683203029</ProductSKU>
   <ProductName>Micromax Canvas Beat A114R</ProductName>
   <ProductDescription>Type : MultiSim Sim : Dual SIM Os Version : Android </ProductDescription>
   <ProductPrice>7999.00</ProductPrice>
   <ProductPriceCurrency>INR</ProductPriceCurrency>
   <WasPrice>9990.00</WasPrice>
   <DiscountedPrice>0.00</DiscountedPrice>
   <ProductURL>http://clk</ProductURL>
   <PID>8053</PID>
   <MID>159526</MID>
   <ProductImageLargeURL>http://n1</ProductImageLargeURL>
   <StockAvailability>in stock</StockAvailability>
   <Brand>Micromax</Brand>
   <CategoryName>Mobile Phones</CategoryName>
   <CategoryPathAsString>Root|Mobiles &amp; Tablets|Mobile Phones|</CategoryPathAsString>
</Product>
</Products>

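Without sample XML and the expected result XML, the question is unclear. Assuming you want to delete the elements named CategoryName whose inner text is not equal to "Mobile Phones", you can try this XPath: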

/Products/Product/CategoryName[. != 'Mobile Phones']

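It turns out you want to delete the <Product> elements whose child element <CategoryName> has a value other than "Mobile Phones". In that case, you can try the following XPath instead, for example as the argument to xmlstarlet ed -d: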

/Products/Product[CategoryName != 'Mobile Phones']
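Because the feed is about 800 MB, loading the whole document into memory may be a problem on a small VPS. As an alternative to xmlstarlet, here is a minimal streaming sketch in Python that applies the same rule as the XPath above: keep a <Product> only when its <CategoryName> is "Mobile Phones". The function name keep_category and the trimmed-down SAMPLE data are assumptions made for illustration; they are not part of the original question or answer.

```python
import io
import xml.etree.ElementTree as ET


def keep_category(source, dest, wanted="Mobile Phones"):
    """Copy only the <Product> elements whose <CategoryName> equals `wanted`.

    `source` and `dest` are binary file objects. The input is parsed
    incrementally, so only one product is held in memory at a time.
    Returns the number of products written.
    """
    kept = 0
    dest.write(b"<Products>\n")
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of <Products>
    for event, elem in context:
        if event == "end" and elem.tag == "Product":
            if elem.findtext("CategoryName") == wanted:
                elem.tail = "\n"  # one product per line in the output
                dest.write(ET.tostring(elem, encoding="utf-8"))
                kept += 1
            root.clear()  # discard the processed product to free memory
    dest.write(b"</Products>\n")
    return kept


# Trimmed-down sample (hypothetical subset of the fields in the question).
SAMPLE = b"""<Products>
<Product><ProductName>Self Snap Aux Connected Selfie Stick</ProductName><CategoryName>Camera Accessories</CategoryName></Product>
<Product><ProductName>Micromax Canvas Beat A114R</ProductName><CategoryName>Mobile Phones</CategoryName></Product>
</Products>"""

if __name__ == "__main__":
    out = io.BytesIO()
    count = keep_category(io.BytesIO(SAMPLE), out)
    print(count)
    print(out.getvalue().decode("utf-8"))
```

For the real feed, the same function could be called with files opened in binary mode, for example open("sd.xml", "rb") as the source and an output file of your choice opened with "wb" as the destination.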