Parsing a text file with literals in C#

I'm having trouble parsing a text file that contains literals.

The literals I'm having trouble with are:

"\(", which is an open bracket "(", and "\)", which is the close bracket ")".

Here is a sample of the text file I'm parsing:

BT /F1 9 Tf 53.8646616541353 441 Td ( Voucher  AADA    Trans.      Prods               CDE                TRX                                   Payment) Tj ET
BT /F1 9 Tf 53.8646616541353 432 Td ( Number    Num    Date    WH   ID   Name     Name  #  Year  Inv. #    CD  Due Date    Qty   Price  Disct      %     Amount Due) Tj ET
BT /F1 9 Tf 53.8646616541353 423 Td (--------- ---- ---------- -- ------ -------- ----- -- ---- ---------- -- ---------- ------ ------- ------- ------- ------------) Tj ET
BT /F1 9 Tf 53.8646616541353 414 Td ( 21812539      09/30/2015 NA  29264 Symante  SUMME 52 2015 1735247    RM 09/30/2015      2  .00 50.0000  100.0%       15.00 ) Tj ET
BT /F1 9 Tf 53.8646616541353 405 Td ( 21827266      10/01/2015 NA  29264 Symante  SUMME 52 2015 1735966    RE 10/01/2015      1  .00 50.0000  100.0%       \(7.50\)) Tj ET
BT /F1 9 Tf 53.8646616541353 396 Td ( 21832628      10/02/2015 NA  29264 Symante  SUMME 52 2015 1736174    RM 10/02/2015      1  .00 50.0000  100.0%        7.50 ) Tj ET
BT /F1 9 Tf 53.8646616541353 387 Td ( 21838251      10/02/2015 NA  29264 Symante  SUMME 52 2015 1736429    RE 10/02/2015      1  .00 50.0000  100.0%       \(7.50\)) Tj ET
BT /F1 9 Tf 53.8646616541353 378 Td ( 21841821      10/03/2015 NA  29264 Symante  SUMME 52 2015 1736583    RM 10/03/2015      1  .00 50.0000  100.0%        7.50 ) Tj ET
BT /F1 9 Tf 53.8646616541353 369 Td ( 21874851      10/08/2015 NA  29264 Symante  SUMME 52 2015 1738192    RE 10/08/2015      1  .00 50.0000  100.0%       \(7.50\)) Tj ET
BT /F1 9 Tf 53.8646616541353 360 Td ( 21879328      10/09/2015 NA  29264 Symante  SUMME 52 2015 1738389    RM 10/09/2015      1  .00 50.0000  100.0%        7.50 ) Tj ET
BT /F1 9 Tf 53.8646616541353 351 Td ( 21933007      10/16/2015 NA  29264 Symante  SUMME 52 2015 0000531968 SK 10/16/2015      1  .00 50.0000  100.0%       \(7.50\)) Tj ET
BT /F1 9 Tf 53.8646616541353 342 Td (                                                                                                                  -------------) Tj ET
BT /F1 9 Tf 53.8646616541353 333 Td (                                                                                           Sub Total:               \(,650.00\)) Tj ET
BT /F1 9 Tf 53.8646616541353 324 Td (                                                                                                                  -------------) Tj ET
BT /F1 9 Tf 53.8646616541353 315 Td ( 21827466      10/02/2015 NA  57629                        0000531284 PO 10/02/2015      0                  100.0%    \(1500.00\)) Tj ET
BT /F1 9 Tf 53.8646616541353 306 Td (                                                                                                                  -------------) Tj ET
BT /F1 9 Tf 53.8646616541353 297 Td (                                                                                           Sub Total:               \(,500.00\)) Tj ET
BT /F1 9 Tf 53.8646616541353 288 Td (                                                                                                                  -------------) Tj ET
BT /F1 9 Tf 53.8646616541353 279 Td ( 21663952      09/02/2015 SN  57629 Zeal \(I\) 61-SE 61 2015 0000529704 IN 11/01/2015   2443  .95 50.0000  100.0%    11111.43 ) Tj ET
BT /F1 9 Tf 53.8646616541353 270 Td ( 21663953      09/02/2015 SN  57629 Zeal \(I\) 61-SE 61 2015 0000529704 SP 11/01/2015   2443  .95 50.0000  100.0%     \(200.33\)) Tj ET
BT /F1 9 Tf 53.8646616541353 261 Td ( 21699656      09/09/2015 S2  57629 Zeal \(I\) 61-SE 61 2015 0000530025 IN 11/08/2015    449  .95 50.0000  100.0%     1156.28 ) Tj ET
BT /F1 9 Tf 53.8646616541353 252 Td ( 21699657      09/09/2015 S2  57629 Zeal \(I\) 61-SE 61 2015 0000530025 SP 11/08/2015    449  .95 50.0000  100.0%      \(36.82\)) Tj ET
BT /F1 9 Tf 53.8646616541353 243 Td ( 21699658      09/09/2015 SL  57629 Zeal \(I\) 61-SE 61 2015 0000530025 IN 11/08/2015   1320  .95 50.0000  100.0%     1111.00 ) Tj ET
BT /F1 9 Tf 53.8646616541353 234 Td ( 21699659      09/09/2015 SL  57629 Zeal \(I\) 61-SE 61 2015 0000530025 SP 11/08/2015   1320  .95 50.0000  100.0%     \(108.24\)) Tj ET
BT /F1 9 Tf 53.8646616541353 225 Td ( 21736996      09/16/2015 S1  57629 Zeal \(I\) 61-SE 61 2015 0000530390 IN 11/15/2015   1016  .95 50.0000  100.0%     1111.60 ) Tj ET
BT /F1 9 Tf 53.8646616541353 216 Td ( 21736997      09/16/2015 S1  57629 Zeal \(I\) 61-SE 61 2015 0000530390 SP 11/15/2015   1016  .95 50.0000  100.0%      \(83.31\)) Tj ET
BT /F1 9 Tf 53.8646616541353 207 Td ( 21808378      09/29/2015 NA  57629 Zeal \(I\) 61-SE 61 2015 1735086    RE 09/29/2015      8  .95 50.0000  100.0%      \(59.80\)) Tj ET
BT /F1 9 Tf 53.8646616541353 198 Td ( 21838252      10/02/2015 NA  57629 Zeal \(I\) 61-SE 61 2015 1736429    RE 10/02/2015      1  .95 50.0000  100.0%       \(7.48\)) Tj ET
BT /F1 9 Tf 53.8646616541353 189 Td ( 21874852      10/08/2015 NA  57629 Zeal \(I\) 61-SE 61 2015 1738192    RE 10/08/2015      4  .95 50.0000  100.0%      \(29.90\)) Tj ET
BT /F1 9 Tf 53.8646616541353 180 Td (  

If you look at line 20 of the sample, the product name is Zeal (I). Negative numbers (the last column, Amount Due) are also enclosed in brackets.

I'm parsing the text file line by line, but when I try

line.Replace(@"\(", "");

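it doesn't seem to work. I have never come across these literals in a file before, so I'm not sure how to handle them. Apart from this, I'm almost done with the parsing.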

The way I'm doing it is very simple:

                string line;
                int count = 0; // to be removed. Used in testing to cap count.
                while ((line = reader.ReadLine()) != null)
                {
                    if (count <= 10)
                    {
                        if (line.Length > 170 && line.Length < 200)
                        {
                            if (!ContainsAny(line))
                            {

                                line.Replace(@"\(", "");

                                indexStart = line.IndexOf("Td (") + 4;

                                col0 = line.Substring(indexStart, 9);
                                col1 = line.Substring(indexStart + 10, 4);
                                col2 = line.Substring(indexStart + 15, 10);
                                col3 = line.Substring(indexStart + 26, 2);
                                col4 = line.Substring(indexStart + 29, 6);
                                col5 = line.Substring(indexStart + 36, 8);
                                col6 = line.Substring(indexStart + 45, 5);
                                col7 = line.Substring(indexStart + 51, 2);
                                col8 = line.Substring(indexStart + 54, 4);
                                col9 = line.Substring(indexStart + 59, 10);
                                col10 = line.Substring(indexStart + 70, 2);
                                col11 = line.Substring(indexStart + 73, 10);
                                col12 = line.Substring(indexStart + 84, 6);
                                col13 = line.Substring(indexStart + 91, 7).Replace("$", "");
                                col14 = line.Substring(indexStart + 99, 7);
                                col15 = line.Substring(indexStart + 107, 7).Replace("%", "");
                                col16 = line.Substring(indexStart + 115, 12);

                                MessageBox.Show(string.Format("{0}; {1}; {2}; {3}; {4}; {5}; {6}; {7}; {8}; {9}; {10}; {11}; {12}; {13}; {14}; {15}; {16};", col0, col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13, col14, col15, col16));


                                //writer.WriteLine(lineOut);


                                count += 1; // to be removed. Used in testing to cap count.
                            }
                        }
                    }
                }

The result I get when I write to the file is:

21841821             10/03/2015  NA 29264   Symante  SUMME  52  2015     1736583     RM  10/03/2015 1   15  50  100 7.5
21874851             10/08/2015  NA 29264   Symante  SUMME  52  2015    1738192  RE  10/08/2015 1   15  50  100 -7.5
21879328             10/09/2015  NA 29264   Symante  SUMME  52  2015    1738389  RM  10/09/2015 1   15  50  100 7.5
21933007             10/16/2015  NA 29264   Symante  SUMME  52  2015    531968   SK  10/16/2015 1   15  50  100 -7.5
21827466             10/02/2015  NA 57629                                   531284   PO  10/02/2015 0                           100 -4500
21663952             09/02/2015  SN 57629    Zeal \(I    ) 61-   E   1 20    5 00005297 4    N 11/01/20  5   24  3  14.  5 50.00     0  100.    18261.40%
21663953             09/02/2015  SN 57629    Zeal \(I    ) 61-   E   1 20    5 00005297 4    P 11/01/20  5   24  3  14.  5 50.00     0  100.    -200.00%
21699656             09/09/2015  S2 57629    Zeal \(I    ) 61-   E   1 20    5 00005300 5    N 11/08/20  5    4  9  14.  5 50.00     0  100.    3356.20%

line.Replace(@"\(", ""); does not modify the string. Strings in .NET are immutable, so Replace only returns a new, changed string. You should write:

line = line.Replace(@"\(", "");

Check the documentation for String.Replace:

Returns a new string in which all occurrences of a specified string in the current instance are replaced with another specified string.
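
To make the difference concrete, here is a minimal sketch. The sample string and the class name ReplaceDemo are made up for illustration; only String.Replace itself comes from the discussion above.

```csharp
using System;

public static class ReplaceDemo
{
    public static void Main()
    {
        // Hypothetical fragment of one input line.
        string line = @"Zeal \(I\) 61-SE";

        // Replace returns a new string. The result is discarded here,
        // so the variable still holds the original text.
        line.Replace(@"\(", "");
        Console.WriteLine(line);

        // Assigning the result back is what updates the variable.
        line = line.Replace(@"\(", "");
        Console.WriteLine(line);
    }
}
```

The first WriteLine still shows the backslash before the open bracket; the second one does not. Note that "\)" is untouched in both cases, so it needs its own Replace call.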

You need to use:

line=line.Replace(@"\(", "");

It looks like you are writing much more code than you actually need.

        // Requires: using System.IO; using System.Linq;
        var allLines = File.ReadAllLines(@"C:\myfile.text");
        // Strip the escaped brackets from every line.
        var correctedLines = allLines.Select(l => l.Replace(@"\(", "").Replace(@"\)", ""));
        // Now use correctedLines in your code.
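
One caveat, judging only from the sample in the question: the columns are fixed width, and "Zeal (I)" fills the 8-character name column. Removing the escapes entirely leaves "Zeal I", which is two characters short, so every later Substring offset on that line would be off. The sketch below swaps each escape for the plain bracket instead, which keeps the widths intact, and then reads a bracketed amount as a negative number. The names LineSketch, Unescape and ParseAmount are hypothetical helpers, and the invariant culture is an assumption about the number format.

```csharp
using System;
using System.Globalization;

public static class LineSketch
{
    // Hypothetical helper: turn the escaped brackets back into plain ones
    // so each column keeps the width the fixed offsets expect.
    public static string Unescape(string line)
    {
        return line.Replace(@"\(", "(").Replace(@"\)", ")");
    }

    // Hypothetical helper: parse an amount field such as "   (7.50)" or
    // "  15.00 ". AllowParentheses makes a bracketed number negative.
    public static decimal ParseAmount(string field)
    {
        const NumberStyles styles =
            NumberStyles.AllowParentheses |
            NumberStyles.AllowDecimalPoint |
            NumberStyles.AllowThousands |
            NumberStyles.AllowLeadingWhite |
            NumberStyles.AllowTrailingWhite;

        // Assumption: '.' is the decimal separator, as in the sample data.
        return decimal.Parse(field, styles, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        string raw = @"Zeal \(I\) 61-SE";
        Console.WriteLine(Unescape(raw));

        // The amount column after unescaping, padded as in the sample.
        Console.WriteLine(ParseAmount("      (7.50)"));
    }
}
```

With this approach the existing Substring offsets can stay as they are, provided Unescape is applied to the line (and the result assigned) before any column is cut out.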