String split with a specified string, without a delimiter, when the text is in the middle

This is a continuation of an earlier post.

Use case #1: when searchedText is at the start or the end of the line (watch), one of the fragments is empty, so I replace the empty fragment with searchText, and that works.

string watch = "Arrests as cops bust 0m money-laundering gang";
string searchedText = "Arrests as cops bust 0m";
string[] fragments = watch.Split(new string[] { searchedText }, StringSplitOptions.None);

Use case #2: when searchedText is in the middle of the line (watch), how do I handle that in the code below?

// This loop executes only two times because fragments can have at most 2 values.
// The issue comes when the searched value is in the middle: the loop should run 3 times,
// because the searched value needs different logic (like changing the background color of the text)
// while the head and tail keep their background color.
// How do I insert the searched value between fragments[0] and fragments[1]?

string watch = "Arrests as cops bust 0m money-laundering gang";
string searchedText = "cops bust";
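To make the two use cases concrete, here is a minimal sketch of what `String.Split` returns for each position of the search text. The `SplitDemo` class and `Fragments` method are names made up for this illustration; only `String.Split` itself comes from the code above.

```csharp
using System;

public static class SplitDemo
{
    // Hypothetical helper: splits `watch` on `searchedText` exactly as in the question.
    // The search text itself is consumed by the split and never appears in the result.
    public static string[] Fragments(string watch, string searchedText)
    {
        return watch.Split(new[] { searchedText }, StringSplitOptions.None);
    }
}
```

With the search text at the start, the first fragment is an empty string, which is what the "replace the empty fragment" trick relies on. With the search text in the middle, both fragments are non-empty, so there is no empty slot left to stand in for the highlighted text.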

Full code:

foreach (SharedStringItem sharedString in sharedStrings)
{
    string innerText = sharedString.InnerText; // This contains complete line (watch)

    if (innerText.IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0)
    {
        sharedString.RemoveAllChildren(); // Remove complete line from spreadsheet because we have to make it again as searched text needs to be highlighted 
        // Split the line so it will give blank for searched text and remaining line 
        string[] fragments = innerText.Split(new string[] { searchText }, StringSplitOptions.None);

        // loop through both words/line
        foreach (var item in fragments)
        {
             DocumentFormat.OpenXml.Spreadsheet.Text text = null;

             // If item is blank append the search text else append the remaining line /word
             if(string.IsNullOrEmpty(item))
                 text = new DocumentFormat.OpenXml.Spreadsheet.Text((item != "" ? " " : String.Empty) + searchText);
             else
                 text = new DocumentFormat.OpenXml.Spreadsheet.Text((item != "" ? " " : String.Empty) + item);

             text.Space = SpaceProcessingModeValues.Preserve;

             // New Run needs to be created for each splitted line/word, run is like a row in spreadsheet
             // You cannot create a single run because you need to take care of searched text as it needs to be highlighted before adding to the row
             Run run = new Run();
             run.Append(text);

             // This code should only be executed for searched text
             if (searchText.Equals(text.InnerText, StringComparison.Ordinal))
             {
                 if (run.RunProperties == null)
                     run.RunProperties = new RunProperties();

                 run.RunProperties.Append(new Color { Rgb = "008000" });
                 run.RunProperties.Append(new DocumentFormat.OpenXml.Spreadsheet.Bold());

             }

             // This line add individual run (Example -> Arrests as + <highlight searched text> + remaining text
            sharedString.Append(run);
        }
    }
}


Case that does not work:

searchedText = merrylands
watch = "httdailytelegraph.com.au/newslocal/parramatta/trio-charged-over-alleged-100m-money-laundering-syndicate-at-merrylands-guildford-west/news-story/92ba3163ce58ad8b49989131fa7a5d8e"

Update: you can try this

        // Requires: using System.Text.RegularExpressions;
        string text = "Trio charged over alleged 0m money laundering syndicate at Merrylands, Guildford West";
        string searchtext = "charged over";
        // Zero-width lookahead: split just before each occurrence, so the search text is kept
        string searchtextPattern = "(?=" + Regex.Escape(searchtext) + ")";

        string[] fragments = Regex.Split(text, searchtextPattern);
        // fragments will have two elements here
        // fragments[0] - "Trio " (note the trailing space)
        // fragments[1] - "charged over alleged 0m money laundering syndicate at Merrylands, Guildford West"

Now you can split the fragment that contains the search text once more, i.e. fragments[1] in this example. See the code below.

            var stringWithoutSearchText = fragments[1].Replace(searchtext, string.Empty);

You need to check whether each fragment contains the search text. You can run a foreach loop over the fragments and add the check below.

     foreach (var item in fragments)
     { 
        if (item.Contains(searchtext))
        { 
          string stringWithoutSearchText = item.Replace(searchtext, string.Empty);
        }
     }

I tried to fit this into your code. You could try something like this:

foreach (SharedStringItem sharedString in sharedStrings)
        {
            string innerText = sharedString.InnerText; // This contains complete line (watch)

            if (innerText.IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                sharedString.RemoveAllChildren(); // Remove complete line from spreadsheet because we have to make it again as searched text needs to be highlighted 
                                                  // Split the line so it will give blank for searched text and remaining line 

                var searchtextPattern = "(?=" + Regex.Escape(searchText) + ")";

                string[] fragments = Regex.Split(innerText, searchtextPattern);

                // loop through both words/line
                foreach (var item in fragments)
                {
                 if (!string.IsNullOrEmpty(item))
                    {

                        //It will check whether the item contains search string or not 

                        // note: Contains and Regex.Split are case-sensitive, unlike the IndexOf check above
                        if (item.Contains(searchText))
                        {
                            // GetRun() is called up to two times here

                            string stringWithoutSearchText = item.Replace(searchText, string.Empty);
                            // in your example the method argument will be "charged over"
                            var run = GetRun(new DocumentFormat.OpenXml.Spreadsheet.Text(" " + searchText));
                            //this code will only execute for search text
                            if (run.RunProperties == null)
                                run.RunProperties = new RunProperties();

                            run.RunProperties.Append(new Color { Rgb = "008000" });
                            run.RunProperties.Append(new DocumentFormat.OpenXml.Spreadsheet.Bold());

                            sharedString.Append(run);
                            // in your example method argument will be  "alleged 0m money laundering syndicate at Merrylands, Guildford West"
                            if (!string.IsNullOrEmpty(stringWithoutSearchText))
                                sharedString.Append(GetRun(new DocumentFormat.OpenXml.Spreadsheet.Text(" " + stringWithoutSearchText)));
                        }
                        else
                        {
                            //in your example method argument "will be Trio"
                            sharedString.Append(GetRun(new DocumentFormat.OpenXml.Spreadsheet.Text(" " + item)));
                        }
                    }
                }
            }
        }

Your GetRun method would look like this:

 private Run GetRun(DocumentFormat.OpenXml.Spreadsheet.Text text)
    {
        text.Space = SpaceProcessingModeValues.Preserve;

        // New Run needs to be created for each splitted line/word, run is like a row in spreadsheet
        // You cannot create a single run because you need to take care of searched text as it needs to be highlighted before adding to the row
        Run run = new Run();
        run.Append(text);
        return run;
    }
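As a possible variation on the lookahead split, `Regex.Split` also includes any captured group in its result, so wrapping the escaped search text in parentheses yields head, match and tail as separate elements without a second `Replace` step. The sketch below is not the approach used above; `CaptureSplit` and `SplitKeepingMatch` are hypothetical names, and dropping empty elements is an assumption about what the caller wants.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureSplit
{
    // Splits `text` on `searchText` but keeps each match as its own element,
    // because Regex.Split adds the text of capturing groups to the result array.
    public static string[] SplitKeepingMatch(string text, string searchText)
    {
        string pattern = "(" + Regex.Escape(searchText) + ")";
        return Regex.Split(text, pattern)
                    .Where(part => part.Length > 0) // drop the empty head/tail when the match is at an edge
                    .ToArray();
    }
}
```

A loop over the result could then highlight the elements equal to the search text and pass the others through `GetRun` unchanged. Like the code above, this match is case-sensitive.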

Case 2:

//if search text is at end
string watch = "Bitcoin ATMs Highlight Flaws in EU Money Laundering Rules";
string searchtext = "Money Laundering Rules";
//fragment of above string by using Regex.Split will be like 
// fragments[0] - "Bitcoin ATMs Highlight Flaws in EU " (note the trailing space)
// fragments[1] - "Money Laundering Rules"

Case 3:

//if search text is at start
string watch = "Money Laundering Rules Bitcoin ATMs Highlight Flaws in EU";
string searchtext = "Money Laundering Rules";
//fragment of above string by using Regex.Split will be like 
// fragments[0] - ""
// fragments[1] - "Money Laundering Rules Bitcoin ATMs Highlight Flaws in EU"

Check these three cases against the code above.
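The outer check in the loop uses `StringComparison.OrdinalIgnoreCase`, while `Regex.Split` and `Contains` as written are case-sensitive, so a search for "Merrylands" against a lower-case URL would pass the check and then find nothing to highlight. A rough sketch of one way to cover the start, middle and end cases with the same case-insensitive comparison is shown below. It uses `IndexOf` instead of a regex, and `Segment`, `Highlighter` and `SplitSegments` are names invented for this example, not part of OpenXml or the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value holder: a piece of the line and whether it is the searched text.
public sealed class Segment
{
    public readonly string Text;
    public readonly bool IsMatch;
    public Segment(string text, bool isMatch) { Text = text; IsMatch = isMatch; }
}

public static class Highlighter
{
    // Walks the line and emits head, match and tail pieces in order.
    // Assumes `line` is not null; an empty search text yields the whole line unhighlighted.
    public static List<Segment> SplitSegments(string line, string searchText)
    {
        var segments = new List<Segment>();
        if (string.IsNullOrEmpty(searchText))
        {
            if (line.Length > 0) segments.Add(new Segment(line, false));
            return segments;
        }

        int position = 0;
        while (position < line.Length)
        {
            int hit = line.IndexOf(searchText, position, StringComparison.OrdinalIgnoreCase);
            if (hit < 0) break;

            // Plain text before the match (skipped when the match is at the current position).
            if (hit > position)
                segments.Add(new Segment(line.Substring(position, hit - position), false));

            // The match itself, taken from the line so its original casing is preserved.
            segments.Add(new Segment(line.Substring(hit, searchText.Length), true));
            position = hit + searchText.Length;
        }

        // Whatever remains after the last match.
        if (position < line.Length)
            segments.Add(new Segment(line.Substring(position), false));
        return segments;
    }
}
```

Each segment could then be wrapped with `GetRun`, adding the color and bold run properties only when `IsMatch` is true, which also avoids prefixing every run with an extra space.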
