Why is 1 of 8 unit tests failing? Here is the task:
This unit test:
[TestCase(
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.",
"ipsum\namet\neuysmod\nlabore\naliqua",
ExpectedResult = "Lorem dolor sit , consectetur adipiscing elit, sed do eiusmod tempor incididunt ut et dolore magna .")]
public static void RemoveWordsFromContentAndWrite(StreamReader contentReader, StreamReader wordsReader, StreamWriter outputWriter)
{
    // TODO #5-5. Implement the method by reading the content and words, removing words from the content, and writing the updated content to the outputWriter. Use StreamReader.Peek method for checking whether there are more characters in the underlying string.
    var words = wordsReader.ReadToEnd();
    var bufferSize = 100;
    var buffer = new char[bufferSize];
    var bytesCount = 0;
    var currentWord = new StringBuilder();
    while ((bytesCount = contentReader.ReadBlock(buffer, 0, bufferSize)) > 0)
    {
        for (var i = 0; i < bytesCount; i++)
        {
            if (char.IsLetterOrDigit(buffer[i]))
            {
                currentWord.Append(buffer[i]);
            }
            else
            {
                if (!words.Contains(currentWord.ToString()))
                {
                    outputWriter.Write(currentWord);
                }
                outputWriter.Write(buffer[i]);
                currentWord.Clear();
            }
        }
        outputWriter.Flush();
    }
    if (!string.IsNullOrEmpty(currentWord.ToString()) && !words.Contains(currentWord.ToString()))
    {
        outputWriter.Write(currentWord);
        outputWriter.Flush();
    }
}
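If you print out the value of words and think about it for a moment, the logic error becomes fairly obvious. Remember that ReadToEnd returns the contents of the stream as a single string.
The unit test result shows that the word "et" was removed from the text. You are doing a plain substring check against words, and that string does contain the adjacent characters "et" (as part of "amet").
What you probably want is to check for whole words. There are several ways to do that, but a trivial change is: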
var words = wordsReader.ReadToEnd().Split('\n');
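String.Split returns an array of strings. LINQ provides the Enumerable.Contains extension method for IEnumerable<T>, which compares whole elements and so performs the whole-word search you want. As long as your code has a using System.Linq; directive, the rest of your method works as-is.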
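To make the difference concrete, here is a minimal sketch that runs both checks against the word list from the test case. The class and method names (ContainsDemo, SubstringMatch, WholeWordMatch) are made up for illustration and are not part of the original task.

```csharp
using System;
using System.Linq;

public static class ContainsDemo
{
    // string.Contains: true if "word" occurs anywhere inside the raw text.
    public static bool SubstringMatch(string rawWords, string word)
    {
        return rawWords.Contains(word);
    }

    // Enumerable.Contains on string[]: true only if some element equals "word".
    public static bool WholeWordMatch(string rawWords, string word)
    {
        string[] words = rawWords.Split('\n');
        return words.Contains(word);
    }

    public static void Main()
    {
        // Same word list as in the failing test case.
        const string raw = "ipsum\namet\neuysmod\nlabore\naliqua";

        Console.WriteLine(SubstringMatch(raw, "et"));   // True: "et" is inside "amet"
        Console.WriteLine(WholeWordMatch(raw, "et"));   // False: no element equals "et"
        Console.WriteLine(WholeWordMatch(raw, "amet")); // True
    }
}
```

One caveat, assuming the word list might come from a file with Windows line endings: Split('\n') would leave a trailing '\r' on each word, so trimming each element may be needed in that case.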
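Note that your code has a number of other problems. This isn't the place for a code review, but one important thing to keep in mind is that what you call bytes (bytesCount) is not a byte count at all: it is a count of UTF-16 characters (char values), each two bytes wide. Confusing bytes with characters will not end well for your future code!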
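A small sketch of that distinction, assuming UTF-8 encoded input (the task does not say which encoding the streams use): the value returned by ReadBlock is the number of char values read, which differs from the number of bytes in the underlying stream as soon as the text contains non-ASCII characters. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class CharVersusByteDemo
{
    // Number of char values that StreamReader.ReadBlock reports for the text.
    public static int CountCharsRead(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8))
        {
            var buffer = new char[100];
            return reader.ReadBlock(buffer, 0, buffer.Length);
        }
    }

    // Number of bytes the same text occupies when encoded as UTF-8.
    public static int CountUtf8Bytes(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    public static void Main()
    {
        // "naïve café" written with escapes so the source file encoding does not matter.
        const string text = "na\u00EFve caf\u00E9";

        Console.WriteLine(CountCharsRead(text)); // 10 chars
        Console.WriteLine(CountUtf8Bytes(text)); // 12 bytes: the two accented letters take 2 bytes each
    }
}
```

Naming the variable something like charsRead instead of bytesCount would keep the two quantities from being mixed up.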