C# implementation of a Dictionary to create an index of words
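I'm currently building an application that does some text processing on a text file. I use a dictionary to create an index of words: the program reads the text file and checks whether each word is already present. If it is, it should print out the word's index number and continue checking.
I've tried implementing some code to create the dictionary. The code I'm using is as follows: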
    private void bagofword_Click(object sender, EventArgs e)
    {
        // creating dictionary in background: word -> number of occurrences
        Dictionary<string, int> dict = new Dictionary<string, int>();
        string rawinputbow = File.ReadAllText(textBox31.Text);
        string[] inputbow = rawinputbow.Split(' ');
        foreach (String word in inputbow)
        {
            if (dict.ContainsKey(word))
            {
                dict[word]++;
            }
            else
            {
                dict[word] = 1;
            }
        }
        // order the words by descending frequency
        var ordered = from k in dict.Keys
                      orderby dict[k] descending
                      select k;
        // verbatim string, so the backslash is not read as an escape sequence
        using (StreamWriter output = new StreamWriter(@"D:\output.txt"))
        {
            foreach (String k in ordered)
            {
                output.WriteLine(String.Format("{0}: {1}", k, dict[k]));
            }
        } // the using block closes the writer
    }
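Here is a sample of the text file I'm using as input: http://pastebin.com/ZRVbhWhV
A quick Ctrl-F shows that "not" occurs 2 times and "that" occurs 4 times. What I need to do is index each word and refer to it like this: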
    sample input : "that I have not had not that place"

    dictionary :        output.txt:
    index  word         5
    1      I            1
    2      have         2
    3      had          4
    4      not          3
    5      that         4
    6      place        5
                        6
Does anyone know how to complete this code? Any help is much appreciated!
Try something like this:
    // Requires: System, System.Collections.Generic,
    //           System.Collections.Specialized, System.Linq
    void Main()
    {
        var txt = "that i have not had not that place"
            .Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
            .ToList();
        var dict = new OrderedDictionary();   // keeps words in insertion order
        var output = new List<int>();         // index of each word, in reading order
        foreach (var element in txt.Select((word, index) => new { word, index }))
        {
            if (dict.Contains(element.word))
            {
                // known word: increment its count
                var count = (int)dict[element.word];
                dict[element.word] = ++count;
                output.Add(GetIndex(dict, element.word));
            }
            else
            {
                // new word: it is appended at the end of the dictionary
                dict[element.word] = 1;
                output.Add(GetIndex(dict, element.word));
            }
        }
    }
    public int GetIndex(OrderedDictionary dictionary, string key)
    {
        for (int index = 0; index < dictionary.Count; index++)
        {
            // reference comparison on the stored (boxed) value objects
            if (dictionary[index] == dictionary[key])
                return index; // We found the item
        }
        return -1;
    }
Result:
dict = (6 items)
that = 2
i = 1
have = 1
not = 2
had = 1
place = 1
output = (8 items)
0
1
2
3
4
3
0
5
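If the occurrence counts are not needed, the same idea might be sketched with a plain Dictionary<string, int> that maps each word straight to its index, so no linear GetIndex scan is required. This is only a minimal sketch under a few assumptions that are not in the question: the names WordIndexer and IndexWords are hypothetical, indices are 1-based and handed out in order of first appearance (so the numbering differs from the sample table in the question, which uses another order), and words are compared case-insensitively.

```csharp
using System;
using System.Collections.Generic;

public static class WordIndexer   // hypothetical helper, not from the original post
{
    // Gives each distinct word a 1-based index in order of first appearance
    // and returns the index of every word of the input, in reading order.
    public static List<int> IndexWords(string text, Dictionary<string, int> index)
    {
        var output = new List<int>();
        var words = text.Split(new[] { ' ', '\t', '\r', '\n' },
                               StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            if (!index.TryGetValue(word, out int id))
            {
                id = index.Count + 1;   // next unused index
                index[word] = id;
            }
            output.Add(id);
        }
        return output;
    }

    public static void Main()
    {
        // Assumption: "That" and "that" should count as the same word.
        var index = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        var output = IndexWords("that I have not had not that place", index);
        Console.WriteLine(string.Join(" ", output));   // prints "1 2 3 4 5 4 1 6"
    }
}
```

To produce the two listings from the question, the entries of index could be written out as the dictionary and the output list as output.txt, for example with File.WriteAllLines.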