Pseudocode to find the number of occurrences of characters in a document
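I am trying to write pseudocode for the MapReduce technique that finds the number of occurrences of each character in a document. For example: `m`: 1000 times, `M`: 5000 times, `" "`: 3000 times, `\n`: 100 times, `.`: 20000 times, etc.
Can someone tell me whether this is correct, or how I could improve it?
I have written the pseudocode shown below: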
def Map(documentName, documentContent):
    for Character in documentContent:
        EmitIntermediate(Character, 1)
def Reduce(Character, Counts):
    Char_Count = 0
    for count in Counts:
        Char_Count += count
    Emit(Character, Char_Count)
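
A minimal in-memory sketch of the pseudocode above, in Python, may help check the logic. The names `map_chars`, `reduce_counts`, and `run_mapreduce` are hypothetical, and the dictionary grouping only stands in for the shuffle step that a real MapReduce framework performs between `EmitIntermediate` and `Reduce`.

```python
from collections import defaultdict


def map_chars(document_name, document_content):
    """Map step: yield (character, 1) for every character in the document."""
    for character in document_content:
        yield character, 1


def reduce_counts(character, counts):
    """Reduce step: sum all counts collected for one character."""
    return character, sum(counts)


def run_mapreduce(documents):
    """Hypothetical single-process driver standing in for the framework."""
    grouped = defaultdict(list)
    # Shuffle: group the intermediate values by key (the character).
    for name, content in documents.items():
        for key, value in map_chars(name, content):
            grouped[key].append(value)
    # Reduce: one call per distinct character.
    return dict(reduce_counts(key, values) for key, values in grouped.items())


result = run_mapreduce({"doc1": "Mm m.\n", "doc2": "M.\n"})
print(result)
```

Upper and lower case letters, spaces, and newlines are all distinct keys here, which matches the goal of counting `m` and `M` separately.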
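I wrote this after referring to some MapReduce pseudocode available online.
For example, the following pseudocode is commonly used to find the number of times each word occurs in a document: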
def map(documentName, documentContent):
    for line in documentContent:
        words = line.split(" ")
        for word in words:
            EmitIntermediate(word, 1)
def reduce(word, counts):
    wordCount = 0
    for count in counts:
        wordCount += count
    Emit(word, wordCount)
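
For comparison, the word-count pattern can be sketched the same way in plain Python. This is an illustration only: `map_words`, `reduce_words`, and `word_count` are hypothetical names, and the sketch deliberately uses `split()` instead of `split(" ")` so that repeated spaces do not produce empty "words".

```python
from collections import defaultdict


def map_words(document_name, document_content):
    """Map step: yield (word, 1) for every word on every line."""
    for line in document_content.splitlines():
        # split() with no argument skips runs of whitespace, so no empty
        # strings are emitted; split(" ") would emit "" between two spaces.
        for word in line.split():
            yield word, 1


def reduce_words(word, counts):
    """Reduce step: sum all counts collected for one word."""
    return word, sum(counts)


def word_count(documents):
    """Hypothetical single-process driver: map, group by key, reduce."""
    grouped = defaultdict(list)
    for name, content in documents.items():
        for word, one in map_words(name, content):
            grouped[word].append(one)
    return dict(reduce_words(word, ones) for word, ones in grouped.items())


counts = word_count({"doc1": "the cat  sat\nthe end"})
print(counts)
```

The only structural difference from the character count is what the map step treats as a key: a word here, a single character there.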
Following that pattern, the character count can also be written line by line:
def Map(documentName, documentContent):
    for line in documentContent:
        Line_String = line
        for Character in Line_String:
            EmitIntermediate(Character, 1)
def Reduce(Character, Counts):
    Char_Count = 0
    for count in Counts:
        Char_Count += count
    Emit(Character, Char_Count)
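
One detail worth checking in the line-by-line version is whether the newline characters survive the split into lines, since the goal includes counting `\n`. The sketch below assumes the document arrives as a single Python string; `count_chars_by_line` is a hypothetical helper that does the map and reduce work in one process.

```python
from collections import Counter


def count_chars_by_line(document_content, keep_newlines):
    """Count characters while iterating over the document line by line."""
    counts = Counter()
    # keepends=True keeps each line terminator, so newlines are counted;
    # keepends=False (the default for str.splitlines) drops them.
    for line in document_content.splitlines(keepends=keep_newlines):
        for character in line:
            counts[character] += 1
    return counts


text = "Mm.\nm.\n"
whole = Counter(text)  # character by character over the whole document
with_newlines = count_chars_by_line(text, keep_newlines=True)
without_newlines = count_chars_by_line(text, keep_newlines=False)

print(whole == with_newlines)      # both variants agree when terminators are kept
print(without_newlines["\n"])      # newline count when terminators are dropped
```

Under these assumptions the two variants give the same totals only if the line splitting keeps the terminators; otherwise iterating over the whole content directly, as in the first version, is the simpler choice.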