Function returning the same variable twice, separated by a comma
I don't understand the point of this function returning the same variable twice:
def construct_shingles(doc,k,h):
    #print 'antes -> ',doc,len(doc)
    doc = doc.lower()
    doc = ''.join(doc.split(' '))
    #print 'depois -> ',doc,len(doc)
    shingles = {}
    for i in xrange(len(doc)):
        substr = ''.join(doc[i:i+k])
        if len(substr) == k and substr not in shingles:
            shingles[substr] = 1
    if not h:
        return doc,shingles.keys()
    ret = tuple(shingles_hashed(shingles))
    return ret,ret
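It looks redundant, but there must be a good reason for it; I just don't see what it is. Maybe it is because there are two return statements? If 'h' is true, does the function go through both return statements? The calling function looks like this: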
def construct_set_shingles(docs,k,h=False):
    shingles = []
    for i in xrange(len(docs)):
        doc = docs[i]
        doc,sh = construct_shingles(doc,k,h)
        docs[i] = doc
        shingles.append(sh)
    return docs,shingles
and
def shingles_hashed(shingles):
    global len_buckets
    global hash_table
    shingles_hashed = []
    for substr in shingles:
        key = hash(substr)
        shingles_hashed.append(key)
        hash_table[key].append(substr)
    return shingles_hashed
The dataset and the function call are as follows:
k = 3 #number of shingles
d0 = "i know you"
d1 = "i think i met you"
d2 = "i did that"
d3 = "i did it"
d4 = "she says she knows you"
d5 = "know you personally"
d6 = "i think i know you"
d7 = "i know you personally"
docs = [d0,d1,d2,d3,d4,d5,d6,d7]
docsChange,shingles = construct_set_shingles(docs[:],k)
GitHub location: lsh/LHS
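Your guess is on the right track. As for why the function ends with return ret,ret: that statement returns a pair of equal values instead of a single value.
It is more a matter of coding style than something the algorithm requires, since the same result could be written in other ways. It can still be useful in some cases. For example, if we write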
def func(x, y, z):
    ...
    return ret

a = func(x, y, z)
b = func(x, y, z)
then func is executed twice. But if we write:
def func(x, y, z):
    ...
    return ret, ret

a, b = func(x, y, z)
then func is executed only once, and its result is still assigned to both a and b.
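The difference between the two call patterns can be made visible with a small sketch. Everything here is illustrative and not taken from the question: func_single, func_pair, and the calls counter are hypothetical names, and the body simply adds its arguments. It is written for Python 3.

```python
calls = 0  # hypothetical counter, only used to show how often the body runs


def func_single(x, y, z):
    """Return one value; filling two names needs two calls."""
    global calls
    calls += 1
    return x + y + z


def func_pair(x, y, z):
    """Return the same value twice, packed into a 2-tuple."""
    global calls
    calls += 1
    ret = x + y + z
    return ret, ret


# Pattern 1: two separate calls, so the body runs twice.
calls = 0
a = func_single(1, 2, 3)
b = func_single(1, 2, 3)
single_calls = calls

# Pattern 2: one call, unpacked into two names, so the body runs once.
calls = 0
a2, b2 = func_pair(1, 2, 3)
pair_calls = calls

print(single_calls, pair_calls)
```

Note that a single call followed by b = a would also run the body only once, which is why returning the pair is largely a style choice.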
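Likewise, in your specific case, the caller unpacks the result with doc,sh = construct_shingles(doc,k,h), so whichever return statement runs has to supply two values. Only one of the two return statements is executed per call.
If h is false, the function runs until it reaches the line return doc,shingles.keys(), and the variables doc and sh in the caller take the values doc and shingles.keys() respectively.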
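Otherwise, the first return statement is skipped and the second one is executed, so doc and sh take equal values, namely the value of ret, which is tuple(shingles_hashed(shingles)).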
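As a rough illustration of the two branches, here is a simplified Python 3 sketch of the function from the question. It rests on a few assumptions: construct_shingles_sketch is a hypothetical name, xrange is replaced by range, shingles.keys() is wrapped in list(), and the hashing step calls the built-in hash() directly instead of shingles_hashed, so the global hash_table is left out.

```python
def construct_shingles_sketch(doc, k, h):
    """Simplified stand-in for construct_shingles; always returns a 2-tuple."""
    doc = "".join(doc.lower().split(" "))
    shingles = {}
    for i in range(len(doc)):
        substr = doc[i:i + k]
        if len(substr) == k and substr not in shingles:
            shingles[substr] = 1
    if not h:
        # First branch: two different values (cleaned text, list of shingles).
        return doc, list(shingles.keys())
    # Second branch: assumption, hash each shingle directly rather than
    # going through shingles_hashed and its global hash_table.
    ret = tuple(hash(s) for s in shingles)
    return ret, ret  # same object twice, so the caller can still unpack a pair


# The caller unpacks two names in both cases.
doc_plain, sh_plain = construct_shingles_sketch("i know you", 3, False)
doc_hashed, sh_hashed = construct_shingles_sketch("i know you", 3, True)

print(doc_plain, sh_plain)
print(doc_hashed is sh_hashed)
```

With h set to true, both unpacked names refer to the same tuple of hashes, which means the caller in the question ends up storing the hashes in docs[i] as well as in shingles.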