Why [20, ..., 13, 14].min(2) => [13, 20]?
[20, 32, 32, 21, 30, 25, 29, 13, 14].min(2)
# => [13, 20]
Why isn't it [13, 14]? And how do I get what I want, the two smallest elements (in linear time)?
The doc's sentence "If the n argument is given, minimum n elements are returned as an array" isn't quite clear to me, but I think it says min(2) should give me the smallest two elements. I couldn't find much about it, but this thread, which might be the origin, seems to agree with me and says it should return the same as sort.first(n), which it doesn't:
[20, 32, 32, 21, 30, 25, 29, 13, 14].sort.first(2)
# => [13, 14]
Sorry if this is a dumb question, and sorry for the "large" example, but it is already reduced: removing one more number (other than 13 or 14) does give me [13, 14].
I just posted an explanation of the bug in the Ruby Issue Tracking System:
I suppose I found the problem. Taking the first example:
[20, 32, 32, 21, 30, 25, 29, 13, 14].min(2)
This will call the function "nmin_run" in the file "enum.c", which
sets "bufmax" to 4 times the number of minimums (n) we want (for the
example, bufmax is 8), and then in the line 1327 will call
the function "nmin_i" for each element of the original array.
In the function "nmin_i", when the buffer is full ("data->curlen ==
data->bufmax"), the function "nmin_filter" is called. In the example,
that happens when curlen is 8, and so the buffer is [20, 32, 32, 21,
30, 25, 29, 13]. The "nmin_filter" will do a quicksort until the n
smallest elements so far are on the leftmost part of the buffer, and
will discard the rest of the elements, which leaves us with [20, 13]
in the buffer.
And now starts the problem. At the end of "nmin_filter" the limit
(apparently with the intention of storing the greatest value in the
buffer) is set to the last value in the buffer (in the example, 13),
which is not true. And then based on that value "nmin_i" will discard
all remaining elements greater than that (in the example, discarding
the 14). The buffer is then sorted and it returns:
[13, 20]
So the solution is either remove all the limit-related part, or take
the last pivot as the limit.
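The failure described in the report can be reproduced with a small Ruby model. This is only a sketch of the behaviour as described above, not the code in enum.c: the method name nmin_model, the use_limit flag, and the assumption that the filter keeps the n smallest elements in their original relative order (chosen so that the buffer becomes [20, 13] as in the report) are all mine.

```ruby
# Simplified model of the buffering described in the report.
# NOT the real enum.c code; names and the order-preserving filter are assumptions.
def nmin_model(array, n, use_limit: true)
  bufmax = 4 * n
  buf = []
  limit = nil

  array.each do |x|
    # With a limit set, later elements greater than it are dropped.
    next if limit && x > limit

    buf << x
    next unless buf.size == bufmax

    # "Filter": keep the n smallest values, in their original relative order.
    buf = buf.each_with_index.sort.first(n).sort_by(&:last).map(&:first)
    # The flaw from the report: the last buffered value is taken as the limit,
    # although it is not necessarily the greatest value kept.
    limit = buf.last if use_limit
  end

  buf.sort.first(n)
end

data = [20, 32, 32, 21, 30, 25, 29, 13, 14]
p nmin_model(data, 2)                    # => [13, 20]
p nmin_model(data, 2, use_limit: false)  # => [13, 14]
```

With the limit in place the buffer is [20, 13] after the filter, the limit becomes 13, and 14 is dropped. Passing use_limit: false corresponds to the first fix suggested in the report, removing the limit-related part.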
By the way, to answer your question...
And how do I get what I want, the two smallest elements (in linear time)?
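If this method didn't exist, or in the meantime while it is broken, you can select the two smallest elements in linear time using Quickselect, which is basically what Ruby does in min under the hood.
Here is my direct translation from Wikipedia: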
class Array
  # Returns the n smallest elements, sorted, using Quickselect.
  def mymin(n)
    return [] if n <= 0        # guard: the loop below assumes n >= 1
    return sort if size <= n

    a = dup
    left = 0
    right = a.size - 1
    loop do
      # Take the middle element as the pivot and move it to the end.
      pivot_index = left + (right - left) / 2
      pivot_value = a[pivot_index]
      a[pivot_index], a[right] = a[right], a[pivot_index]
      # Move every element smaller than the pivot to the left side.
      store_index = left
      left.upto(right - 1) do |i|
        if a[i] < pivot_value
          a[store_index], a[i] = a[i], a[store_index]
          store_index += 1
        end
      end
      # Put the pivot into its final sorted position.
      a[right], a[store_index] = a[store_index], a[right]
      if n - 1 == store_index
        break
      elsif n - 1 < store_index
        right = store_index - 1
      else
        left = store_index + 1
      end
    end
    a.take(n).sort
  end
end
Then let's try your example:
[20, 32, 32, 21, 30, 25, 29, 13, 14].mymin(2)
# => [13, 14]
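Yay! We just fixed min. Note, however, that the space complexity of this implementation is linear in the size of the original array, while Ruby's implementation is linear in the value n. Also, if your original array has too many duplicates, this will perform poorly, and you should look into 3-way partitioning.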
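As a rough illustration of 3-way partitioning, here is a minimal sketch that splits the candidates into elements less than, equal to, and greater than the pivot, so a run of duplicates is settled in a single pass. The method name smallest_n_three_way is hypothetical, and for readability it builds new arrays with Array#partition instead of partitioning in place, so it still uses space linear in the array size.

```ruby
# Sketch: n smallest elements via Quickselect with a three-way split.
# Hypothetical helper, not part of Ruby; allocates new arrays on each pass.
def smallest_n_three_way(array, n)
  return [] if n <= 0
  return array.sort if array.size <= n

  candidates = array.dup
  result = []
  while result.size < n
    pivot = candidates[candidates.size / 2]
    less, rest = candidates.partition { |x| x < pivot }
    equal, greater = rest.partition { |x| x == pivot }
    needed = n - result.size

    if less.size >= needed
      # Everything still needed lies among the smaller elements.
      candidates = less
    elsif less.size + equal.size >= needed
      # The smaller elements plus some copies of the pivot finish the job.
      result.concat(less).concat(equal.first(needed - less.size))
    else
      # Take all smaller and equal elements, keep searching the greater ones.
      result.concat(less).concat(equal)
      candidates = greater
    end
  end
  result.sort
end

p smallest_n_three_way([20, 32, 32, 21, 30, 25, 29, 13, 14], 2)  # => [13, 14]
p smallest_n_three_way([5, 5, 5, 5, 1], 3)                       # => [1, 5, 5]
```

In the second call all four copies of 5 land in the equal group on the first pass, which is the case where the two-way version above does the most redundant work.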
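If you only want min for n = 2 and are really worried about performance, you can write an optimized version for that case with guaranteed O(L) time (where L is the length of the array).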
class Array
  def min2
    m1 = nil
    m2 = nil
    each do |x|
      if m1.nil? || x < m1
        m2 = m1
        m1 = x
      elsif m2.nil? || x < m2
        m2 = x
      end
    end
    [m1, m2].compact
  end
end
And use it in a similar way:
[20, 32, 32, 21, 30, 25, 29, 13, 14].min2
# => [13, 14]