Groovy 的 trampoline() 使递归执行速度变慢 - 为什么?
Groovy's trampoline() makes recursive execution much slower - why?
我正在尝试递归:
def fac
//fac = { int curr, res = 1G -> 1 >= curr ? res : fac( curr - 1, res * curr ) }
fac = { int curr, res = 1G -> 1 >= curr ? res : fac.trampoline( curr - 1, res * curr ) }
fac = fac.trampoline()
def rnd = new Random()
long s = System.currentTimeMillis()
100000.times{ fac rnd.nextInt( 40 ) }
println "done in ${System.currentTimeMillis() - s} ms / ${fac(40)}"
如果我这样使用它,我会得到这个:
done in 691 ms
如果我取消注释行 #2 和注释行 #3-4 以删除 trampoline()
和 运行 它,我得到的数字会明显减少:
done in 335 ms
因此,使用 trampoline 时,递归速度会慢 2 倍。
我错过了什么?
P.S.
如果我 运行 Scala 2.12 中的相同示例:
def fac( curr:Int, acc:BigInt = 1 ):BigInt = if( 1 >= curr ) acc else fac( curr - 1, curr * acc )
val s = System.currentTimeMillis
for( ix <- 0 until 100000 ) fac( scala.util.Random.nextInt(40).toInt )
println( s"done in ${System.currentTimeMillis - s} ms" )
它执行得更快一点:
done in 178 ms
更新
将闭包重写为带有注解的方法:
@groovy.transform.TailRecursive
def fac( int curr, res = 1G ) { 1 >= curr ? res : fac( curr - 1, res * curr ) }
// the rest
给予
done in 164 ms
并且是超级coll。尽管如此,我还是想知道trampoline()
:)
如文档中所述,Closure.trampoline()
可防止调用堆栈溢出。
Recursive algorithms are often restricted by a physical limit: the maximum stack height. For example, if you call a method that recursively calls itself too deep, you will eventually receive a WhosebugException
.
An approach that helps in those situations is by using Closure
and its trampoline capability.
Closures are wrapped in a TrampolineClosure
. Upon calling, a trampolined Closure
will call the original Closure
waiting for its result. If the outcome of the call is another instance of a TrampolineClosure
, created perhaps as a result to a call to the trampoline()
method, the Closure will again be invoked. This repetitive invocation of returned trampolined Closures instances will continue until a value other than a trampolined Closure
is returned. That value will become the final result of the trampoline. That way, calls are made serially, rather than filling the stack.
但是,使用蹦床是有代价的。让我们看一下 JVisualVM 示例。
Non-trampoline 用例
运行 没有 trampoline()
的示例我们在 ~441 毫秒内得到结果
done in 441 ms / 815915283247897734345611269596115894272000000000
此执行分配了约 2,927,550 个对象并消耗了大约 100 MB 的内存。
CPU有点小事,除了花时间在main()
和run()
方法上,它在强制参数上花费了一些周期。
trampoline()
用例
蹦床的引入确实改变了很多。首先,与之前的尝试相比,它使执行时间几乎慢了两倍。
done in 856 ms / 815915283247897734345611269596115894272000000000
其次,它分配 ~5,931,470 (!!!) 对象并消耗~221 MB 内存。主要区别在于,在前一种情况下,所有执行都使用了一个 $_main_closure1
,而在使用蹦床的情况下 - 每次调用 trampoline()
方法都会创建:
- 一个新的
$_main_closure1
对象
- 用
CurriedClosure<T>
包裹
- 然后用
TrampolineClosure<T>
包裹起来
仅此分配了超过 1,200,000 个对象。
如果说到CPU,它还有更多的事情要做。看看数字:
- 对
TrampolineClosure<T>.<init>()
的所有调用消耗 199 毫秒
- 使用蹦床引入了对
PojoeMetaMethodSite$PojoCachedMethodSietNoUnwrap.invoke()
的调用,这总共消耗了额外的 201 毫秒
- 对
CachedClass.initValue()
的所有调用总共额外消耗 98.8 毫秒
- 对
ClosureMetaClass$NormalMethodChooser.chooseMethod()
的所有调用总共额外消耗 100 毫秒
这正是为什么在您的案例中引入蹦床会使代码执行速度变慢的原因。
那为什么 @TailRecursive
做得更好呢?
简而言之 - @TailRecursive
注释用旧的 while-loop 替换了所有闭包和递归调用。 @TailRecursive
的阶乘函数在字节码级别看起来像这样:
//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//
package factorial;
import groovy.lang.GroovyObject;
import groovy.lang.MetaClass;
import java.math.BigInteger;
import org.codehaus.groovy.runtime.ScriptBytecodeAdapter;
import org.codehaus.groovy.runtime.dgmimpl.NumberNumberMultiply;
import org.codehaus.groovy.transform.tailrec.GotoRecurHereException;
public class Groovy implements GroovyObject {
public Groovy() {
MetaClass var1 = this.$getStaticMetaClass();
this.metaClass = var1;
}
public static BigInteger factorial(int number, BigInteger acc) {
BigInteger _acc_ = acc;
int _number_ = number;
try {
while(true) {
try {
while(_number_ != 1) {
int __number__ = _number_;
int var7 = _number_ - 1;
_number_ = var7;
Number var8 = NumberNumberMultiply.multiply(__number__, _acc_);
_acc_ = (BigInteger)ScriptBytecodeAdapter.castToType(var8, BigInteger.class);
}
BigInteger var4 = _acc_;
return var4;
} catch (GotoRecurHereException var13) {
;
}
}
} finally {
;
}
}
public static BigInteger factorial(int number) {
return factorial(number, (BigInteger)ScriptBytecodeAdapter.castToType(1, BigInteger.class));
}
}
我前段时间在我的博客上记录了这个用例。如果您想了解更多信息,可以阅读博客post:
https://e.printstacktrace.blog/tail-recursive-methods-in-groovy/
我正在尝试递归:
def fac
//fac = { int curr, res = 1G -> 1 >= curr ? res : fac( curr - 1, res * curr ) }
fac = { int curr, res = 1G -> 1 >= curr ? res : fac.trampoline( curr - 1, res * curr ) }
fac = fac.trampoline()
def rnd = new Random()
long s = System.currentTimeMillis()
100000.times{ fac rnd.nextInt( 40 ) }
println "done in ${System.currentTimeMillis() - s} ms / ${fac(40)}"
如果我这样使用它,我会得到这个:
done in 691 ms
如果我取消注释行 #2 和注释行 #3-4 以删除 trampoline()
和 运行 它,我得到的数字会明显减少:
done in 335 ms
因此,使用 trampoline 时,递归速度会慢 2 倍。
我错过了什么?
P.S.
如果我 运行 Scala 2.12 中的相同示例:
def fac( curr:Int, acc:BigInt = 1 ):BigInt = if( 1 >= curr ) acc else fac( curr - 1, curr * acc )
val s = System.currentTimeMillis
for( ix <- 0 until 100000 ) fac( scala.util.Random.nextInt(40).toInt )
println( s"done in ${System.currentTimeMillis - s} ms" )
它执行得更快一点:
done in 178 ms
更新
将闭包重写为带有注解的方法:
@groovy.transform.TailRecursive
def fac( int curr, res = 1G ) { 1 >= curr ? res : fac( curr - 1, res * curr ) }
// the rest
给予
done in 164 ms
并且是超级coll。尽管如此,我还是想知道trampoline()
:)
如文档中所述,Closure.trampoline()
可防止调用堆栈溢出。
Recursive algorithms are often restricted by a physical limit: the maximum stack height. For example, if you call a method that recursively calls itself too deep, you will eventually receive a
WhosebugException
.An approach that helps in those situations is by using
Closure
and its trampoline capability.Closures are wrapped in a
TrampolineClosure
. Upon calling, a trampolinedClosure
will call the originalClosure
waiting for its result. If the outcome of the call is another instance of aTrampolineClosure
, created perhaps as a result to a call to thetrampoline()
method, the Closure will again be invoked. This repetitive invocation of returned trampolined Closures instances will continue until a value other than a trampolinedClosure
is returned. That value will become the final result of the trampoline. That way, calls are made serially, rather than filling the stack.
但是,使用蹦床是有代价的。让我们看一下 JVisualVM 示例。
Non-trampoline 用例
运行 没有 trampoline()
的示例我们在 ~441 毫秒内得到结果
done in 441 ms / 815915283247897734345611269596115894272000000000
此执行分配了约 2,927,550 个对象并消耗了大约 100 MB 的内存。
CPU有点小事,除了花时间在main()
和run()
方法上,它在强制参数上花费了一些周期。
trampoline()
用例
蹦床的引入确实改变了很多。首先,与之前的尝试相比,它使执行时间几乎慢了两倍。
done in 856 ms / 815915283247897734345611269596115894272000000000
其次,它分配 ~5,931,470 (!!!) 对象并消耗~221 MB 内存。主要区别在于,在前一种情况下,所有执行都使用了一个 $_main_closure1
,而在使用蹦床的情况下 - 每次调用 trampoline()
方法都会创建:
- 一个新的
$_main_closure1
对象 - 用
CurriedClosure<T>
包裹
- 然后用
TrampolineClosure<T>
包裹起来
仅此分配了超过 1,200,000 个对象。
如果说到CPU,它还有更多的事情要做。看看数字:
- 对
TrampolineClosure<T>.<init>()
的所有调用消耗 199 毫秒 - 使用蹦床引入了对
PojoeMetaMethodSite$PojoCachedMethodSietNoUnwrap.invoke()
的调用,这总共消耗了额外的 201 毫秒 - 对
CachedClass.initValue()
的所有调用总共额外消耗 98.8 毫秒 - 对
ClosureMetaClass$NormalMethodChooser.chooseMethod()
的所有调用总共额外消耗 100 毫秒
这正是为什么在您的案例中引入蹦床会使代码执行速度变慢的原因。
那为什么 @TailRecursive
做得更好呢?
简而言之 - @TailRecursive
注释用旧的 while-loop 替换了所有闭包和递归调用。 @TailRecursive
的阶乘函数在字节码级别看起来像这样:
//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//
package factorial;
import groovy.lang.GroovyObject;
import groovy.lang.MetaClass;
import java.math.BigInteger;
import org.codehaus.groovy.runtime.ScriptBytecodeAdapter;
import org.codehaus.groovy.runtime.dgmimpl.NumberNumberMultiply;
import org.codehaus.groovy.transform.tailrec.GotoRecurHereException;
public class Groovy implements GroovyObject {
public Groovy() {
MetaClass var1 = this.$getStaticMetaClass();
this.metaClass = var1;
}
public static BigInteger factorial(int number, BigInteger acc) {
BigInteger _acc_ = acc;
int _number_ = number;
try {
while(true) {
try {
while(_number_ != 1) {
int __number__ = _number_;
int var7 = _number_ - 1;
_number_ = var7;
Number var8 = NumberNumberMultiply.multiply(__number__, _acc_);
_acc_ = (BigInteger)ScriptBytecodeAdapter.castToType(var8, BigInteger.class);
}
BigInteger var4 = _acc_;
return var4;
} catch (GotoRecurHereException var13) {
;
}
}
} finally {
;
}
}
public static BigInteger factorial(int number) {
return factorial(number, (BigInteger)ScriptBytecodeAdapter.castToType(1, BigInteger.class));
}
}
我前段时间在我的博客上记录了这个用例。如果您想了解更多信息,可以阅读博客post:
https://e.printstacktrace.blog/tail-recursive-methods-in-groovy/