Performance of function slice parameter vs global variable?

I have the following function:

func checkFiles(path string, excludedPatterns []string) {
    // ...
}

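Since excludedPatterns never changes, should I optimize this by making the variable global (instead of passing it to the function every time), or does Golang already optimize it by passing slices as copy-on-write?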

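Edit: I guess I could pass the slice as a pointer, but I am still curious about the copy-on-write behavior (if it exists) and whether, in general, I should worry about passing by value versus passing by pointer.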

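Judging from the name of your function, performance cannot be critical enough to even consider moving parameters to global variables just to save the time/space required to pass them as arguments. IO operations such as checking files are much, much slower than calling functions and passing values to them.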

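Slices in Go are just small descriptors, something like a struct with a pointer to a backing array and 2 ints: a length and a capacity. No matter how big the backing array is, passing a slice is always efficient, and you should not even consider passing a pointer to it, unless of course you want to modify the slice header.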
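To make the "small descriptor" point concrete, here is a minimal sketch that prints the size of the slice value itself for a tiny and a huge slice. The helper name headerSize is made up for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSize (hypothetical helper) reports the size in bytes of the slice
// value itself, i.e. the descriptor, not the backing array it points to.
func headerSize(s []string) uintptr {
	return unsafe.Sizeof(s)
}

func main() {
	small := make([]string, 1)
	large := make([]string, 1000000)

	// Both lines print the same number: three machine words
	// (24 bytes on a 64-bit platform, 12 on a 32-bit one).
	fmt.Println(headerSize(small))
	fmt.Println(headerSize(large))
}
```

The value copied at each call is this descriptor, so the cost of passing the slice does not depend on how many elements it holds.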

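Parameters in Go are always passed by value, and a copy of the passed value is made. If you pass a pointer, the pointer value is copied and passed. When you pass a slice, the slice value (which is a small descriptor) is copied and passed, and the copy points to the same backing array (which is not copied).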
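A small sketch of what this means in practice. The function names setFirst and grow are invented for the example. There is no copy-on-write involved: writes to elements go straight to the shared backing array, while changes to the header stay local to the callee.

```go
package main

import "fmt"

// setFirst writes through the copied slice header into the shared backing array.
func setFirst(ss []string, v string) {
	ss[0] = v
}

// grow appends to its local copy of the slice header only;
// the caller's header (pointer, length, capacity) is left untouched.
func grow(ss []string, v string) {
	ss = append(ss, v)
	_ = ss
}

func main() {
	patterns := []string{"*.tmp", "*.log"}

	setFirst(patterns, "*.bak")
	fmt.Println(patterns) // [*.bak *.log]

	grow(patterns, "*.swp")
	fmt.Println(len(patterns), patterns) // 2 [*.bak *.log]
}
```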

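Also, if you need to access the slice multiple times inside the function, a parameter is usually an extra gain, because the compiler can apply further optimization/caching to it, whereas it has to be more careful with a global variable.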
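If a global is used anyway, one way to get the same effect is to read it into a local variable once and work with the local afterwards. The following is only a sketch: excludedSuffixes and countExcluded are hypothetical names, and whether this makes a measurable difference depends on the compiler and the workload, so benchmark before relying on it.

```go
package main

import (
	"fmt"
	"strings"
)

// excludedSuffixes is a hypothetical package-level slice standing in for a global.
var excludedSuffixes = []string{".tmp", ".log"}

// countExcluded copies the global slice header into a local variable once,
// then uses only the local inside the loops.
func countExcluded(names []string) int {
	suffixes := excludedSuffixes // single read of the global
	n := 0
	for _, name := range names {
		for _, suf := range suffixes {
			if strings.HasSuffix(name, suf) {
				n++
				break
			}
		}
	}
	return n
}

func main() {
	fmt.Println(countExcluded([]string{"a.go", "b.tmp", "c.log"})) // 2
}
```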

For more about slices and their internals, see: Go Slices: usage and internals

If you want exact numbers on performance, benchmark!

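Here is some benchmarking code which shows that there is no difference between the two solutions (passing the slice as an argument or accessing a global slice). Save it into a file such as slices_test.go and run it with go test -bench .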

package main

import (
    "testing"
)

var gslice = make([]string, 1000)

func global(s string) {
    for i := 0; i < 100; i++ { // Loop to access the slice many times
        _ = s
        _ = gslice // Access global-slice
    }
}

func param(s string, ss []string) {
    for i := 0; i < 100; i++ { // Loop to access the slice many times
        _ = s
        _ = ss // Access parameter-slice
    }
}

func BenchmarkParameter(b *testing.B) {
    for i := 0; i < b.N; i++ {
        param("hi", gslice)
    }
}

func BenchmarkGlobal(b *testing.B) {
    for i := 0; i < b.N; i++ {
        global("hi")
    }
}

Example output:

testing: warning: no tests to run
PASS
BenchmarkParameter-2    30000000                55.4 ns/op
BenchmarkGlobal-2       30000000                55.1 ns/op
ok      _/V_/workspace/IczaGo/src/play  3.569s

Building on @icza's excellent answer, there is another way to pass a slice as a parameter: a pointer to the slice.

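When you need to modify the slice itself (its header, for example by appending to it), a global slice variable works, but passing the slice as a parameter does not, because the function is actually working on a copy of the header. To get around this, you can pass the slice as a pointer.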
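A minimal sketch of the pointer approach, next to the more common alternative of returning the new slice. The names addPattern and withPattern are made up for this example.

```go
package main

import "fmt"

// addPattern appends through a pointer, so the caller's slice header is updated.
func addPattern(patterns *[]string, p string) {
	*patterns = append(*patterns, p)
}

// withPattern is the alternative without a pointer: return the new slice value
// and let the caller assign it.
func withPattern(patterns []string, p string) []string {
	return append(patterns, p)
}

func main() {
	excluded := []string{"*.tmp"}

	addPattern(&excluded, "*.log")
	fmt.Println(excluded) // [*.tmp *.log]

	excluded = withPattern(excluded, "*.bak")
	fmt.Println(excluded) // [*.tmp *.log *.bak]
}
```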

Interestingly, it is actually faster than accessing the global variable:

package main

import (
    "testing"
)

var gslice = make([]string, 1000000)

func global(s string) {
    for i := 0; i < 100; i++ { // Loop to access the slice many times
        _ = s
        _ = gslice // Access global-slice
    }
}

func param(s string, ss []string) {
    for i := 0; i < 100; i++ { // Loop to access the slice many times
        _ = s
        _ = ss // Access parameter-slice
    }
}

func paramPointer(s string, ss *[]string) {
    for i := 0; i < 100; i++ { // Loop to access the slice many times
        _ = s
        _ = ss // Access the pointer to the slice (not dereferenced)
    }
}

func BenchmarkParameter(b *testing.B) {
    for i := 0; i < b.N; i++ {
        param("hi", gslice)
    }
}

func BenchmarkParameterPointer(b *testing.B) {
    for i := 0; i < b.N; i++ {
        paramPointer("hi", &gslice)
    }
}

func BenchmarkGlobal(b *testing.B) {
    for i := 0; i < b.N; i++ {
        global("hi")
    }
}

Results:

goos: darwin
goarch: amd64
pkg: untitled
BenchmarkParameter-8            24437403                48.2 ns/op
BenchmarkParameterPointer-8     27483115                40.3 ns/op
BenchmarkGlobal-8               25631470                46.0 ns/op

I rewrote the benchmark so that you can compare the results.

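As you can see, the ParameterPointer benchmark starts to take the lead after 10 records, although the differences in the numbers below are well under 1%. Very interesting.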

package slices_bench

import (
    "testing"
)

var gslice = make([]string, 1000)

func global(s string) {
    for i := 0; i < 100; i++ { // Loop to access the slice many times
        _ = s
        _ = gslice // Access global-slice
    }
}

func param(s string, ss []string) {
    for i := 0; i < 100; i++ { // Loop to access the slice many times
        _ = s
        _ = ss // Access parameter-slice
    }
}

func paramPointer(s string, ss *[]string) {
    for i := 0; i < 100; i++ { // Loop to access the slice many times
        _ = s
        _ = ss // Access the pointer to the slice (not dereferenced)
    }
}

func BenchmarkPerformance(b *testing.B) {
    fixture := []struct {
        desc    string
        records int
    }{
        {
            desc:    "1 record",
            records: 1,
        },
        {
            desc:    "10 records",
            records: 10,
        },
        {
            desc:    "100 records",
            records: 100,
        },
        {
            desc:    "1000 records",
            records: 1000,
        },
        {
            desc:    "10000 records",
            records: 10000,
        },
        {
            desc:    "100000 records",
            records: 100000,
        },
    }

    tests := []struct {
        desc string
        fn   func(b *testing.B, n int)
    }{
        {
            desc: "ParameterPointer",
            fn: func(b *testing.B, n int) {
                for j := 0; j < n; j++ {
                    paramPointer("hi", &gslice)
                }
            },
        },
        {
            desc: "Parameter",
            fn: func(b *testing.B, n int) {
                for j := 0; j < n; j++ {
                    param("hi", gslice)
                }
            },
        },
        {
            desc: "Global",
            fn: func(b *testing.B, n int) {
                for j := 0; j < n; j++ {
                    global("hi")
                }
            },
        },
    }

    for _, t := range tests {
        b.Run(t.desc, func(b *testing.B) {
            for _, f := range fixture {
                b.Run(f.desc, func(b *testing.B) {
                    b.ReportAllocs()
                    b.ResetTimer()
                    for i := 0; i < b.N; i++ {
                        t.fn(b, f.records)
                    }
                })
            }
        })
    }
}

Results:

goos: windows
goarch: amd64
pkg: benchs/slices-bench
cpu: Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
BenchmarkPerformance
BenchmarkPerformance/ParameterPointer
BenchmarkPerformance/ParameterPointer/1_record
BenchmarkPerformance/ParameterPointer/1_record-16           38661910            31.18 ns/op        0 B/op          0 allocs/op
BenchmarkPerformance/ParameterPointer/10_records
BenchmarkPerformance/ParameterPointer/10_records-16          4160023           288.4 ns/op         0 B/op          0 allocs/op
BenchmarkPerformance/ParameterPointer/100_records
BenchmarkPerformance/ParameterPointer/100_records-16          445131          2748 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/ParameterPointer/1000_records
BenchmarkPerformance/ParameterPointer/1000_records-16          43876         27380 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/ParameterPointer/10000_records
BenchmarkPerformance/ParameterPointer/10000_records-16          4441        273922 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/ParameterPointer/100000_records
BenchmarkPerformance/ParameterPointer/100000_records-16          439       2739282 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/Parameter
BenchmarkPerformance/Parameter/1_record
BenchmarkPerformance/Parameter/1_record-16                  39860619            30.79 ns/op        0 B/op          0 allocs/op
BenchmarkPerformance/Parameter/10_records
BenchmarkPerformance/Parameter/10_records-16                 4152728           288.6 ns/op         0 B/op          0 allocs/op
BenchmarkPerformance/Parameter/100_records
BenchmarkPerformance/Parameter/100_records-16                 445634          2757 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/Parameter/1000_records
BenchmarkPerformance/Parameter/1000_records-16                 43618         27496 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/Parameter/10000_records
BenchmarkPerformance/Parameter/10000_records-16                 4450        273960 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/Parameter/100000_records
BenchmarkPerformance/Parameter/100000_records-16                 435       2739053 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/Global
BenchmarkPerformance/Global/1_record
BenchmarkPerformance/Global/1_record-16                     38813095            30.97 ns/op        0 B/op          0 allocs/op
BenchmarkPerformance/Global/10_records
BenchmarkPerformance/Global/10_records-16                    4148433           288.4 ns/op         0 B/op          0 allocs/op
BenchmarkPerformance/Global/100_records
BenchmarkPerformance/Global/100_records-16                    429274          2758 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/Global/1000_records
BenchmarkPerformance/Global/1000_records-16                    43591         27412 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/Global/10000_records
BenchmarkPerformance/Global/10000_records-16                    4521        274420 ns/op           0 B/op          0 allocs/op
BenchmarkPerformance/Global/100000_records
BenchmarkPerformance/Global/100000_records-16                    436       2751042 ns/op           0 B/op          0 allocs/op