panic: invalid page type: 2: 10 & page 5 already freed on Kubed Chronograf

I installed InfluxDB v1.7.9 on Azure Kubernetes Service (AKS) and tried to add Chronograf, but it fails to start. It has a PVC that stores its data in an Azure storage account.

    panic: invalid page type: 2: 10

    goroutine 1 [running]:
    github.com/boltdb/bolt.(*Cursor).search(0xc000551960, 0x2f083e0, 0x5, 0x5, 0x2)
        /root/go/pkg/mod/github.com/boltdb/bolt@v0.0.0-20160719165138-5cc10bbbc5c1/cursor.go:256 +0x354
    github.com/boltdb/bolt.(*Cursor).seek(0xc000551960, 0x2f083e0, 0x5, 0x5, 0xc000b03640, 0x40d619, 0xc0000b92e8, 0x8, 0x8, 0x1417b00, ...)
        /root/go/pkg/mod/github.com/boltdb/bolt@v0.0.0-20160719165138-5cc10bbbc5c1/cursor.go:159 +0x7e
    github.com/boltdb/bolt.(*Bucket).Bucket(0xc000116478, 0x2f083e0, 0x5, 0x5, 0xc000b03718)
        /root/go/pkg/mod/github.com/boltdb/bolt@v0.0.0-20160719165138-5cc10bbbc5c1/bucket.go:112 +0xef
    github.com/boltdb/bolt.(*Tx).Bucket(...)
        /root/go/pkg/mod/github.com/boltdb/bolt@v0.0.0-20160719165138-5cc10bbbc5c1/tx.go:101
    github.com/influxdata/chronograf/bolt.(*BuildStore).get(0xc0000b9188, 0x207e0e0, 0xc0000b4010, 0xc000116460, 0xc00000cea0, 0xc000b03748, 0x42da8f, 0xc000000008, 0xc0000bc040, 0x0)
        /root/go/src/github.com/influxdata/chronograf/bolt/build.go:66 +0x77
    github.com/influxdata/chronograf/bolt.(*BuildStore).Get.func1(0xc000116460, 0x1ed3cb8, 0xc000116460)
        /root/go/src/github.com/influxdata/chronograf/bolt/build.go:30 +0x53
    github.com/boltdb/bolt.(*DB).View(0xc00000cd20, 0xc000b037e8, 0x0, 0x0)
        /root/go/pkg/mod/github.com/boltdb/bolt@v0.0.0-20160719165138-5cc10bbbc5c1/db.go:626 +0x90
    github.com/influxdata/chronograf/bolt.(*BuildStore).Get(0xc0000b9188, 0x207e0e0, 0xc0000b4010, 0xc00047ba00, 0xc000b03880, 0x4d293d, 0xc0000462aa, 0x24, 0xc00000cd20)
        /root/go/src/github.com/influxdata/chronograf/bolt/build.go:28 +0xa2
    github.com/influxdata/chronograf/bolt.(*Client).backup(0xc0000c0dc0, 0x207e0e0, 0xc0000b4010, 0x202af30, 0x6, 0x2071000, 0x28, 0xc000b03900, 0x418bfb)
        /root/go/src/github.com/influxdata/chronograf/bolt/client.go:267 +0x4a
    github.com/influxdata/chronograf/bolt.(*Client).Open(0xc0000c0dc0, 0x207e0e0, 0xc0000b4010, 0x2082660, 0xc0000b9058, 0x202af30, 0x6, 0x2071000, 0x28, 0xc000b03a98, ...)
        /root/go/src/github.com/influxdata/chronograf/bolt/client.go:107 +0x58a
    github.com/influxdata/chronograf/server.openService(0x207e0e0, 0xc0000b4010, 0x202af30, 0x6, 0x2071000, 0x28, 0xc0000462aa, 0x24, 0x2056de0, 0xc0003dffb0, ...)
        /root/go/src/github.com/influxdata/chronograf/server/server.go:451 +0x143
    github.com/influxdata/chronograf/server.(*Server).Serve(0xc000095880, 0x207e0e0, 0xc0000b4010, 0x0, 0x0)
        /root/go/src/github.com/influxdata/chronograf/server/server.go:343 +0x498
    main.main()
        /root/go/src/github.com/influxdata/chronograf/cmd/chronograf/main.go:47 +0x1ec

Now, when I delete the Chronograf deployment and the files from the volume, I get a different error.

    panic: page 5 already freed

    goroutine 1 [running]:
    github.com/boltdb/bolt.(*freelist).free(0xc000710f60, 0x3, 0x7f1c06bd8000)
        /root/go/pkg/mod/github.com/boltdb/bolt@v0.0.0-20160719165138-5cc10bbbc5c1/freelist.go:117 +0x2a6
    github.com/boltdb/bolt.(*Tx).Commit(0xc000116620, 0x0, 0x0)
        /root/go/pkg/mod/github.com/boltdb/bolt@v0.0.0-20160719165138-5cc10bbbc5c1/tx.go:176 +0x1b7
    github.com/boltdb/bolt.(*DB).Update(0xc00000c1e0, 0xc000a8f790, 0x0, 0x0)
        /root/go/pkg/mod/github.com/boltdb/bolt@v0.0.0-20160719165138-5cc10bbbc5c1/db.go:602 +0xe8
    github.com/influxdata/chronograf/bolt.(*OrganizationsStore).CreateDefault(0xc0000b88b8, 0x207e0e0, 0xc0000b4010, 0x0, 0x0)
        /root/go/src/github.com/influxdata/chronograf/bolt/organizations.go:55 +0x1b5
    github.com/influxdata/chronograf/bolt.(*OrganizationsStore).Migrate(...)
        /root/go/src/github.com/influxdata/chronograf/bolt/organizations.go:37
    github.com/influxdata/chronograf/bolt.(*Client).migrate(0xc0002aa0a0, 0x207e0e0, 0xc0000b4010, 0x202af30, 0x6, 0x2071000, 0x28, 0x0, 0x0)
        /root/go/src/github.com/influxdata/chronograf/bolt/client.go:186 +0x64
    github.com/influxdata/chronograf/bolt.(*Client).Open(0xc0002aa0a0, 0x207e0e0, 0xc0000b4010, 0x2082660, 0xc0000b8120, 0x202af30, 0x6, 0x2071000, 0x28, 0xc000a8fa98, ...)
        /root/go/src/github.com/influxdata/chronograf/bolt/client.go:116 +0x423
    github.com/influxdata/chronograf/server.openService(0x207e0e0, 0xc0000b4010, 0x202af30, 0x6, 0x2071000, 0x28, 0xc0000462aa, 0x24, 0x2056de0, 0xc000710e70, ...)
        /root/go/src/github.com/influxdata/chronograf/server/server.go:451 +0x143
    github.com/influxdata/chronograf/server.(*Server).Serve(0xc00038a380, 0x207e0e0, 0xc0000b4010, 0x0, 0x0)
        /root/go/src/github.com/influxdata/chronograf/server/server.go:343 +0x498
    main.main()
        /root/go/src/github.com/influxdata/chronograf/cmd/chronograf/main.go:47 +0x1ec

Configuration YAML files.

Deployment:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: chronograf
    spec:
      replicas: 1
      selector:
        matchLabels:
          component: chronograf
      template:
        metadata:
          labels:
            component: chronograf
        spec:
          initContainers:
            - name: wait-services
              image: busybox
              command: ['sh', '-c', 'until nslookup influx-svc.default.svc.cluster.local; do echo waiting service start; sleep 2; done;']
          containers:
            - name: chronograf
              image: chronograf:1.7.16
              ports:
                - containerPort: 8888
                  name: http
              env:
                - name: "influxdb-url"
                  value: "http://influx-svc:8086"
              volumeMounts:
                - mountPath: /var/lib/chronograf
                  name: data
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: chronograf-data

Service:

    apiVersion: v1
    kind: Service
    metadata:
      name: chronograf-dashboard
      labels:
        component: chronograf
    spec:
      externalTrafficPolicy: Cluster
      type: LoadBalancer
      selector:
        component: chronograf
      ports:
        - port: 80
          name: http
          targetPort: 8888

PersistentVolumeClaim:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: chronograf-data
    spec:
      accessModes:
        - ReadWriteMany
      storageClassName: storage-account
      resources:
        requests:
          storage: 5Gi

Based on my tests, the problem must be caused by the mountOptions property of the storage class. When I use the example storage class that AKS provides, I get the same error as you:

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: azurefile
    provisioner: kubernetes.io/azure-file
    mountOptions:
      - dir_mode=0777
      - file_mode=0777
      - uid=1000
      - gid=1000
      - mfsymlinks
      - nobrl
      - cache=none
    parameters:
      skuName: Standard_LRS

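When you remove the mountOptions property and let the storage class fall back to its own dynamically defined defaults, it works perfectly. Here is the storage class: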

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: azurefile
    provisioner: kubernetes.io/azure-file
    parameters:
      skuName: Standard_LRS
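
For this to take effect, the claim from the question presumably has to reference this class and be created fresh: dynamically provisioned volumes receive their mount options from the class when they are created, so a volume that already exists keeps the old ones. A minimal sketch, reusing the claim name from the question and assuming the class is named azurefile as above:

```yaml
# Sketch only: the question's claim, pointed at the azurefile class above.
# Delete the old claim (and its volume) before applying this; an existing
# volume keeps the mount options it was provisioned with.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chronograf-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefile   # was "storage-account" in the question
  resources:
    requests:
      storage: 5Gi
```

The Deployment can stay as it is, since it refers to the claim only by the name chronograf-data.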

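If you really want to know which mount option causes the problem, you can run more tests yourself. Alternatively, a persistent volume backed by Azure Disk would also suit you well. Good luck!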
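
The Azure Disk route could look roughly like this. It is a sketch under assumptions, not a tested setup: the class name chronograf-disk is made up for illustration, and the parameters are those of the in-tree kubernetes.io/azure-disk provisioner. A disk is attached as a block device with a regular filesystem, so the file-share mount options above do not come into play. It is normally attached to one node at a time, hence ReadWriteOnce, which is enough for the single replica in the question.

```yaml
# Hypothetical storage class backed by Azure managed disks.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: chronograf-disk          # illustrative name, not from the question
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: Managed
---
# Same claim name as in the question, so the Deployment needs no change.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chronograf-data
spec:
  accessModes:
    - ReadWriteOnce              # Azure Disk attaches to a single node
  storageClassName: chronograf-disk
  resources:
    requests:
      storage: 5Gi
```

With a ReadWriteOnce volume, a rolling update can stall if the new pod lands on a different node while the old pod still holds the disk. Setting the Deployment strategy to Recreate is one way to avoid that.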