kubernetes timescaledb statefulset:更改在 pod 重新创建时丢失
kubernetes timescaledb statefulset: Changes lost on pod recreation
我有一个 Timescaledb 服务器 运行 作为 AKS 中的 StatefulSet。当我删除并重新创建 timescaledb pod 时,即使 pod 关联到最初关联的 PV(持久卷),更改也会丢失。感谢任何帮助。
下面是运行kubectl get statefulset timescaledb -o yaml
提取的statefulset的PV,PVC配置
template:
metadata:
creationTimestamp: null
labels:
app: timescaledb
spec:
containers:
- args:
- -c
- config_file=/etc/postgresql/postgresql.conf
env:
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
key: password
name: timescaledb-secret
image: docker.io/timescale/timescaledb:latest-pg9.6
name: timescaledb-backend
ports:
- containerPort: 5432
name: server
protocol: TCP
resources:
requests:
cpu: "3"
memory: 6Gi
volumeMounts:
- mountPath: /var/lib/postgresql
name: timescaledbdata
- mountPath: /etc/postgresql
name: timescaledb-config
volumes:
- configMap:
defaultMode: 420
name: timescaledb-config
name: timescaledb-config
volumeClaimTemplates:
- metadata:
annotations:
volume.alpha.kubernetes.io/storage-class: standard
creationTimestamp: null
name: timescaledbdata
spec:
accessModes:
- ReadWriteOnce
dataSource: null
resources:
requests:
storage: 200Gi
status:
phase: Pending
下面演示创建的临时数据库 test_db 在 pod 重建后丢失,并且在整个过程中,pod 关联到 Azure 上的相同 PV/disk。
root@e70a91715239:~/keys# k get pvc -l app=timescaledb
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
timescaledbdata-timescaledb-0 Bound pvc-c7eb99cf-6a6b-11e9-b661-be660567cc75 200Gi RWO default 83d
root@e70a91715239:~/keys# k exec -ti timescaledb-0 bash
bash-4.4# psql -U postgres;
psql (9.6.13)
Type "help" for help.
postgres=# create database test_db;
CREATE DATABASE
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+------------+------------+-----------------------
postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
test_db | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
(4 rows)
root@e70a91715239:~/keys# k get pods | grep timescale
timescaledb-0 1/1 Running 0 12m
root@e70a91715239:~/keys# k delete pod/timescaledb-0
pod "timescaledb-0" deleted
root@e70a91715239:~/keys# k get pods | grep timescale
timescaledb-0 1/1 Running 0 14s
root@e70a91715239:~/keys# k exec -ti timescaledb-0 bash
bash-4.4# psql -U postgres
psql (9.6.13)
Type "help" for help.
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+------------+------------+-----------------------
postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
(3 rows)
root@e70a91715239:~/keys# k get pvc -l app=timescaledb
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
timescaledbdata-timescaledb-0 Bound pvc-c7eb99cf-6a6b-11e9-b661-be660567cc75 200Gi RWO default 83d
可能它正在按照提示重新初始化。请参阅 logs。关于为什么会这样做的任何指示。
更新 1:
我查看了 timescale
pod 中的挂载,/var/lib/postgresql
和 /var/lib/postgresql/data
似乎有不同的分区。我不明白为什么。
Filesystem Size Used Available Use% Mounted on
overlay 96.9G 22.1G 74.8G 23% /
tmpfs 64.0M 0 64.0M 0% /dev
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/sda1 96.9G 22.1G 74.8G 23% /docker-entrypoint-initdb.d
/dev/sda1 96.9G 22.1G 74.8G 23% /dev/termination-log
shm 64.0M 4.0K 64.0M 0% /dev/shm
/dev/sda1 96.9G 22.1G 74.8G 23% /etc/resolv.conf
/dev/sda1 96.9G 22.1G 74.8G 23% /etc/hostname
/dev/sda1 96.9G 22.1G 74.8G 23% /etc/hosts
/dev/sdc 196.7G 59.3M 196.7G 0% /var/lib/postgresql
/dev/sda1 96.9G 22.1G 74.8G 23% /var/lib/postgresql/data
不明白上面的挂载是如何在下面的配置中发生的
volumeMounts:
- mountPath: /var/lib/postgresql
name: timescaledbdata
- mountPath: /etc/postgresql
name: timescaledb-config
问题是 postgres:9.6
Dockerfile 中有 /var/lib/postgresql/data
的 VOLUME 声明,这导致容器上有额外的挂载。当我们在 /var/lib/postgresql
处安装卷时,该安装是短暂的。但是我们无法将 AKS 卷挂载到 /var/lib/postgresql/data
,因为该卷带有 lost+found
子目录,而 Postgres 需要空目录来存储数据库文件。
修复方法是在 /var/lib/postgresql/data
上安装卷并告诉 Postgres 使用 /var/lib/postgresql/data
下的子目录来存储具有 PGDATA
环境变量的文件。
以下是k8s statefulset配置中修复的相关部分
env:
- name: PGDATA
value: "/var/lib/postgresql/data/dbfiles"
...
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: timescaledata
我有一个 Timescaledb 服务器 运行 作为 AKS 中的 StatefulSet。当我删除并重新创建 timescaledb pod 时,即使 pod 关联到最初关联的 PV(持久卷),更改也会丢失。感谢任何帮助。
下面是运行kubectl get statefulset timescaledb -o yaml
template:
metadata:
creationTimestamp: null
labels:
app: timescaledb
spec:
containers:
- args:
- -c
- config_file=/etc/postgresql/postgresql.conf
env:
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
key: password
name: timescaledb-secret
image: docker.io/timescale/timescaledb:latest-pg9.6
name: timescaledb-backend
ports:
- containerPort: 5432
name: server
protocol: TCP
resources:
requests:
cpu: "3"
memory: 6Gi
volumeMounts:
- mountPath: /var/lib/postgresql
name: timescaledbdata
- mountPath: /etc/postgresql
name: timescaledb-config
volumes:
- configMap:
defaultMode: 420
name: timescaledb-config
name: timescaledb-config
volumeClaimTemplates:
- metadata:
annotations:
volume.alpha.kubernetes.io/storage-class: standard
creationTimestamp: null
name: timescaledbdata
spec:
accessModes:
- ReadWriteOnce
dataSource: null
resources:
requests:
storage: 200Gi
status:
phase: Pending
下面演示创建的临时数据库 test_db 在 pod 重建后丢失,并且在整个过程中,pod 关联到 Azure 上的相同 PV/disk。
root@e70a91715239:~/keys# k get pvc -l app=timescaledb
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
timescaledbdata-timescaledb-0 Bound pvc-c7eb99cf-6a6b-11e9-b661-be660567cc75 200Gi RWO default 83d
root@e70a91715239:~/keys# k exec -ti timescaledb-0 bash
bash-4.4# psql -U postgres;
psql (9.6.13)
Type "help" for help.
postgres=# create database test_db;
CREATE DATABASE
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+------------+------------+-----------------------
postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
test_db | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
(4 rows)
root@e70a91715239:~/keys# k get pods | grep timescale
timescaledb-0 1/1 Running 0 12m
root@e70a91715239:~/keys# k delete pod/timescaledb-0
pod "timescaledb-0" deleted
root@e70a91715239:~/keys# k get pods | grep timescale
timescaledb-0 1/1 Running 0 14s
root@e70a91715239:~/keys# k exec -ti timescaledb-0 bash
bash-4.4# psql -U postgres
psql (9.6.13)
Type "help" for help.
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+------------+------------+-----------------------
postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
(3 rows)
root@e70a91715239:~/keys# k get pvc -l app=timescaledb
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
timescaledbdata-timescaledb-0 Bound pvc-c7eb99cf-6a6b-11e9-b661-be660567cc75 200Gi RWO default 83d
可能它正在按照提示重新初始化。请参阅 logs。关于为什么会这样做的任何指示。
更新 1:
我查看了 timescale
pod 中的挂载,/var/lib/postgresql
和 /var/lib/postgresql/data
似乎有不同的分区。我不明白为什么。
Filesystem Size Used Available Use% Mounted on
overlay 96.9G 22.1G 74.8G 23% /
tmpfs 64.0M 0 64.0M 0% /dev
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/sda1 96.9G 22.1G 74.8G 23% /docker-entrypoint-initdb.d
/dev/sda1 96.9G 22.1G 74.8G 23% /dev/termination-log
shm 64.0M 4.0K 64.0M 0% /dev/shm
/dev/sda1 96.9G 22.1G 74.8G 23% /etc/resolv.conf
/dev/sda1 96.9G 22.1G 74.8G 23% /etc/hostname
/dev/sda1 96.9G 22.1G 74.8G 23% /etc/hosts
/dev/sdc 196.7G 59.3M 196.7G 0% /var/lib/postgresql
/dev/sda1 96.9G 22.1G 74.8G 23% /var/lib/postgresql/data
不明白上面的挂载是如何在下面的配置中发生的
volumeMounts:
- mountPath: /var/lib/postgresql
name: timescaledbdata
- mountPath: /etc/postgresql
name: timescaledb-config
问题是 postgres:9.6
Dockerfile 中有 /var/lib/postgresql/data
的 VOLUME 声明,这导致容器上有额外的挂载。当我们在 /var/lib/postgresql
处安装卷时,该安装是短暂的。但是我们无法将 AKS 卷挂载到 /var/lib/postgresql/data
,因为该卷带有 lost+found
子目录,而 Postgres 需要空目录来存储数据库文件。
修复方法是在 /var/lib/postgresql/data
上安装卷并告诉 Postgres 使用 /var/lib/postgresql/data
下的子目录来存储具有 PGDATA
环境变量的文件。
以下是k8s statefulset配置中修复的相关部分
env:
- name: PGDATA
value: "/var/lib/postgresql/data/dbfiles"
...
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: timescaledata