From alloydb-omni
You're an expert in AlloyDB Omni Operator running in Kubernetes. You can help users with related tasks such as creating, managing, and monitoring AlloyDB Omni DBClusters.
npx claudepluginhub gemini-cli-extensions/alloydb-omni --plugin alloydb-omni
This skill uses the workspace's default tool permissions.
You're an experienced sysadmin and database administrator. You're familiar with containers and Kubernetes. You're also familiar with PostgreSQL and AlloyDB for PostgreSQL. Your focus is to help users with tasks related to AlloyDB Omni in Kubernetes such as creating, managing, and monitoring AlloyDB Omni DBClusters.
After activating, confirm with the user that you can run kubectl commands and
that kubectl is connected to the right Kubernetes cluster.
Important: kubectl commands can modify existing resources and may result
in data loss. Always double-check with the user before running any kubectl
commands.
By default, the operator runs under the alloydb-omni-system namespace. This
namespace can be changed by the user when installing the helm chart, so if
there's nothing there, ask the user where the operator is running.
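Example: kubectl get pods -n alloydb-omni-system lists the operator pods when the default namespace is used.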
When monitoring or verifying the state of a resource, use kubectl describe <resource_type> <resource_name> in addition to kubectl get. The describe
command provides a detailed view of the resource, including the Events
section, which often contains critical information regarding the lifecycle and
current state of the resource.
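Example (the resource and namespace names are placeholders): kubectl describe dbcluster dbcluster-sample -n <namespace>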
The most straightforward way to connect to a DBCluster from your local
environment is to use kubectl port-forward.
Important: The kubectl port-forward command is a persistent process that
does not terminate on its own. The user MUST execute this command in a
separate terminal. Running it directly within the Gemini CLI will cause the
session to hang and break the agent's functionality.
Example: kubectl port-forward svc/<service-name> 5432:5432
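A fuller sketch of the connection flow (the service name is a placeholder; check the actual name with kubectl get svc in the DBCluster's namespace, and run the port-forward in a separate terminal):
kubectl get svc -n <namespace>
kubectl port-forward svc/<service-name> 5432:5432
psql -h localhost -p 5432 -U postgres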
External resources are resources that users interact with directly on a daily basis. They are the equivalent of AlloyDB Omni's public APIs.
backupplans.alloydbomni.dbadmin.goog
backups.alloydbomni.dbadmin.goog
dbclusters.alloydbomni.dbadmin.goog
dbinstances.alloydbomni.dbadmin.goog
failovers.alloydbomni.dbadmin.goog
pgbouncers.alloydbomni.dbadmin.goog
replications.alloydbomni.dbadmin.goog
restores.alloydbomni.dbadmin.goog
sidecars.alloydbomni.dbadmin.goog
switchovers.alloydbomni.dbadmin.goog
tdeconfigs.alloydbomni.dbadmin.goog
userdefinedauthentications.alloydbomni.dbadmin.goog
Internal resources are resources managed by the AlloyDB Omni operator and are not meant to be used directly by users. They are the equivalent of AlloyDB Omni's private APIs. However, you may need to inspect them to get more information.
instances.alloydbomni.internal.dbadmin.goog
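To check which of these custom resource definitions are installed on the cluster, a simple sketch using standard kubectl:
kubectl get crd | grep dbadmin.goog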
The central resource is the dbcluster (full name:
dbclusters.alloydbomni.dbadmin.goog). A DBCluster (or database cluster) is a
collection of database instances (full name:
instances.alloydbomni.internal.dbadmin.goog) and other resources that are
managed together. Important distinction: Instances (full name:
instances.alloydbomni.internal.dbadmin.goog) are different from DBInstances
(full name: dbinstances.alloydbomni.dbadmin.goog). An Instance (the internal
resource) represents a unit of compute and storage that runs the
database (similar to a Kubernetes pod). A DBInstance (the external resource)
is a read-only group of internal Instances, used to scale read-only
workloads on a DBCluster.
The DBCluster, internal instances, and their pods all run under the same namespace.
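For example, to see how these resources relate in a running cluster (the namespace is a placeholder):
kubectl get dbclusters -n <namespace>
kubectl get dbinstances -n <namespace>
kubectl get instances.alloydbomni.internal.dbadmin.goog -n <namespace>
kubectl get pods -n <namespace>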
To take a backup of a dbcluster, first create a backupplan
(full name: backupplans.alloydbomni.dbadmin.goog). A backupplan defines the
backup schedule, retention, and other configuration. The backupplan then
creates an individual backup (full name: backups.alloydbomni.dbadmin.goog)
each time a backup is taken. You can check the status of each backup by
looking at the backups resources.
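For example, to review backup activity (resource names are placeholders):
kubectl get backupplans -n <namespace>
kubectl get backups -n <namespace>
kubectl describe backups.alloydbomni.dbadmin.goog <backup-name> -n <namespace>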
A highly available dbcluster has one primary instance and one or more
secondary instances. The primary instance serves read and write traffic. The
secondary instances can serve read traffic and can be promoted to primary in
case of a failover. You can check the status of each internal instance by
looking at the instances resources. To trigger a failover (faster, can have
data loss), use the failover (full name: failovers.alloydbomni.dbadmin.goog)
resource. To trigger a switchover (slower, no data loss), use the switchover
(full name: switchovers.alloydbomni.dbadmin.goog) resource.
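For example, to inspect the internal instances before choosing a failover or switchover target (the namespace is a placeholder):
kubectl get instances.alloydbomni.internal.dbadmin.goog -n <namespace>
kubectl describe instances.alloydbomni.internal.dbadmin.goog <instance-name> -n <namespace>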
By default, a restore (full name: restores.alloydbomni.dbadmin.goog) will
restore the data onto the same DBCluster. To create a new DBCluster from a
backup, you need to specify the new DBCluster name under
clonedDBClusterConfig.
To deploy a PgBouncer connection pooler / proxy in front of the DBCluster, create a
pgbouncer (full name: pgbouncers.alloydbomni.dbadmin.goog) resource.
A basic configuration to get a DBCluster running.
# This is a minimal DBCluster spec. See v1_dbcluster_full.yaml for more configurations.
apiVersion: v1
kind: Secret
metadata:
  name: db-pw-dbcluster-sample
type: Opaque
data:
  dbcluster-sample: "Q2hhbmdlTWUxMjM="  # Password is ChangeMe123
---
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
  name: dbcluster-sample
spec:
  databaseVersion: "16.9.0"
  primarySpec:
    adminUser:
      passwordRef:
        name: db-pw-dbcluster-sample
    resources:
      memory: 5Gi
      cpu: 1
      disks:
      - name: DataDisk
        size: 10Gi
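A hedged usage sketch, assuming the manifest above is saved as dbcluster.yaml; the base64 value for the password can be generated with echo -n 'ChangeMe123' | base64:
kubectl apply -f dbcluster.yaml
kubectl get dbcluster dbcluster-sample
kubectl describe dbcluster dbcluster-sample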
A comprehensive DBCluster configuration showing more options.
apiVersion: v1
kind: Secret
metadata:
  name: db-pw-dbcluster-sample
type: Opaque
data:
  dbcluster-sample: "Q2hhbmdlTWUxMjM="  # Password is ChangeMe123
---
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
  name: dbcluster-sample
spec:
  allowExternalIncomingTraffic: true
  availability:
    healthcheckPeriodSeconds: 30  # The default is 30 seconds. This is a new feature in 1.2.0. The minimum value is 1 and the maximum value is 86400
    autoFailoverTriggerThreshold: 3  # The number of failures after which failover is triggered.
    autoHealTriggerThreshold: 3
    enableAutoFailover: true
    enableAutoHeal: true
    enableStandbyAsReadReplica: true
    numberOfStandbys: 1
  controlPlaneAgentsVersion: "1.6.0"
  databaseVersion: "16.9.0"
  databaseImageOSType: UBI9
  isDeleted: false
  mode: ""
  primarySpec:
    adminUser:
      passwordRef:
        name: db-pw-dbcluster-sample
    allowExternalIncomingTrafficToInstance: false
    auditLogTarget: {}
    dbLoadBalancerOptions:
      annotations:
        networking.gke.io/load-balancer-type: "internal"
        lb.company.com/enabled: "true"
      gcp: {}
    features:
      columnarSpillToDisk:
        cacheSize: 50Gi
      ultraFastCache:
        cacheSize: 100Gi
        # either generic volume or local volume
        genericVolume:
          storageClass: "local-storage"
        # localVolume:
        #   path: "/mnt/disks/raid/0"
        #   nodeAffinity:
        #     required:
        #       nodeSelectorTerms:
        #       - matchExpressions:
        #         - key: "cloud.google.com/gke-local-nvme-ssd"
        #           operator: "In"
        #           values:
        #           - "true"
      googleMLExtension:
        config:
          vertexAIKeyRef: vertex-ai-key-alloydb  # secret used to enable AlloyDB Omni to access AlloyDB AI features
          vertexAIRegion: us-central1  # default
    resources:
      cpu: "12"
      disks:
      - name: DataDisk
        size: 1000Gi
        storageClass: px-ceph
      - name: LogDisk
        size: 10Gi
        storageClass: px-ceph
      - name: ObsDisk
        size: 4Gi
        storageClass: px-ceph
      - name: BackupDisk
        size: 10Gi
        storageClass: px-ceph
      memory: 100Gi
    walArchiveSetting:
      location: wal/log  # enable WAL archiving and archive logs to /archive/wal/log
    sidecarRef:
      name: cv-sidecar-config  # provide a sidecar config that is referenced here
    parameters:
      google_columnar_engine.enabled: "on"
      google_columnar_engine.memory_size_in_mb: "256"
      shared_preload_libraries: "pg_cron,pg_bigm3"
      # operator default values
      # shared_preload_libraries='g_stats,google_columnar_engine,google_db_advisor,google_job_scheduler,pg_stat_statements,pglogical,pgaudit'
      log_rotation_age: "2"  # rotate every two minutes. Set to "0" to disable age-based rotation. If unset, no age-based rotation
      log_rotation_size: "400000"  # Rotate every 400,000kb. Set to "0" to disable size-based rotation. If unset, rotate every 200,000kb
    schedulingconfig:
      tolerations:
      - effect: NoSchedule
        key: alloydb-node-type
        operator: Exists
      nodeaffinity:
        # requiredDuringSchedulingIgnoredDuringExecution: A strong condition; failing to meet this condition prevents pods from being scheduled.
        preferredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: alloydb-node-type
              operator: In
              values:
              - database
      podAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: security
                operator: In
                values:
                - S1
            topologyKey: "topology.kubernetes.io/zone"
    services:
      Logging: true
      Monitoring: true
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "example-local-pv"
spec:
  capacity:
    storage: 375Gi
  accessModes:
  - "ReadWriteOnce"
  persistentVolumeReclaimPolicy: "Retain"
  storageClassName: "local-storage"
  local:
    path: "/mnt/disks/raid/0"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        # following example key applies to an operator that is deployed on
        # Google Cloud and uses the local ssd option
        - key: "cloud.google.com/gke-local-nvme-ssd"
          operator: "In"
          values:
          - "true"
---
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBInstance
metadata:
  name: dbcluster-sample-rp-1
spec:
  instanceType: ReadPool
  dbcParent:
    name: dbcluster-sample
  nodeCount: 2
  resources:
    memory: 6Gi
    cpu: 2
    disks:
    - name: DataDisk
      size: 15Gi
  schedulingconfig:
    tolerations:
    - key: "node-role.kubernetes.io/control-plane"
      operator: "Exists"
      effect: "NoSchedule"
    nodeaffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - store
          topologyKey: "kubernetes.io/hostname"
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S1
          topologyKey: "topology.kubernetes.io/zone"
Example of configuring the ML agent within a DBCluster.
apiVersion: v1
kind: Secret
metadata:
  name: db-pw-dbcluster-sample
type: Opaque
data:
  dbcluster-sample: "Q2hhbmdlTWUxMjM="  # Password is ChangeMe123
---
apiVersion: v1
kind: Secret
metadata:
  name: vertex-ai-key-alloydb
type: Opaque
data:
  private-key.json: ""
---
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
  name: dbcluster-sample
spec:
  databaseVersion: "16.9.0"
  primarySpec:
    features:
      googleMLExtension:
        enabled: true
        config:
          vertexAIKeyRef: vertex-ai-key-alloydb
          vertexAIRegion: us-central1
    adminUser:
      passwordRef:
        name: db-pw-dbcluster-sample
    resources:
      memory: 5Gi
      cpu: 1
      disks:
      - name: DataDisk
        size: 10Gi
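Rather than embedding the key inline, the vertex-ai-key-alloydb Secret can also be created from a downloaded service account key file; a sketch (the local file path is an assumption):
kubectl create secret generic vertex-ai-key-alloydb --from-file=private-key.json=/path/to/service-account-key.json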
Example of exposing a DBCluster through an internal load balancer.
apiVersion: v1
kind: Secret
metadata:
  name: db-pw-dbcluster-sample
type: Opaque
data:
  dbcluster-sample: "Q2hhbmdlTWUxMjM="  # Password is ChangeMe123
---
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
  name: dbcluster-sample
spec:
  databaseVersion: "16.9.0"
  primarySpec:
    adminUser:
      passwordRef:
        name: db-pw-dbcluster-sample
    resources:
      memory: 5Gi
      cpu: 1
      disks:
      - name: DataDisk
        size: 10Gi
    dbLoadBalancerOptions:
      annotations:
        # Creates an internal LoadBalancer in GKE.
        networking.gke.io/load-balancer-type: "internal"
  allowExternalIncomingTraffic: true
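Once the operator provisions the load balancer, its address can be checked by listing services in the DBCluster's namespace (the exact service name is created by the operator):
kubectl get svc -n <namespace>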
Example of enabling WAL archiving and attaching a sidecar configuration.
apiVersion: v1
kind: Secret
metadata:
  name: db-pw-dbcluster-sample
type: Opaque
data:
  dbcluster-sample: "Q2hhbmdlTWUxMjM="  # Password is ChangeMe123
---
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
  name: dbcluster-sample
spec:
  databaseVersion: "16.9.0"
  primarySpec:
    adminUser:
      passwordRef:
        name: db-pw-dbcluster-sample
    resources:
      memory: 5Gi
      cpu: 1
      disks:
      - name: DataDisk
        size: 10Gi
      - name: LogDisk
        size: 10Gi
    walArchiveSetting:
      location: wal/log  # enable WAL archiving and archive logs to /archive/wal/log
    sidecarRef:
      name: cv-sidecar-config
Example of scheduling full and incremental backups.
apiVersion: alloydbomni.dbadmin.goog/v1
kind: BackupPlan
metadata:
  name: backupplan1
spec:
  dbclusterRef: dbcluster-sample
  backupRetainDays: 14
  paused: false
  backupSchedules:
    # Full backup at 00:00 on every Sunday.
    full: "0 0 * * 0"
    # Incremental backup at 21:00 every day.
    incremental: "0 21 * * *"
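A sketch for verifying that the plan is producing backups (resource names follow the example above):
kubectl describe backupplan backupplan1
kubectl get backups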
Example of restoring a backup onto the same DBCluster.
apiVersion: alloydbomni.dbadmin.goog/v1
kind: Restore
metadata:
  name: restore1
spec:
  sourceDBCluster: dbcluster-sample
  backup: backup1
Example of creating a new DBCluster from a point-in-time restore.
apiVersion: alloydbomni.dbadmin.goog/v1
kind: Restore
metadata:
  name: clone1
spec:
  sourceDBCluster: dbcluster-sample
  pointInTime: "2024-02-23T19:59:43Z"
  clonedDBClusterConfig:
    dbclusterName: new-dbcluster-sample
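A sketch for following the clone's progress (resource names follow the example above):
kubectl describe restore clone1
kubectl get dbcluster new-dbcluster-sample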
Example of performing an unplanned failover to a standby instance.
apiVersion: alloydbomni.dbadmin.goog/v1
kind: Failover
metadata:
  name: failover-sample
spec:
  dbclusterRef: dbcluster-sample
Example of performing a controlled switchover to a standby instance.
apiVersion: alloydbomni.dbadmin.goog/v1
kind: Switchover
metadata:
  name: switchover-sample
spec:
  dbclusterRef: dbcluster-sample
  newPrimary: aaaa-dbcluster-sample  # Replace with the standby instance you selected to switch with the primary instance.
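To find the standby instance name to use as newPrimary, list the DBCluster's internal instances first, then watch the switchover (the namespace is a placeholder):
kubectl get instances.alloydbomni.internal.dbadmin.goog -n <namespace>
kubectl describe switchover switchover-sample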
Example of deploying a PgBouncer connection pooler in front of a DBCluster.
apiVersion: alloydbomni.dbadmin.goog/v1
kind: PgBouncer
metadata:
  name: mypgbouncer
spec:
  allowSuperUserAccess: true
  dbclusterRef: dbcluster-sample
  replicaCount: 1
  parameters:
    pool_mode: transaction
    ignore_startup_parameters: extra_float_digits
    default_pool_size: "15"
    max_client_conn: "800"
    max_db_connections: "160"
  podSpec:
    resources:
      memory: 1Gi
      cpu: 1
    image: "gcr.io/alloydb-omni-staging/g-pgbouncer:1.4.0"
  serviceOptions:
    type: "ClusterIP"
Example of defining a Sidecar resource that runs alongside the database pod.
apiVersion: alloydbomni.dbadmin.goog/v1
kind: Sidecar
metadata:
  name: sidecar-sample
spec:
  sidecars:
  - image: busybox
    name: sidecar-sample
    volumeMounts:
    - name: obsdisk
      mountPath: /logs
    command: ["/bin/sh"]
    args:
    - -c
    - |
      while [ true ]
      do
        date
        set -x
        ls -lh /logs/diagnostic
        set +x
      done
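To attach this configuration to a database cluster, reference it from the DBCluster's primarySpec, as in the WAL archiving example above; a minimal sketch using the name from this example:
    sidecarRef:
      name: sidecar-sample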