Kubernetes Evolution: Transitioning from etcd to Distributed SQL

I recently stumbled upon an article explaining how to replace etcd with PostgreSQL. This transition was seamless with the Kine project, which serves as an external etcd endpoint, translating Kubernetes etcd requests into SQL queries for an underlying relational database.

Inspired by this approach, I decided to explore Kine’s potential further by switching from etcd to YugabyteDB, a distributed SQL database built on PostgreSQL.

What’s the Problem With etcd?

Etcd is a key-value store used by Kubernetes to hold all cluster data.

It doesn’t typically demand your attention until you encounter scalability or high availability (HA) issues with your Kubernetes cluster. Managing etcd in a scalable and HA way is particularly challenging for large Kubernetes deployments.

Also, there’s been mounting concern within the Kubernetes community about the future of the etcd project. Its community is dwindling, and only a few maintainers are left with the interest (and capability) to support and advance this project.

These concerns gave rise to Kine, an etcd API to SQL translation layer. Kine officially supports SQLite, PostgreSQL, and MySQL, systems that continue to grow in usage and boast robust communities.

Why Choose Distributed SQL Databases?

Although PostgreSQL, SQLite, and MySQL are excellent options for Kubernetes, they are designed and optimized for single-server deployments. This means that they can present some challenges, particularly for large Kubernetes deployments that have more stringent scalability and availability requirements.

If your Kubernetes cluster requires an RPO (Recovery Point Objective) of zero and an RTO (Recovery Time Objective) measured in single-digit seconds, the architecture and maintenance of MySQL or PostgreSQL deployments will be a challenge. If you’re interested in delving deeper into this topic, you can explore PostgreSQL high availability options here.

Distributed SQL databases function as a cluster of interconnected nodes that can be deployed across multiple racks, availability zones, or regions. By design, they are highly available and scalable and, thus, can improve the same characteristics for Kubernetes.

Starting Kine on YugabyteDB

My decision to use YugabyteDB as the distributed SQL database for Kubernetes was influenced by PostgreSQL. YugabyteDB is built on the PostgreSQL source code, reusing the upper half of PostgreSQL (the query engine) while providing its own distributed storage implementation.

The close ties between YugabyteDB and PostgreSQL allow us to repurpose the PostgreSQL implementation of Kine for YugabyteDB. However, stay tuned, this won’t be a simple lift-and-shift story!

Now, let’s translate these ideas into action and start Kine on YugabyteDB. For this, I’m using an Ubuntu 22.04 virtual machine equipped with 8 CPUs and 32GB of RAM.

First, we launch a three-node YugabyteDB cluster on the machine. It’s fine to experiment with a distributed SQL database on a single server before going distributed. There are multiple ways to kick off YugabyteDB locally, but my preferred method is via Docker:

mkdir ~/yb_docker_data

docker network create custom-network

# Assumes the official yugabytedb/yugabyte image; pin a specific tag for reproducible results.
docker run -d --name yugabytedb_node1 --net custom-network \
  -p 15433:15433 -p 7001:7000 -p 9000:9000 -p 5433:5433 \
  -v ~/yb_docker_data/node1:/home/yugabyte/yb_data --restart unless-stopped \
  yugabytedb/yugabyte:latest \
  bin/yugabyted start --tserver_flags="ysql_sequence_cache_minval=1" \
  --base_dir=/home/yugabyte/yb_data --daemon=false

docker run -d --name yugabytedb_node2 --net custom-network \
  -p 15434:15433 -p 7002:7000 -p 9002:9000 -p 5434:5433 \
  -v ~/yb_docker_data/node2:/home/yugabyte/yb_data --restart unless-stopped \
  yugabytedb/yugabyte:latest \
  bin/yugabyted start --join=yugabytedb_node1 --tserver_flags="ysql_sequence_cache_minval=1" \
  --base_dir=/home/yugabyte/yb_data --daemon=false

docker run -d --name yugabytedb_node3 --net custom-network \
  -p 15435:15433 -p 7003:7000 -p 9003:9000 -p 5435:5433 \
  -v ~/yb_docker_data/node3:/home/yugabyte/yb_data --restart unless-stopped \
  yugabytedb/yugabyte:latest \
  bin/yugabyted start --join=yugabytedb_node1 --tserver_flags="ysql_sequence_cache_minval=1" \
  --base_dir=/home/yugabyte/yb_data --daemon=false

Note: I’m starting the YugabyteDB nodes with the ysql_sequence_cache_minval=1 setting to ensure that database sequences are incremented sequentially by 1. Without this option, a single Kine connection to YugabyteDB will cache the next 100 IDs of a sequence. This could lead to “version mismatch” errors during a Kubernetes cluster bootstrap because one Kine connection could be inserting records with IDs ranging from 1 to 100 while another could be inserting records with IDs from 101 to 200.

Next, start a Kine instance connecting to YugabyteDB using the PostgreSQL implementation:

1. Clone the Kine repo:

git clone https://github.com/k3s-io/kine.git && cd kine

2. Start a Kine instance connecting to the local YugabyteDB cluster:

go run . --endpoint "postgres://yugabyte:<password>@127.0.0.1:5433/yugabyte"

3. Connect to YugabyteDB and confirm the Kine schema is ready:

psql -h 127.0.0.1 -p 5433 -U yugabyte

yugabyte=# \d
           List of relations
 Schema |    Name     |   Type   |  Owner
--------+-------------+----------+----------
 public | kine        | table    | yugabyte
 public | kine_id_seq | sequence | yugabyte
(2 rows)

Good, the first test has been a success. Kine treats YugabyteDB as PostgreSQL and starts without any issues. Now we progress to the next phase: launching Kubernetes on top of Kine with YugabyteDB.

Starting Kubernetes on Kine With YugabyteDB

Kine can be used by various Kubernetes engines, including standard K8s deployments, Rancher Kubernetes Engine (RKE), or K3s (a lightweight Kubernetes engine). For simplicity’s sake, I’ll use the latter.

A K3s cluster can be started with a single command:

1. Stop the Kine instance started in the previous section.

2. Start K3s connecting to the same local YugabyteDB cluster (the K3s executable ships with Kine built in):

curl -sfL https://get.k3s.io | sh -s - server --write-kubeconfig-mode=644 \
  --datastore-endpoint="postgres://yugabyte:<password>@127.0.0.1:5433/yugabyte"

3. Kubernetes should start with no issues, and we can confirm that by running the following command:

k3s kubectl get nodes
NAME        STATUS   ROLES                  AGE     VERSION
ubuntu-vm   Ready    control-plane,master   7m13s   v1.27.3+k3s1

Excellent, Kubernetes functions seamlessly on YugabyteDB. This is possible thanks to YugabyteDB’s high feature and runtime compatibility with PostgreSQL. This means that we can reuse most libraries, drivers, and frameworks created for PostgreSQL.

This could have marked the end of our journey, but as a diligent engineer, I decided to review the K3s logs. Occasionally, during a Kubernetes bootstrap, the logs may report slow queries, like the one below:

INFO[0015] Slow SQL(total time: 3s) :
            SELECT *
            FROM (
                SELECT
                    (SELECT MAX(rkv.id) AS id
                     FROM kine AS rkv),
                    (SELECT MAX(crkv.prev_revision) AS prev_revision
                     FROM kine AS crkv
                     WHERE crkv.name = 'compact_rev_key'),
                    kv.id AS theid, kv.name, kv.created, kv.deleted, kv.create_revision, kv.prev_revision, kv.lease, kv.value, kv.old_value
                FROM kine AS kv
                JOIN (
                    SELECT MAX(mkv.id) AS id
                    FROM kine AS mkv
                    WHERE mkv.name LIKE $1
                    GROUP BY mkv.name) AS maxkv ON maxkv.id = kv.id
                WHERE kv.deleted = 0
                    OR $2) AS lkv
            ORDER BY lkv.theid ASC
            LIMIT 10001

This may not be a significant concern when running YugabyteDB on a single machine, but once we switch to a distributed setting, queries like this can become hotspots and create bottlenecks.

As a result, I cloned the Kine source code and began to explore the PostgreSQL implementation for potential optimization opportunities.

Optimizing Kine for YugabyteDB

Here, I had the opportunity to collaborate with Franck Pachot, a database guru well-versed in optimizing the SQL layer with no or minimal changes in the application logic.

After inspecting the database schema generated by Kine and using EXPLAIN ANALYZE for certain queries, Franck suggested essential optimizations that would be beneficial for any distributed SQL database.

Fortunately, the optimizations did not necessitate any changes to the Kine application logic. All I had to do was introduce a few SQL-level enhancements. Consequently, a Kine fork with direct support for YugabyteDB was created.

Currently, there are three optimizations in the YugabyteDB implementation compared to the PostgreSQL one:

  1. The primary key for the kine table has been changed from PRIMARY KEY (id) to PRIMARY KEY (id asc). By default, YugabyteDB uses hash sharding to distribute data evenly across the cluster. However, Kubernetes runs many range queries over the id column, which makes it reasonable to switch to range sharding.
  2. The kine_name_prev_revision_uindex index has been updated to be a covering index by including the id column in the index definition:

    CREATE UNIQUE INDEX IF NOT EXISTS kine_name_prev_revision_uindex ON kine (name asc, prev_revision asc) INCLUDE(id);

    YugabyteDB distributes indexes similarly to table data. Therefore, an index entry may reference an id stored on a different YugabyteDB node. To avoid an extra network round trip between the nodes, we can include the id in the secondary index.

  3. Kine performs many joins while fulfilling Kubernetes requests. If the query planner/optimizer decides to use a nested loop join, then by default, the YugabyteDB query layer will be reading and joining one record at a time. To expedite the process, we can enable batched nested loop joins. The Kine implementation for YugabyteDB does this by executing the following statement at startup:

    ALTER DATABASE " + dbName + " set yb_bnl_batch_size=1024;

Let’s give this optimized YugabyteDB implementation a try.

First, stop the previous K3s service and drop the Kine schema from the YugabyteDB cluster:

1. Stop and delete the K3s service:

sudo /usr/local/bin/k3s-uninstall.sh
sudo rm -r /etc/rancher

2. Drop the schema:

psql -h 127.0.0.1 -p 5433 -U yugabyte

drop table kine cascade;

Next, start a Kine instance that provides the optimized version for YugabyteDB:

1. Clone the fork:

git clone https://github.com/dmagda/kine-yugabytedb.git && cd kine-yugabytedb

2. Start Kine:

go run . --endpoint "yugabytedb://yugabyte:<password>@127.0.0.1:5433/yugabyte"

Kine starts without any issues. The only difference now is that instead of specifying 'postgres' in the connection string, we indicate 'yugabytedb' to enable the optimized YugabyteDB implementation. As for the actual communication between Kine and YugabyteDB, Kine continues to use the standard PostgreSQL driver for Go.

Building Kubernetes on an Optimized Version of Kine

Finally, let’s start K3s on this optimized version of Kine.

To do this, we first have to build K3s from sources:

1. Stop the Kine instance started in the section above.

2. Clone the K3s repository:

git clone --depth 1 https://github.com/k3s-io/k3s.git && cd k3s

3. Open the go.mod file and add the following line to the end of the replace (..) section:

github.com/k3s-io/kine => github.com/dmagda/kine-yugabytedb v0.2.0

This instruction tells Go to use the latest release of the Kine fork with the YugabyteDB implementation.

4. Enable support for private repositories and modules:

go env -w GOPRIVATE=github.com/dmagda/kine-yugabytedb

5. Make sure the changes take effect:

6. Prepare to build a full version of K3s:

mkdir -p build/data && make download && make generate

7. Build the full version:

It should take around five minutes to finish the build.

Note: once you stop experimenting with this custom K3s build, you can uninstall it following this instruction.

Running the Sample Workload on an Optimized Kubernetes Version

Once the build is ready, we can start K3s with the optimized version of Kine.

1. Navigate to the directory with the build artifacts:

2. Start K3s by connecting to the local YugabyteDB cluster:

sudo ./k3s server \
  --datastore-endpoint="yugabytedb://yugabyte:<password>@127.0.0.1:5433/yugabyte"

3. Confirm Kubernetes started successfully:

sudo ./k3s kubectl get nodes

NAME        STATUS   ROLES                  AGE     VERSION
ubuntu-vm   Ready    control-plane,master   4m33s   v1.27.4+k3s-36645e73

Now, let’s deploy a sample application to ensure that the Kubernetes cluster is capable of more than just bootstrapping itself:

1. Clone a repository with Kubernetes examples:

git clone https://github.com/digitalocean/kubernetes-sample-apps.git

2. Deploy the Emojivoto application:

sudo ./k3s kubectl apply -k ./kubernetes-sample-apps/emojivoto-example/kustomize

3. Make sure all deployments and services start successfully:

sudo ./k3s kubectl get all -n emojivoto

NAME                            READY   STATUS    RESTARTS   AGE
pod/vote-bot-565bd6bcd8-rnb6x   1/1     Running   0          25s
pod/web-75b9df87d6-wrznp        1/1     Running   0          24s
pod/voting-f5ddc8ff6-69z6v      1/1     Running   0          25s
pod/emoji-66658f4b4c-wl4pt      1/1     Running   0          25s

NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/emoji-svc    ClusterIP    <none>        8080/TCP,8801/TCP   27s
service/voting-svc   ClusterIP    <none>        8080/TCP,8801/TCP   27s
service/web-svc      ClusterIP   <none>        80/TCP              27s

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/vote-bot   1/1     1            1           26s
deployment.apps/web        1/1     1            1           25s
deployment.apps/voting     1/1     1            1           26s
deployment.apps/emoji      1/1     1            1           26s

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/vote-bot-565bd6bcd8   1         1         1       26s
replicaset.apps/web-75b9df87d6        1         1         1       25s
replicaset.apps/voting-f5ddc8ff6      1         1         1       26s
replicaset.apps/emoji-66658f4b4c      1         1         1       26s

4. Make a call to the service/web-svc using the CLUSTER_IP:80 to trigger the application logic:

The app will reply with the next HTML:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>Emoji Vote</title>
    <link rel="icon" href="/img/favicon.ico">

    <script async src="https://www.googletagmanager.com/gtag/js?id=UA-60040560-4"></script>
    <script>
      window.dataLayer = window.dataLayer || [];
      function gtag(){dataLayer.push(arguments);}
      gtag('js', new Date());
      gtag('config', 'UA-60040560-4');
    </script>
  </head>
  <body>
    <div id="main" class="main"></div>
  </body>

  <script type="text/javascript" src="/js" async></script>
</html>


In Summary

Job done! Kubernetes can now use YugabyteDB as a distributed and highly available SQL database for all its data.

This allows us to proceed to the next phase: deploying Kubernetes and YugabyteDB in a real cloud environment across multiple availability zones and regions, and testing how the solution handles various outages. This warrants a separate blog post, so stay tuned!