Build Snapshots

Using image build snapshots

During an image build, Kleene takes a snapshot of the build container's filesystem after each RUN or COPY instruction that completes successfully. Containers and images can be created from these snapshots, which is useful when developing images. For instance, when an image build fails and you need to understand why, you can inspect the runtime environment as it looked just before the failing instruction.

This is illustrated with a somewhat artificial image development scenario that shows how to use image snapshots as a flexible tool for image development and how to reduce build times between test builds.

Start with the following draft Dockerfile:

FROM FreeBSD:latest
RUN pkg install -y postgresql16-server
RUN sysrc postgresql_enable=yes
RUN service postgresql initdb
RUN service postgresql start
RUN psql -c "CREATE DATABASE my_db;"
RUN service postgresql stop

and try to build it:

$ klee build -t PostgreSQL .
Started to build image with ID f472c80affa4
Step 1/7 : FROM FreeBSD:latest
Step 2/7 : RUN pkg install -y postgresql16-server
..............................
... <lots of build output> ...
..............................
--> Snapshot created: @c069e69af8cf
Step 3/7 : RUN sysrc postgresql_enable=yes
postgresql_enable:  -> yes
--> Snapshot created: @fb70baa07e6e
Step 4/7 : RUN service postgresql initdb
..............................
... <lots of build output> ...
..............................
creating directory /var/db/postgres/data16 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 20
selecting default shared_buffers ... 400kB
selecting default time zone ... UTC
creating configuration files ... ok 
running bootstrap script ... 2024-02-20 18:09:13.060 UTC [6935] FATAL:  could not create shared memory segment: Function not implemented
2024-02-20 18:09:13.060 UTC [6935] DETAIL:  Failed system call was shmget(key=55715, size=56, 03600).
child process exited with exit code 1
initdb: removing data directory "/var/db/postgres/data16"
jail: /usr/bin/env /bin/sh -c service postgresql initdb: failed
The command '/bin/sh -c service postgresql initdb' returned a non-zero code: 1
Failed to build image f472c80affa4. Most recent snapshot is @fb70baa07e6e

It failed! The last line reports the most recent snapshot, i.e., the snapshot taken after the last successful COPY/RUN instruction.
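If the build output is saved to a file, the snapshot ID on that last line can be picked out with standard tools instead of being copied by hand. The following is a minimal sketch and not part of Kleene: the function name last_snapshot is hypothetical, and it assumes the failure message has exactly the wording shown in the output above.

```shell
# Print the ID of the most recent snapshot reported in a build log read
# from stdin. Prints nothing if the log contains no such line.
# Assumption: the message reads "... Most recent snapshot is @<id>".
last_snapshot() {
    sed -n 's/.*Most recent snapshot is @\([0-9a-f]\{1,\}\).*/\1/p' | tail -n 1
}

# Example, using the final line of the failed build:
echo 'Failed to build image f472c80affa4. Most recent snapshot is @fb70baa07e6e' | last_snapshot
# prints: fb70baa07e6e
```

The function only reads text, so it works equally well on a saved log file, e.g. last_snapshot < build.log, where build.log is a hypothetical file holding the captured build output.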

Tip

Every time a snapshot is created during a build, Kleene prints a message of the form --> Snapshot created: @<image-id>, as seen in the failed build output above. You can create new images and containers using these image snapshots as parent images.

Kleene has saved the state of the failed build as an image with the nametag <name-supplied>:failed. In this example:
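To see which instruction each snapshot belongs to, the Step lines and the Snapshot created lines of a build log can be paired up. This is a rough sketch under the assumption that the log follows the format shown above; list_snapshots is a hypothetical helper name, not a Kleene command.

```shell
# Print "<snapshot-id><TAB><instruction>" for every snapshot found in a
# build log read from stdin.
# Assumption: log lines look like "Step N/M : <instruction>" and
# "--> Snapshot created: @<id>".
list_snapshots() {
    awk '
        /^Step [0-9]+\/[0-9]+ : / {
            step = $0
            sub(/^Step [0-9]+\/[0-9]+ : /, "", step)   # keep only the instruction text
        }
        /^--> Snapshot created: @/ {
            id = $0
            sub(/^--> Snapshot created: @/, "", id)
            sub(/[ \t\r]+$/, "", id)                   # drop trailing whitespace
            printf "%s\t%s\n", id, step
        }
    '
}

# Example with an excerpt of the build output:
list_snapshots <<'EOF'
Step 2/7 : RUN pkg install -y postgresql16-server
--> Snapshot created: @c069e69af8cf
Step 3/7 : RUN sysrc postgresql_enable=yes
postgresql_enable:  -> yes
--> Snapshot created: @fb70baa07e6e
EOF
```

Each output line names a snapshot and the instruction that had just finished when it was taken, which makes it easier to choose a parent image for the next test build.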

$ klee lsi
 ID             NAME         TAG      CREATED            
─────────────────────────────────────────────────────────
 f472c80affa4   PostgreSQL   failed   About a minute ago 
 b905ae354338   FreeBSD      latest   5 months ago 

If the tag already exists, the existing image will be untagged.

After a bit of research, we discover that PostgreSQL needs a specific kernel functionality (System V IPC, as the failed shmget call suggests) that is disabled for containers by default. It can be enabled with the jail parameter allow.sysvipc. We rebuild from the last valid snapshot of our previous (failed) build by modifying our draft Dockerfile:

FROM PostgreSQL:failed@fb70baa07e6e
#FROM FreeBSD:latest
#RUN pkg install -y postgresql16-server
#RUN sysrc postgresql_enable=yes
RUN service postgresql initdb
RUN service postgresql start
RUN psql -c "CREATE DATABASE my_db;"
RUN service postgresql stop

We use the snapshot as the parent image in the FROM instruction and comment out the instructions that worked as expected in the previous build. We rebuild with:

$ klee build -J allow.sysvipc -t PostgreSQL .
......................
... <build output> ...
......................
--> Snapshot created: @0b4c07e5d8ad
Step 4/5 : RUN psql -c "CREATE DATABASE my_db;"
psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL:  role "root" does not exist
The command '/bin/sh -c psql -c "CREATE DATABASE my_db;"' returned a non-zero code: 2
Failed to build image 5db4e03a7a4e. Most recent snapshot is @0b4c07e5d8ad

The RUN service postgresql initdb instruction now runs successfully, but a new error occurs in RUN psql -c "CREATE DATABASE my_db;". The message points to the cause: psql is run as root, which has no PostgreSQL role. We start an interactive container based on the failed build to verify our intended fix:

$ klee run -J allow.sysvipc -it PostgreSQL:failed /bin/sh
# cat /etc/passwd 
# $FreeBSD$
# 
# Lines omitted for brevity
root:*:0:0:Charlie &:/root:/bin/csh
toor:*:0:0:Bourne-again Superuser:/root:
daemon:*:1:1:Owner of many system processes:/root:/usr/sbin/nologin
tests:*:977:977:Unprivileged user for tests:/nonexistent:/usr/sbin/nologin
nobody:*:65534:65534:Unprivileged user:/nonexistent:/usr/sbin/nologin
postgres:*:770:770:PostgreSQL Daemon:/var/db/postgres:/bin/sh
# service postgresql start
2024-02-20 21:41:19.415 UTC [10012] LOG:  ending log output to stderr
2024-02-20 21:41:19.415 UTC [10012] HINT:  Future log output will go to log destination "syslog".
# su postgres
$ psql -c "CREATE DATABASE my_db;" 
CREATE DATABASE

Great! We just need to switch to the postgres user when using psql. We rebuild again with an updated Dockerfile containing this fix, expressed as a pair of USER instructions:

FROM PostgreSQL:failed@0b4c07e5d8ad
#FROM FreeBSD:latest
#RUN pkg install -y postgresql16-server
#RUN sysrc postgresql_enable=yes
#RUN service postgresql initdb
RUN service postgresql start
USER postgres
RUN psql -c "CREATE DATABASE my_db;"
USER root
RUN service postgresql stop

We also changed the FROM instruction to point at the snapshot taken after RUN service postgresql initdb finished, so the build starts right after the database has been initialized.

Note that even though the nametag PostgreSQL:failed remains the same, it now points to a different image: the latest failed build. The previous image is still listed by klee lsi, but without a nametag.

We rebuild from the last snapshot, and this time the build should complete successfully.
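Both Dockerfile edits so far follow the same pattern: put the snapshot in a new FROM line and comment out everything the snapshot already covers. A small script can do that mechanically. The sketch below is an illustration only; resume_dockerfile is a hypothetical helper, and it assumes one instruction per line (no backslash continuations). New instructions, such as the USER lines above, still have to be added by hand.

```shell
# Usage: resume_dockerfile PARENT NDONE < Dockerfile
# Emit a Dockerfile that starts FROM the image PARENT and comments out the
# first NDONE instructions (the original FROM plus the steps that already
# succeeded). Assumption: every instruction fits on a single line.
resume_dockerfile() {
    awk -v parent="$1" -v ndone="$2" '
        BEGIN { print "FROM " parent }
        /^[ \t]*(#|$)/ { print; next }           # pass comments and blank lines through
        ++seen <= ndone { print "#" $0; next }   # already covered by the snapshot
        { print }                                # still needs to run
    '
}

# Example: resume after initdb, i.e. skip FROM and the first three RUN steps.
resume_dockerfile 'PostgreSQL:failed@0b4c07e5d8ad' 4 <<'EOF'
FROM FreeBSD:latest
RUN pkg install -y postgresql16-server
RUN sysrc postgresql_enable=yes
RUN service postgresql initdb
RUN service postgresql start
RUN psql -c "CREATE DATABASE my_db;"
RUN service postgresql stop
EOF
```

The output keeps the skipped instructions as comments, so the original Dockerfile is easy to restore once the remaining steps work.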

Finally, we assemble the complete Dockerfile:

FROM FreeBSD:latest
RUN IGNORE_OSVERSION=yes pkg install -y postgresql16-server
RUN sysrc postgresql_enable=yes
RUN service postgresql initdb
RUN service postgresql start
USER postgres
RUN psql -c "CREATE DATABASE my_db;"
USER root
RUN service postgresql stop

followed by a final build:

$ klee build -J allow.sysvipc -t PostgreSQL .
......................
... <build output> ...
......................
Step 5/9 : RUN service postgresql start
2024-02-20 22:36:48.269 UTC [11277] LOG:  ending log output to stderr
2024-02-20 22:36:48.269 UTC [11277] HINT:  Future log output will go to log destination "syslog".
--> Snapshot created: @4df41bda8331
Step 6/9 : USER postgres
Step 7/9 : RUN psql -c "CREATE DATABASE my_db;"
CREATE DATABASE
--> Snapshot created: @af986b79c969
Step 8/9 : USER root
Step 9/9 : RUN service postgresql stop
--> Snapshot created: @5bb4138c513c

image created
e64cec8b977e
$

Voilà!