Ludovic Courtès July 10, 2020, 9:05 p.m. UTC | #1

Marius Bakke <marius@gnu.org> skribis:

> From: Marius Bakke <mbakke@fastmail.com>
> * website/drafts/ganeti-cluster-on-guix.md: New file.

Very nice!  There’s a couple of FIXME/TODO links that you’ll have to
address, but other than that I found it interesting and pleasant to read.

> +(Note: if you are looking for a way to run just a few virtual machines on
> +your local computer, you are probably better off using
> +[libvirt](https://guix.gnu.org/manual/en/guix.html#index-libvirt) or even
> +a [Childhurd](https://guix.gnu.org/manual/devel/en/guix.html#index-hurd_002dvm_002dservice_002dtype), as Ganeti is fairly heavyweight and requires a complicated networking
> +setup.)

Thumbs up for the Hurd plug.  :-)


+The latest addition to Guix's ever-growing list of services is a little-known
+virtualization toolkit called [Ganeti](http://www.ganeti.org/).  Ganeti is
+designed to keep virtual machines running on a cluster of servers even in the
+event of hardware failures, and to make maintenance and recovery tasks easy.
+It is comparable to tools such as
+[Proxmox](https://www.proxmox.com/en/proxmox-ve) or
+[oVirt](https://www.ovirt.org/), but has some distinctive features.  One is
+that there is no GUI: [third-party](https://github.com/osuosl/ganeti_webmgr)
+[ones](https://github.com/sipgate/ganeti-control-center) exist, but are not
+currently packaged in Guix, so you are left with a rich command-line client
+and a fully featured
+[remote API](http://docs.ganeti.org/ganeti/master/html/rapi.html).
+Another interesting feature is that installing Ganeti on its own leaves you
+no way to actually deploy any virtual machines.  That probably sounds crazy,
+but stems from the fact that Ganeti is designed to be API-driven and automated,
+thus it comes with an
+[OS API](http://docs.ganeti.org/ganeti/master/html/man-ganeti-os-interface.html)
+and users need to install one or more *OS providers* in addition to Ganeti.
+OS providers offer a declarative way to deploy virtual machine variants and
+should feel natural to Guix users.  At the time of writing, the providers
+available in Guix are [debootstrap](https://github.com/ganeti/instance-debootstrap)
+for provisioning Debian- and Ubuntu-based VMs, and of course a
+[Guix](https://github.com/mbakke/ganeti-instance-guix) provider.
+Finally, Ganeti comes with a sophisticated scheduler that efficiently packs
+virtual machines across a cluster while maintaining N+1 redundancy in case
+of a failover scenario.  It can also make informed scheduling decisions
+based on various cluster tags, such as ensuring primary and secondary nodes
+are on different power distribution lines.
+(Note: if you are looking for a way to run just a few virtual machines on
+your local computer, you are probably better off using
+[libvirt](https://guix.gnu.org/manual/en/guix.html#index-libvirt) or even
+a [Childhurd](https://guix.gnu.org/manual/devel/en/guix.html#index-hurd_002dvm_002dservice_002dtype), as Ganeti is fairly heavyweight and requires a complicated networking
+setup.)
+# Preparing the configuration
+With introductions out of the way, let's see how we can deploy a Ganeti
+cluster using Guix.  For this tutorial we will create a two-node cluster
+and connect instances to the local network using an
+[Open vSwitch](https://www.openvswitch.org/) bridge with no VLANs.  We assume
+that each node has a single network interface named `eth0` connected to the
+same network, and that a dedicated partition `/dev/sdz3` is available for
+virtual machine storage.  It is possible to store VMs on a number of other
+storage backends, but a dedicated drive (or rather LVM volume group) is
+necessary to use the [DRBD](https://www.linbit.com/drbd/) integration to
+replicate VM disks.
+We'll start off by defining a few helper services to create the Open vSwitch
+bridge and ensure the physical network interface is in the "up" state.  Since
+Open vSwitch stores the configuration in a database, you might as well run the
+equivalent `ovs-vsctl` commands on the host once and be done with it, but we
+do it through the configuration system to ensure we don't forget it in the
+future when adding or reinstalling nodes.
+(define (start-interface if)
+  #~(let ((ip (string-append #$iproute "/sbin/ip")))
+      (invoke/quiet ip "link" "set" #$if "up")))
+(define (stop-interface if)
+  #~(let ((ip (string-append #$iproute "/sbin/ip")))
+      (invoke/quiet ip "link" "set" #$if "down")))
+;; This service is necessary to ensure eth0 is in the "up" state on boot
+;; since it is otherwise unmanaged from Guix's point of view.
+(define (ifup-service if)
+  (let ((name (string-append "ifup-" if)))
+    (simple-service name shepherd-root-service-type
+                    (list (shepherd-service
+                           (provision (list (string->symbol name)))
+                           (start #~(lambda ()
+                                      #$(start-interface if)))
+                           (stop #~(lambda ()
+                                     #$(stop-interface if)))
+                           (respawn? #f))))))
+(define* (create-openvswitch-bridge bridge uplink
+                                    #:key (vlan-mode #f))
+  #~(let ((ovs-vsctl (lambda (cmd)
+                       (apply invoke/quiet
+                              #$(file-append openvswitch "/bin/ovs-vsctl")
+                              (string-tokenize cmd)))))
+      (and (ovs-vsctl (string-append "--may-exist add-br " #$bridge))
+           (ovs-vsctl (string-append "--may-exist add-port " #$bridge " "
+                                     #$uplink
+                                     (if #$vlan-mode
+                                         (format #f " vlan_mode=~a " #$vlan-mode)
+                                         ""))))))
+(define* (create-openvswitch-internal-port bridge port
+                                           #:key (vlan-mode #f))
+  #~(invoke/quiet #$(file-append openvswitch "/bin/ovs-vsctl")
+                  "--may-exist" "add-port" #$bridge #$port
+                  (if #$vlan-mode
+                      (string-append "vlan_mode=" #$vlan-mode)
+                      "")
+                  "--" "set" "Interface" #$port "type=internal"))
+(define %openvswitch-configuration-service
+  (simple-service 'openvswitch-configuration shepherd-root-service-type
+                  (list (shepherd-service
+                         (provision '(openvswitch-configuration))
+                         (requirement '(vswitchd))
+                         (start #~(lambda ()
+                                    #$(create-openvswitch-bridge
+                                       "br0" "eth0"
+                                       #:vlan-mode "native-untagged")
+                                    #$(create-openvswitch-internal-port
+                                       "br0" "gnt0"
+                                       #:vlan-mode "native-untagged")))
+                         (respawn? #f)))))
+This defines an `openvswitch-configuration` service object that creates a
+logical switch `br0`, connects `eth0` as the "uplink", and creates a logical
+port `gnt0` that we will use later as the main network interface for this
+system.  We also create an `ifup` service that can bring network interfaces
+up and down.  By themselves these variables do nothing; we also have to add
+them to our `operating-system` configuration below.
+A configuration like this might be suitable for a small home network.  In most
+"real world" deployments you would use tagged VLANs, and maybe a traditional
+Linux bridge instead of Open vSwitch.  You can also forgo bridging altogether
+with a `routed` networking setup, or do any combination of the three.
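+As a rough sketch of the plain Linux bridge alternative, a helper in the
+style of `start-interface` could create the bridge with `ip`.  The
+`create-linux-bridge` name is made up for this illustration and is not part
+of Guix; the snippet assumes `invoke/quiet` is available in the service's
+start gexp, just like in the helpers above:
+```scheme
+(use-modules (guix gexp)
+             (gnu packages linux))        ;provides the 'iproute' package
+;; Hypothetical helper (not part of Guix): create a Linux bridge named
+;; BRIDGE, attach UPLINK to it, and bring the bridge up.
+(define (create-linux-bridge bridge uplink)
+  #~(let ((ip #$(file-append iproute "/sbin/ip")))
+      ;; "link add" fails if BRIDGE already exists, so this is not idempotent.
+      (invoke/quiet ip "link" "add" "name" #$bridge "type" "bridge")
+      (invoke/quiet ip "link" "set" #$uplink "master" #$bridge)
+      (invoke/quiet ip "link" "set" #$bridge "up")))
+```
+Such a gexp would go in the `start` field of a Shepherd service shaped like
+`%openvswitch-configuration-service`, minus the `vswitchd` requirement.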
+With this in place, we can start creating the `operating-system` configuration
+that we will use for the Ganeti servers:
+(operating-system
+  (host-name "node1")
+  [...]
+  ;; Ganeti requires that each node and the cluster address resolves to an
+  ;; IP address.  The easiest way to achieve this is by adding everything
+  ;; to the hosts file.
+  (hosts-file (plain-file "hosts" (format #f "\
+127.0.0.1       localhost
+::1             localhost
+<node1-ip>      node1
+<node2-ip>      node2
+<cluster-ip>    ganeti.lan
+")))
+  (kernel-arguments
+   (append %default-kernel-arguments
+           '(;; Disable DRBD's usermode helper, as Ganeti
+             ;; is the only thing that should manage DRBD.
+             "drbd.usermode_helper=/run/current-system/profile/bin/true")))
+  (packages (append (map specification->package
+                         '("qemu" "drbd-utils" "lvm2"
+                           "ganeti-instance-guix"
+                           "ganeti-instance-debootstrap"))
+                     %base-packages))
+  (services (cons* (service ganeti-service-type
+                            (ganeti-configuration
+                             (file-storage-paths '("/srv/ganeti/file-storage"))
+                             (os
+                              (list (ganeti-os
+                                     (name "debootstrap")
+                                     (variants
+                                      (list (debootstrap-variant
+                                             "buster"
+                                             (debootstrap-configuration
+                                              (hooks
+                                               (local-file
+                                                "debootstrap-hooks"
+                                                #:recursive? #t))))
+                                            (debootstrap-variant
+                                             "testing+contrib"
+                                             (debootstrap-configuration
+                                              (suite "testing")
+                                              (components '("main" "contrib")))))))))))
+                    ;; Create a static IP on the "gnt0" Open vSwitch interface.
+                   (service openvswitch-service-type)
+                   %openvswitch-configuration-service
+                   (ifup-service "eth0")
+                   (static-networking-service "gnt0" ""
+                                              #:netmask ""
+                                              #:gateway ""
+                                              #:requirement '(openvswitch-configuration)
+                                              #:name-servers '(""))
+                   ;; Ganeti needs SSH to communicate between nodes.
+                   (service openssh-service-type
+                            (openssh-configuration
+                             (permit-root-login 'without-password)))
+                   %base-services)))
+Debootstrap variants rely on a set of scripts (known as "hooks") in the
+installation process to do things like configure networking, install a bootloader,
+create users, etc.  In the example above, the "buster" variant will use a local
+directory next to the configuration file named "debootstrap-hooks" (it is copied
+into the final system closure), whereas the "testing+contrib" variant has no hooks
+defined and will use `/etc/ganeti/instance-debootstrap/hooks` if it exists.
+Ganeti veterans may be surprised that each OS variant has its own hooks.  All
+Ganeti clusters I know of use a single set of hooks for all variants, sometimes
+with additional logic inside the script based on the variant.  Guix offers a
+powerful abstraction that makes it trivial to create per-variant hooks, obsoleting
+the need for a big `/etc/ganeti/instance-debootstrap/hooks` directory.  Of course
+you can still create it using `extra-special-file` and leave the `hooks` property
+of the variants as `#f`.
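+A minimal sketch of that fallback, assuming a `debootstrap-hooks` directory
+sits next to the configuration file (the `%shared-debootstrap-hooks` name is
+only illustrative):
+```scheme
+(use-modules (gnu) (guix gexp))
+;; Publish one shared hooks directory at the path instance-debootstrap
+;; falls back to when a variant has no hooks of its own.
+(define %shared-debootstrap-hooks
+  (extra-special-file "/etc/ganeti/instance-debootstrap/hooks"
+                      (local-file "debootstrap-hooks" #:recursive? #t)))
+```
+The resulting service object would be added to the `services` list of the
+`operating-system`, next to `(ifup-service "eth0")`.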
+Not all Ganeti options are exposed in the configuration system yet.  If you
+find it limiting, you can add custom files using `extra-special-file`, or
+ideally extend the `<ganeti-configuration>` data type to suit your needs.
+Of course you can use `gnt-cluster copyfile` and `gnt-cluster command`
+to distribute files or run executables, but beware that undeclared changes
+in `/etc` may be lost on the next reboot or reconfigure.
+# Initializing a cluster
+At this stage, you should run `guix system reconfigure` with the new
+configuration on all nodes that will participate in the cluster.  If you
+do this over SSH or with
+[guix deploy](https://guix.gnu.org/blog/2019/managing-servers-with-gnu-guix-a-tutorial/),
+beware that `eth0` will lose network connectivity once it is "plugged in to"
+the virtual switch, and you need to add any IP configuration to `gnt0`.
+The Guix configuration system does not currently support declaring LVM
+volume groups, so we will create these manually on each node.  We could
+write our own declarative configuration like the `ifup-service`, but for
+brevity and safety reasons we'll do it "by hand":
+pvcreate /dev/sdz3
+vgcreate ganetivg /dev/sdz3
+On the node that will act as the "master node", run the init command:
+gnt-cluster init \
+    --master-netdev=gnt0 \
+    --vg-name=ganetivg \
+    --enabled-disk-templates=file,plain,drbd \
+    --drbd-usermode-helper=/run/current-system/profile/bin/true \
+    --enabled-hypervisors=kvm \
+    --no-etc-hosts \
+    --no-ssh-init \
+    ganeti.lan
+If you are okay with Ganeti taking control over SSH `authorized_keys` and
+`known_hosts`, remove the `--no-ssh-init` option.  Guix users might prefer
+to manage the relevant files using `openssh-configuration`.  All nodes in
+the cluster must be able to reach each other over SSH as the root user.
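+For the declarative route, something along these lines might work.  It is
+only a sketch: `ganeti-root.pub` is a hypothetical file holding the public
+keys of the root user on every node, and `%ganeti-openssh-service` is an
+illustrative name:
+```scheme
+(use-modules (gnu) (guix gexp))
+(use-service-modules ssh)
+;; Authorize the nodes' root keys from the system configuration instead
+;; of letting Ganeti edit 'authorized_keys' itself.
+(define %ganeti-openssh-service
+  (service openssh-service-type
+           (openssh-configuration
+            (permit-root-login 'without-password)
+            (authorized-keys
+             `(("root" ,(local-file "ganeti-root.pub")))))))
+```
+This would take the place of the `openssh-service-type` entry in the
+`services` list shown earlier.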
+Similarly, Ganeti can update the `/etc/hosts` file when nodes are added or
+removed, but it makes little sense on Guix as it is recreated on every reboot.
+If all goes well, the command returns no output and you should have the
+`ganeti.lan` IP address visible on `gnt0`.  You can run `gnt-cluster verify`
+to check that the cluster is in good shape.  Most likely it complains about
+the default KVM kernel being missing.
+Use `gnt-cluster modify` to change the running state of the cluster:
+gnt-cluster modify -H kvm:kernel_path=
+The command above removes the warning about the default KVM kernel being
+missing, making `gnt-cluster verify` happy.  For this tutorial we only use
+fully virtualized instances, but users might want to set `kernel_path` to a
+suitable VM kernel.
+Now let's add our other machine to the cluster:
+gnt-node add node2
+Ganeti will log into the node, copy the cluster configuration and start the
+relevant Shepherd services.  No output means the command succeeded.  Run
+`gnt-cluster verify` again to check that everything is in order:
+gnt-cluster verify
+If you get warnings about SSH authorizations here, you should fix those
+before proceeding.  If you used `--no-ssh-init` earlier you may need to
+update `/var/lib/ganeti/known_hosts` with the new node information, either
+with `gnt-cluster copyfile` or by adding it to the OS configuration.
+The above configuration will make three operating systems available:
+# gnt-os list
+Let's try them out.  But first we'll make Ganeti aware of our network
+so it can choose a static IP for the virtual machines.
+# gnt-network add --network= --gateway= lan
+# gnt-network connect -N mode=openvswitch,link=br0 lan
+Now we can add an instance:
+gnt-instance add --no-name-check --no-ip-check -o debootstrap+buster \
+    -t drbd --disk 0:size=5G  -B memory=256m,vcpus=2 \
+    --net 0:network=lan,ip=pool bustervm1
+Ganeti will automatically select the optimal primary and secondary node
+for this VM based on available cluster resources.  You can manually
+specify primary and secondary nodes with the `-n` and `-s` options.
+By default Ganeti assumes that the new instance is already configured in DNS,
+so we need `--no-name-check` and `--no-ip-check` to bypass some sanity tests.
+Try adding another instance, now using the Guix OS provider:
+gnt-instance add --no-name-check --no-ip-check -o guix \
+    -t plain --disk 0:size=5G -B memory=1G,vcpus=4 \
+    --net 0:network=lan,ip=pool guix1
+The Guix OS has a built-in configuration that starts an SSH server and authorizes
+the host's SSH key, and configures static networking based on information from
+Ganeti.  It is possible to specify a custom configuration file, and even a
+specific Guix commit:
+gnt-instance add --no-name-check --no-ip-check -o guix \
+    -t file --file-storage-dir=/srv/ganeti/file-storage \
+    --disk 0:size=20G -B memory=4G,vcpus=3 \
+    --net 0:network=lan,ip=pool \
+    -O "config=$(base64 /the/config/file.scm),commit=<commit>" \
+    custom-guix
+That's it for this tutorial!  If you are new to Ganeti, you should
+familiarize yourself with the `gnt-` family of commands.  Fun things to
+try include `gnt-instance migrate` to move VMs between hosts,
+`gnt-node evacuate` to migrate _all_ VMs off a node, and
+`gnt-cluster master-failover` to move the master role to a different node.
+# Final remarks
+Like most services in Guix, Ganeti comes with a
+[system test](https://guix.gnu.org/blog/2016/guixsd-system-tests/)
+that [runs in a VM](FIXME) and ensures that things like initializing a cluster
+work.  The continuous integration system
+[runs this automatically](https://ci.guix.gnu.org/search?query=ganeti), and
+users can run it locally with `make check-system TESTS=ganeti`.  Such
+tests give us confidence that both the package and configuration system work,
+and allow rapid testing of the configuration API.  Currently it does little
+more than `gnt-cluster verify`, but it can be extended to provision a real
+cluster inside Ganeti and try things like live migration.
+The author had a lot of fun creating
+[native data types](FIXME manual link)
+in the Guix configuration system for the Ganeti OS specification.  The API
+went through at least three major revisions during the writing of this blog
+post.  There is still room for improvement, but I decided I had to stop
+tweaking it and instead focus on shipping the thing.  Feedback welcome!
+Having OS support in the configuration system lets us benefit from Guix's
+provenance tracking and we can easily `guix system roll-back` any breaking
+changes.  Ganeti is usually coupled with tools such as Puppet or SaltStack to
+keep things in sync between nodes, but that should not be necessary here.
+So far only the `KVM` hypervisor has been tested.  If you use LXC or Xen with
+Ganeti, please reach out to `guix-devel@gnu.org` and share your experience.
