Andreas Sommer – I'm a software engineer – Blog Writing about research and programming. Sun, 22 Apr 2018 00:00:00 -0000 Setting up buildbot in FreeBSD jails <div id="preamble"> <div class="sectionbody"> <div class="paragraph"> <p>In this article, I would like to present a tutorial to set up <a href="">buildbot</a>, a continuous integration (CI) software (like Jenkins, drone, etc.), making use of FreeBSD&#8217;s containerization mechanism "jails". We will cover terminology, rationale for using both buildbot and jails together, and installation steps. At the end, you will have a working buildbot instance using its sample build configuration, ready to play around with your own CI plans (or even CD, it&#8217;s very flexible!). Some hints for production-grade installations are given, but the tutorial steps are meant for a test environment (namely a virtual machine). Buildbot&#8217;s configuration and detailed concepts are not in scope here.</p> </div> <h2 id="_table_of_contents" class="discrete dog-blog-breakpoint">Table of contents</h2> <div id="toc" class="toc"> <div id="toctitle" class="title"></div> <ul class="sectlevel1"> <li><a href="#_choosing_host_operating_system_and_version_for_buildbot">Choosing host operating system and version for buildbot</a></li> <li><a href="#_create_a_freebsd_playground">Create a FreeBSD playground</a></li> <li><a href="#_introduction_to_jails">Introduction to jails</a></li> <li><a href="#_overview_of_buildbot">Overview of buildbot</a></li> <li><a href="#_set_up_jails">Set up jails</a></li> <li><a href="#_install_buildbot_master">Install buildbot master</a></li> <li><a href="#_run_buildbot_master">Run buildbot master</a></li> <li><a href="#_install_buildbot_worker">Install buildbot worker</a></li> <li><a href="#_run_buildbot_worker">Run buildbot worker</a></li> <li><a href="#_set_up_web_server_nginx_to_access_buildbot_ui">Set up web server
nginx to access buildbot UI</a></li> <li><a href="#_run_your_first_build">Run your first build</a></li> <li><a href="#_production_hints">Production hints</a></li> <li><a href="#_finished">Finished!</a></li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="_choosing_host_operating_system_and_version_for_buildbot">Choosing host operating system and version for buildbot</h2> <div class="sectionbody"> <div class="paragraph"> <p>We choose the released version of FreeBSD (<code>11.1-RELEASE</code> at the moment). There is no particular reason for it, and as a matter of fact buildbot as a Python-based server is very cross-platform; therefore the underlying OS platform and version should not make a large difference.</p> </div> <div class="paragraph"> <p>It will make a difference for what you do with buildbot, however. For instance, <a href="">poudriere</a> is the de-facto standard for building packages from source on FreeBSD. Builds run in jails which may be any FreeBSD base system version older or equal to the host&#8217;s version (reason will be explained below). In other words, if the host is FreeBSD 11.1, build jails created by poudriere could e.g. use 9.1, 10.3, 11.0, 11.1, but potentially not version 12 or newer because of incompatibilities with the host&#8217;s kernel (jails do not run their own kernel as full virtual machines do). To not prolong this article over the intended scope, the details of which nice things could be done or automated with buildbot are not covered.</p> </div> <div class="paragraph"> <p>Package names on the FreeBSD platform are independent of the OS version, since external software (as in: not part of base system) is maintained in <a href="">FreeBSD ports</a>. So, if your chosen FreeBSD version (here: 11) is still <a href="">officially supported</a>, the packages mentioned in this post should work. 
In the unlikely event of package name changes before you read this article, you should be able to find the actual package names like <code>pkg search buildbot</code>.</p> </div> <div class="paragraph"> <p>Other operating systems like the various Linux distributions will use different package names but might also offer buildbot pre-packaged. If not, the <a href="">buildbot installation manual</a> offers steps to install it manually. In such case, the downside is that you will have to maintain and update the buildbot modules outside the stability and (semi-)automatic updates of your OS packages.</p> </div> </div> </div> <div class="sect1"> <h2 id="_create_a_freebsd_playground">Create a FreeBSD playground</h2> <div class="sectionbody"> <div class="paragraph"> <p><a href="">Vagrant</a> is a popular tool to quickly set up virtual machines from pre-built images. We are using it here for simplicity. Any form of test environment or virtual machine would suffice. If you choose to follow along using Vagrant, please install it and ensure you have a compatible hypervisor installed as well in order to run a virtual machine (for instance <a href="">VirtualBox</a>).</p> </div> <div class="paragraph"> <p>Official and nightly <a href="">FreeBSD images for Vagrant</a> are available. With the following commands, we create a new directory for the playground virtual machine (called "VM" from here on) and then use Vagrant to download the FreeBSD 11.1-RELEASE image. 
Ensure you have enough disk space: the image presented here is around 1.4 GB, and you additionally need to allocate space for the VM.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh">mkdir -p ~/vagrant/freebsd-11.1-buildbot
cd ~/vagrant/freebsd-11.1-buildbot
vagrant init freebsd/FreeBSD-11.1-RELEASE</code></pre> </div> </div> <div class="paragraph"> <p>After <code>vagrant init</code>, the image is available to create new VMs and a <code>Vagrantfile</code> was created in the current directory. We must edit the file, because the metadata (contained in what Vagrant calls a "box" = disk image + metadata) is missing two pieces of information: base MAC address and shell (see <a href="">bug report</a>). Vagrant&#8217;s default shell is <code>bash -l</code>, but FreeBSD does not ship bash in its base system; hence we use <code>sh</code>. Also, we will disable synced folders as we will not need them here and they do not work out of the box (literally!). Without the commented sample configurations, the file should look as follows:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-ruby" data-lang="ruby">Vagrant.configure("2") do |config|
  config.vm.box = "freebsd/FreeBSD-11.1-RELEASE"
  config.ssh.shell = "/bin/sh"
  config.vm.base_mac = "080027D14C66"
  config.vm.synced_folder ".", "/vagrant", disabled: true
  config.vm.network "forwarded_port", guest: 80, host: 8999
end</code></pre> </div> </div> <div class="paragraph"> <p>Now let&#8217;s provision the virtual machine:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh">vagrant up</code></pre> </div> </div> <div class="paragraph"> <p>If you see messages like <code>Warning: Connection reset.
Retrying&#8230;&#8203;</code> for a while, keep hanging on&#8201;&#8212;&#8201;the official FreeBSD image by default connects to the Internet on first startup in order to fetch and install the latest updates. This can take a few minutes and several VM reboots.</p> </div> <div class="paragraph"> <p>Once the VM has fully booted, we can drop into a terminal via SSH. Vagrant handles the connection details for us:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh">vagrant ssh</code></pre> </div> </div> <div class="paragraph"> <p>Remember we set <code>/bin/sh</code> as shell in the <code>Vagrantfile</code>? Confusingly, Vagrant 2.0.3 needs this setting to work (else it fails while bringing up the virtual machine), but then totally ignores the setting and we find ourselves in <code>csh</code>, the default shell configured for the connecting user account 🙄. You can recognize it from its default <code>vagrant@freebsd:~ % </code> shell prompt (sh uses <code>$ </code> without extra information), or type <code>ps -p $$</code> to show details about the shell itself (where <code>$$</code> resolves to the shell process ID in all popular shells). If you are more familiar with a different shell, you could for example install and use bash like so: <code>sudo pkg install bash &amp;&amp; chsh &amp;&amp; sudo chsh</code>. If you decide to stick to the default shell <code>csh</code>, ensure you do not copy-and-paste example shell command lines starting with <code>#</code>, as those are <em>not</em> interpreted as comments in interactive csh shells.</p> </div> </div> </div> <div class="sect1"> <h2 id="_introduction_to_jails">Introduction to jails</h2> <div class="sectionbody"> <div class="paragraph"> <p>FreeBSD has supported the concept of jails since the start of its 4.x release series in the year 2000.
This is well before its modern competitors LXC/Docker/rkt and&#8201;&#8212;&#8201;like most other mechanisms&#8201;&#8212;&#8201;OS-specific. Some people say that jails are more mature. Since I have not worked with any Linux container mechanisms after OpenVZ many years back, I cannot give any experience or comparison here, and in any case it would probably be apples vs. pears; I like pears when they have lain around a little and gone soft.</p> </div> <div class="paragraph"> <p>Jails work like a full FreeBSD environment, but access to the outer system&#8217;s resources is restricted. For example, a jail may only listen on a network interface and IP address that was assigned to it. Filesystem access and other permissions like mounting of filesystems are (configurably) limited as well (similar to a <a href="">chroot environment</a>). The performance difference of running software in a jail vs. directly on the jailhost is usually not noticeable (somewhat related study: <a href="">packet routing performance analysis</a> by Olivier Cochard-Labbé at EuroBSDcon 2017).</p> </div> <div class="paragraph"> <p>No other operating systems like Linux or Windows can be run in a jail, because the kernel is shared among the jailhost (this is what I will call the outer operating system in this article) and all jails. For the same reason, running e.g. FreeBSD 12 in a jail&#8201;&#8212;&#8201;while the host is still on FreeBSD 11&#8201;&#8212;&#8201;might not work because software built for the newer OS version may expect a different kernel interface and crash if run with the older kernel.</p> </div> </div> </div> <div class="sect1"> <h2 id="_overview_of_buildbot">Overview of buildbot</h2> <div class="sectionbody"> <div class="paragraph"> <p>Buildbot is very versatile software. While I mentioned its main use as a CI (Continuous Integration) and probably even CD (Continuous Delivery/Deployment) platform, it could theoretically do <em>any</em> automated task that runs on a computer.
It just so happens that the "batteries included" are mostly related to building software. If you need something else, you can easily write build steps and other things in your Python-based master configuration file.</p> </div> <div class="paragraph"> <p>The main components to understand are the <strong>buildbot master</strong> and <strong>buildbot worker</strong>:</p> </div> <div class="ulist"> <ul> <li> <p><strong>buildbot master</strong>: component which parses all build configuration and other settings (notification e-mails, change sources such as Git repositories, when builds are triggered/scheduled, etc.) and distributes the actual builds to its workers.</p> </li> <li> <p><strong>buildbot worker</strong>: a dumb component which only has connection details as configuration and gets all other commands from the master, namely to run builds. There can be several workers, and in large production setups, it makes a lot of sense to put them onto powerful, separate servers. Ephemeral workers (buildbot calls them "latent workers"), i.e. dynamically created and destroyed instances, are another option, and support for several cloud providers and hypervisors is included. In this article, we will start small and set up a single, jailed worker which may be enough for your first steps with buildbot. You can later easily add/move workers somewhere else if you see the need.</p> </li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="_set_up_jails">Set up jails</h2> <div class="sectionbody"> <div class="paragraph"> <p>Jails are a cheap way to semantically (and security-wise) separate applications or groups of them. If we later want to move the buildbot worker component or clone it, it is easiest to have the worker&#8201;&#8212;&#8201;and nothing else&#8201;&#8212;&#8201;in a jail.</p> </div> <div class="paragraph"> <p>We begin by installing <code>ezjail</code>, a very popular and stable wrapper around FreeBSD&#8217;s <a href="">jail</a> functionality.
It makes creation and administration of jails much easier.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh">sudo pkg install ezjail
# Create directory structure and "base jail" i.e. extract base
# FreeBSD system to /usr/jails/basejail
sudo ezjail-admin install</code></pre> </div> </div> <div class="paragraph"> <p>Now it&#8217;s time to actually create the jails. Since the master offers a web UI and the worker talks to the master, both need IP addresses assigned. For simplicity, we choose local-only addresses here (network</p> </div> <div class="paragraph"> <p>Jail networking has several gotchas, one of them being how loopback addresses are handled: namely, when accessing the IP addresses <code></code> and <code>::1</code> inside the jail, the connection does not end up on the jailhost&#8217;s loopback interface (else jails could access their parent&#8217;s services&#8201;&#8212;&#8201;a security hole), but the kernel rewrites those connections to the first IPv4/IPv6 address assigned to the jail. If the first assigned IP address is public and a service in the jail listens on <code></code>, port 1234 will suddenly be publicly accessible! Therefore, the <a href="">recommended practice</a> is to have a separate network interface for jails (you could even have one per jail, but in this tutorial we want the jails to communicate with each other directly). This works by "cloning" lo0 into the new interface lo1.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh"># Add a separate network interface for jails and create the cloned
# interface (automatically happens at next boot as well, no need
# to repeat these steps)
sudo sysrc cloned_interfaces+=lo1
sudo service netif cloneup

# We can assign an IP to the server ("jailhost") as well.
# Needed in this tutorial so jailhost and jails can communicate (we will
# serve buildbot's web user interface with nginx later).
sudo sysrc ifconfig_lo1="inet netmask"

# Set default network interface for jails (if not explicitly configured)
sudo sysrc jail_interface=lo1

# Start ezjail's configured jails on boot
sudo sysrc ezjail_enable=YES

# Actually create our jails
sudo ezjail-admin create -f example master ""
sudo ezjail-admin create -f example worker0 ""

# Start all ezjail-managed jails (will also happen on reboot because
# of ezjail_enable=YES). Please ignore the warning
# "Per-jail configuration via jail_* variables is obsolete" - ezjail
# simply has not been changed yet to use another mechanism.
sudo ezjail-admin start</code></pre> </div> </div> <div class="paragraph"> <p>The jails have successfully started, but to do something useful&#8201;&#8212;&#8201;like installing packages inside&#8201;&#8212;&#8201;we want Internet access from within the jails (at least if you decide to use the official source For that purpose, we set up a NAT networking rule using one of FreeBSD&#8217;s built-in firewalls (or rather: packet filters), <code>pf</code>.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh">sudo tee /etc/pf.conf &lt;&lt;EOF
ext_if = "em0" # external network interface, adapt to your hardware/network if needed
jail_if = "lo1" # the interface we chose for communication between jails

# Allow jails to access Internet via NAT, but avoid NAT within same network so jails can
# communicate with each other
no nat on \$ext_if from (\$jail_if:network) to (\$jail_if:network)
nat on \$ext_if from (\$jail_if:network) to any -&gt; \$ext_if
# Note: above two rules split for clarity -&gt; equivalent to this one-liner:
# nat on \$ext_if from (\$jail_if:network) to !
# (\$jail_if:network) -&gt; \$ext_if

# No restrictions on jail network
set skip on \$jail_if

# Common recommended pf rules, not exactly related to this article
set skip on lo0
block drop in
pass out on \$ext_if

# Don't lock ourselves out from SSH
pass in on \$ext_if proto tcp to \$ext_if port 22
# Allow web access
pass in on \$ext_if proto tcp to \$ext_if port 80
EOF

# Check firewall rules syntax
sudo service pf onecheck

sudo sysrc pf_enable=YES
sudo service pf start</code></pre> </div> </div> <div class="paragraph"> <p>(mind that <code>$</code> must be escaped in the shell here-document and will land in /etc/pf.conf unescaped)</p> </div> <div class="paragraph"> <p>At this point, your SSH connection will stall (and drop after some time) because the firewall has no state entry for your existing connection. To drop out of the hanging terminal, press <code>Enter, ~, .</code> one after another. To understand how this keyboard shortcut closes the SSH session, please read up about escape characters in the <a href=";sektion=1#ESCAPE_CHARACTERS">ssh manpage</a>.
Now, please reconnect to the VM with <code>vagrant ssh</code>.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh"># Check if Internet connection works at all
fetch -o -

# Copy resolv.conf to every jail to allow resolving hostnames
# (note: typically added to your default ezjail flavor)
sudo tee /usr/jails/master/etc/resolv.conf &lt; /etc/resolv.conf
sudo tee /usr/jails/worker0/etc/resolv.conf &lt; /etc/resolv.conf

# Check if Internet connection works from a jail
sudo jexec master fetch -o -</code></pre> </div> </div> </div> </div> <div class="sect1"> <h2 id="_install_buildbot_master">Install buildbot master</h2> <div class="sectionbody"> <div class="paragraph"> <p>Apart from the master, we want to install the web user interface (called "UI" hereinafter) and Git, since that is used in buildbot&#8217;s sample configuration for fetching a source project (the smaller package <code>git-lite</code> should be enough for fetching via the most typical schemes like ssh and https).</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh">sudo pkg -j master install git-lite py36-buildbot py36-buildbot-www
# Alternative which requires installing the package manager `pkg`
# itself inside jail:
# sudo jexec master pkg install git-lite py36-buildbot py36-buildbot-www</code></pre> </div> </div> <div class="paragraph"> <p>We create a regular, unprivileged user to run the buildbot master:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh"># Open a shell inside jail
sudo jexec master sh

# Instead of pw, you can use the interactive command `adduser`. We use a
# random password to protect the account.
# Since we are always root when doing `jexec` into a jail, we can become the
# user without entering the password and therefore can forget which password
# was automatically generated.
pw useradd -n buildbot-master -m -w random

# Create directory for master
mkdir /var/buildbot-master
chown buildbot-master:buildbot-master /var/buildbot-master

# Become unprivileged user
su -l buildbot-master

buildbot create-master /var/buildbot-master
cp /var/buildbot-master/master.cfg.sample /var/buildbot-master/master.cfg

# Switch to root user again (we did `su -l buildbot-master` earlier)
exit</code></pre> </div> </div> <div class="paragraph"> <p>The sample configuration polls a "Hello world" project every few minutes and builds it on changes. Nothing very interesting here, but it explains the principles quite well.</p> </div> <div class="paragraph"> <p>Time to configure something useful, right? Not so fast! Without a worker, no build could run. For now, we copied the sample configuration to get started. In the next steps, we permanently run the master and set up a worker to actually run the builds.</p> </div> </div> </div> <div class="sect1"> <h2 id="_run_buildbot_master">Run buildbot master</h2> <div class="sectionbody"> <div class="paragraph"> <p>The built-in mechanism for running buildbot is simply <code>buildbot start</code>. Since this starts the master only once, we opt for a permanent solution to start on boot. The package maintainers have thought of this and provide an rc script (such scripts manage service start, stop and other subcommands like restart/reload). It can be executed at boot (or more exactly in this tutorial: when the jail is started) to bring up the service.
For that to happen, we only have to enable the service permanently and specify its working directory and user:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh"># Still inside jail shell
sysrc buildbot_enable=YES
sysrc buildbot_basedir=/var/buildbot-master
sysrc buildbot_user=buildbot-master
service buildbot start

# Check log file if you wish
tail /var/buildbot-master/twistd.log</code></pre> </div> </div> <div class="paragraph"> <p>If you are interested in how the rc script starts and stops the service, check its code at <code>/usr/local/etc/rc.d/buildbot</code>.</p> </div> </div> </div> <div class="sect1"> <h2 id="_install_buildbot_worker">Install buildbot worker</h2> <div class="sectionbody"> <div class="paragraph"> <p>If you are still in the buildbot master jail&#8217;s shell, drop out with <code>exit</code>, or alternatively create a new session to the jailhost with <code>vagrant ssh</code>.</p> </div> <div class="paragraph"> <p>As for the master, we first install required packages and then create an unprivileged user. Take care not to mistype <code>buildbot-master</code> for <code>buildbot-worker</code>&#8201;&#8212;&#8201;below, we will only execute commands related to the worker. Git is used in the example builder to fetch the source code for the build. This is not to be confused with the <code>GitPoller</code> on the master, which is a "change source", i.e.
regularly checks if changes exist in a repository; therefore we need Git on both master and worker for our example usage.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh">sudo pkg -j worker0 install git-lite py36-buildbot-worker
# Alternative which requires installing the package manager `pkg`
# itself inside jail:
# sudo jexec worker0 pkg install git-lite py36-buildbot-worker

# Open a shell inside jail
sudo jexec worker0 sh

# Instead of pw, you can use the interactive command `adduser`. We use a
# random password to protect the account. Since we are always root when
# doing `jexec` into a jail, we can become the user without entering the
# password and therefore can forget which password was automatically generated.
pw useradd -n buildbot-worker -m -w random

# Create directory for worker
mkdir /var/buildbot-worker
chown buildbot-worker:buildbot-worker /var/buildbot-worker

# Become unprivileged user
su -l buildbot-worker

buildbot-worker create-worker /var/buildbot-worker example-worker pass

# The output told us to perform some actions manually. Let's obey:
cd /var/buildbot-worker
# Please fill in yourself or the admin
echo "Your Name &lt;;" &gt; info/admin
# Worker description for display in UI
echo "worker0" &gt; info/host

# Switch to root user again (we did `su -l buildbot-worker` earlier)
exit</code></pre> </div> </div> <div class="admonitionblock note"> <table> <tr> <td class="icon"> <div class="title">Note</div> </td> <td class="content"> <div class="paragraph"> <p>Buildbot workers were previously called "slaves" and due to the politically unsound meaning, <a href="">Mozilla assigned a $15000 contribution</a> to take care of the rename, which went from documentation all the way down to source code and package names.
So luckily, I do not have to write about a "slave in a jail" here 👍.</p> </div> </td> </tr> </table> </div> </div> </div> <div class="sect1"> <h2 id="_run_buildbot_worker">Run buildbot worker</h2> <div class="sectionbody"> <div class="paragraph"> <p>We are lucky: buildbot workers do not need any configuration other than the connection details because the master handles all logic. Workers are "dumb" and only perform builds locally, reporting progress and results back to the master over the connection we specified (worker connects to master at IP using default port 9989). Most extensibility of buildbot is in the master (and its <code>master.cfg</code> file). However, flexibility for your actual build purposes is in the workers as well, since you have the freedom to choose a different operating system, configuration and installed software for each worker. Since we work with FreeBSD jails in this tutorial, we are "restricted" to the jailhost&#8217;s FreeBSD kernel, but can freely choose any base system and extra packages for the worker as long as the OS release version is not newer than the host (as mentioned in the introduction).</p> </div> <div class="paragraph"> <p>As with the buildbot master and its rc script, you will probably want to run the worker permanently:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh"># Still inside jail shell
sysrc buildbot_worker_enable=YES
sysrc buildbot_worker_basedir=/var/buildbot-worker
sysrc buildbot_worker_uid=buildbot-worker
sysrc buildbot_worker_gid=buildbot-worker

service buildbot-worker start
# if it fails with "cannot run /usr/local/bin/twistd", apply this patch from
# to the file
# `/usr/local/etc/rc.d/buildbot-worker` and try again:
# sed -i '' 's|command="/usr/local/bin/twistd"|command="/usr/local/bin/twistd-3.6"|' /usr/local/etc/rc.d/buildbot-worker

# Check log file, should show a message "Connected to; worker is ready"
tail /var/buildbot-worker/twistd.log

# Back to jailhost shell
exit</code></pre> </div> </div> </div> </div> <div class="sect1"> <h2 id="_set_up_web_server_nginx_to_access_buildbot_ui">Set up web server nginx to access buildbot UI</h2> <div class="sectionbody"> <div class="paragraph"> <p>Master and worker have been set up, and if you watch log files, activity will be visible:</p> </div> <div class="listingblock"> <div class="content"> <pre># On jailhost
$ tail -F /usr/jails/*/var/buildbot*/twistd.log
[...]
2018-04-21 17:23:28+0000 [-] gitpoller: processing changes from "git://"</pre> </div> </div> <div class="paragraph"> <p>Here, "processing changes" means that if a change was detected since the previous build, a new build will be triggered. The change source is explicitly connected to trigger a build in the sample configuration&#8201;&#8212;&#8201;no builds are triggered <em>implicitly</em> only because there is a Git change source; the configuration does only and exactly what you code into it 💪.</p> </div> <div class="paragraph"> <p>There is of course no reason to look into log files to see which build is running. Buildbot features a web-based UI to give an overview, see results, force-trigger builds and more. In the sample master configuration, the <code>www</code> component is already set up to serve HTTP on port 8010. In a real environment, you would not serve unencrypted HTTP or open up the non-standard port 8010 to the outside (mind that listening on port 80 needs superuser privileges). Also, our server contains more than just the buildbot UI: depending on your actual use case for CI/CD, you may also want to serve the build logs and artifacts (such as built software). Hence, we serve the UI with nginx (any other server with HTTP and Web Sockets support would work just as well), and you can later decide for yourself which data you serve to outside users; as set up here, everyone can see everything and even trigger builds.
By the way, the buildbot UI by default does not perform user authorization. HTTPS is not covered in this tutorial&#8201;&#8212;&#8201;we will use plain HTTP for test purposes. Nevertheless, the nginx configuration presented below will also work once you enable SSL/TLS.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh"># On jailhost
sudo pkg install nginx

sudo tee /usr/local/etc/nginx/nginx.conf &lt;&lt;EOF
events {
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;

    server {
        listen 80;
        server_name localhost;

        location / {
            root /usr/local/www/nginx;
            index index.html index.htm;
        }

        location /buildbot/ {
            proxy_pass;
        }
        location /buildbot/sse/ {
            # proxy buffering prevents SSE from working
            proxy_buffering off;
            proxy_pass;
        }
        # required for websocket
        location /buildbot/ws {
            proxy_http_version 1.1;
            proxy_set_header Upgrade \$http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_pass;
            # raise the proxy timeout for the websocket
            proxy_read_timeout 6000s;
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /usr/local/www/nginx-dist;
        }
    }
}
EOF

sudo sysrc nginx_enable=YES
sudo service nginx start</code></pre> </div> </div> <div class="paragraph"> <p>(mind again that <code>$</code> is escaped in the shell but not in the output file)</p> </div> <div class="paragraph"> <p>Remember the line <code>config.vm.network "forwarded_port", guest: 80, host: 8999</code> in our Vagrantfile? Vagrant&#8217;s networking is a little different in that access to a VM&#8217;s TCP ports is not directly possible, but typically achieved by a port forward which Vagrant establishes for you.
You should therefore see a welcoming nginx example page at <a href="http://localhost:8999/" class="bare">http://localhost:8999/</a> (open in your computer&#8217;s browser).</p> </div> <div class="paragraph"> <p>Let us replace the page with an index of what&#8217;s on the server&#8201;&#8212;&#8201;the buildbot master is already active, while as mentioned, other items like serving build artifacts or logs might become important to you later (not in scope of this tutorial).</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh">sudo tee /usr/local/www/nginx/index.html &lt;&lt;EOF
&lt;html&gt;
&lt;body&gt;
&lt;a href="/buildbot/"&gt;buildbot&lt;/a&gt;

&lt;!-- Since there's only one thing here right now, let's redirect automatically until
     you figure out which artifacts you want to put here. --&gt;
&lt;script&gt;
    window.location.href = "/buildbot/";
&lt;/script&gt;
&lt;/body&gt;
&lt;/html&gt;
EOF</code></pre> </div> </div> </div> </div> <div class="sect1"> <h2 id="_run_your_first_build">Run your first build</h2> <div class="sectionbody"> <div class="paragraph"> <p>Reload the browser page. The buildbot UI should come up. There will be a warning about the configured <code>buildbotURL</code> because we use Vagrant&#8217;s port forwarding; in production, you should have direct access to <code></code> and configure the value accordingly.</p> </div> <div class="paragraph"> <p>Feel free to browse around the UI. You will find the example builder <code>runtests</code>, our single worker on host <code>worker0</code> and some other information already available. Since the example builder has a "force" scheduler configured, you can even trigger a first build now! Click "Builds &gt; Builders &gt; runtests &gt; force &gt; Start Build" and see how the build runs.
It will fail when trying to run <code>trial</code>, the example project&#8217;s test runner, because we have not installed this software on the worker (at the time of writing, it was not available as a separate FreeBSD package).</p> </div> <div class="paragraph"> <p><span class="image"><img src="/blog/2018-04-22-buildbot-setup-freebsd-jails/buildbot-www.png" alt="buildbot UI screenshot"></span></p> </div> <div class="paragraph"> <p>We are now ready to do something useful with our buildbot instance. Buildbot configuration and essentials are not covered here&#8201;&#8212;&#8201;please read the <a href="">official documentation</a> to get started. The configuration at <code>/usr/jails/master/var/buildbot-master/master.cfg</code> is right at your fingertips and ready for editing. Here is an edit-and-reload workflow that you may need as a "trial and error" strategy until you have learned the basics:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-sh" data-lang="sh"># Open a shell inside jail
sudo jexec master sh

# Make some changes and reload
vi /var/buildbot-master/master.cfg
service buildbot reload</code></pre> </div> </div> <div class="paragraph"> <p>The rc script&#8217;s <code>reload</code> command actually calls something like <code>buildbot reconfigure /var/buildbot-master</code> under the hood, telling our master process to reload the configuration.</p> </div> </div> </div> <div class="sect1"> <h2 id="_production_hints">Production hints</h2> <div class="sectionbody"> <div class="paragraph"> <p>We worked in a test virtual machine for this setup, but for a production-grade installation, you may want to adapt a few things:</p> </div> <div class="ulist"> <ul> <li> <p>Think about using ZFS as the filesystem so ezjail can take advantage of it (see the manpage&#8217;s <a href="">Using ZFS</a> section). 
Official Vagrant images of FreeBSD are set up using UFS, not ZFS.</p> </li> <li> <p>At my company, I have set up buildbot to run package builds using <a href="">poudriere</a>. Poudriere performs clean builds by creating empty jails ("empty" = only the FreeBSD base system installed, but no packages) and starting the build within them. For that to work within our buildbot worker jail, you need to allow it to create subjails, among other settings. At some point, especially if you are fond of human-readable names and paths, you may run into the current FreeBSD mount point name length limit of 88 characters, which will be <a href=";revision=318736">fixed in FreeBSD 12</a>. To work around that limitation <em>now</em>, you could set <code>ezjail_jaildir=/j</code> in ezjail.conf (<em>before</em> running <code>ezjail-admin install</code>) instead of using the longer path <code>/usr/jails</code>. Or you could choose shorter jail names like <code>w0</code> instead of <code>my-cool-project-worker0-freebsd-10.3</code>.</p> </li> <li> <p>Store the worker password in a separate file instead of hardcoding it in <code>master.cfg</code> (as done in the sample configuration). This allows you to share the configuration with software developers (e.g. commit it to a version-controlled repo) or even let them edit it&#8201;&#8212;&#8201;without exposing the password.</p> </li> <li> <p>You should replace the sample worker name and password with your own values, obviously.</p> </li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="_finished">Finished!</h2> <div class="sectionbody"> <div class="paragraph"> <p>This tutorial covered the basics of FreeBSD jails and buildbot, followed by the setup of a test virtual machine featuring a buildbot master and a single attached worker. With this in place, you can go on to implement your CI/CD plans with buildbot&#8217;s explicit and programmable configuration. 
Good luck!</p> </div> </div> </div> Sun, 24 Apr 2017 00:00:00 -0000 Ansible best practices <div id="preamble"> <div class="sectionbody"> <div class="paragraph"> <p>Ansible can be summarized as a tool for running automated tasks on servers that requires nothing but Python installed on the remote side. Typically used as a configuration management framework, Ansible comes with a set of key benefits:</p> </div> <div class="ulist"> <ul> <li> <p>Has simple configuration with YAML, avoiding copy-paste by applying customizable "roles"</p> </li> <li> <p>Uses inventories to scope and define the set of servers</p> </li> <li> <p>Fosters repeatable "playbook" runs, i.e. applying the same configuration to a server twice should be idempotent</p> </li> <li> <p>Doesn&#8217;t suffer from feature matrix issues because by design it is a framework, not a full-fledged solution for configuration management. You cannot say "it supports only web servers X and Y, but not Z", as Ansible in principle allows you to do <em>anything</em> that is possible through manual server configuration.</p> </li> </ul> </div> <div class="paragraph"> <p>For a full introduction to Ansible, read the <a href="">documentation</a> first. This article assumes you are already familiar with the concepts and have made some attempts at getting Ansible to work for a certain use case, but want some guidance on improving the way you work with it.</p> </div> <div class="paragraph"> <p>The company behind Ansible gives <a href="">some official guidelines</a> which mostly relate to file structure, naming and other common rules. 
While these are helpful, since they are not obvious to beginners, that small set of guidelines touches only a fraction of Ansible&#8217;s features and of the complexity of larger setups.</p> </div> <div class="paragraph"> <p>I would like to share what I have learned in a little over two years of working with Ansible, during which I have used it for a test environment at work (allowing developers to test systems as in production), for configuring my laptop, and eventually for setting up <em>this</em> server and web application as well as my home server (a Raspberry Pi).</p> </div> <h2 id="_table_of_contents" class="discrete dog-blog-breakpoint">Table of contents</h2> <div id="toc" class="toc"> <div id="toctitle" class="title"></div> <ul class="sectlevel1"> <li><a href="#_why_ansible_over_other_frameworks">Why Ansible over other frameworks?</a></li> <li><a href="#_choose_your_type_of_environment">Choose your type of environment</a> <ul class="sectlevel2"> <li><a href="#_testing">Testing</a></li> <li><a href="#_staging_production">Staging/production</a></li> <li><a href="#_both_non_production_and_production_with_one_ansible_setup">Both non-production and production with one Ansible setup</a></li> </ul> </li> <li><a href="#_careful_when_mixing_manual_and_automated_configuration">Careful when mixing manual and automated configuration</a></li> <li><a href="#_directory_structure">Directory structure</a></li> <li><a href="#_basic_setup">Basic setup</a></li> <li><a href="#inventory-safe-default">Ansible configuration</a></li> <li><a href="#_name_tasks">Name tasks</a></li> <li><a href="#avoid-skipping-items">Avoid skipping items</a></li> <li><a href="#_use_and_abuse_of_variables">Use and abuse of variables</a></li> <li><a href="#_tags">Tags</a></li> <li><a href="#_sudo_only_where_necessary">sudo only where necessary</a></li> <li><a href="#_assertions">Assertions</a></li> <li><a href="#_less_code_by_using_repetition_primitives">Less code by using repetition 
primitives</a></li> <li><a href="#_idempotency_done_right">Idempotency done right</a></li> <li><a href="#dynamic-inventory">Leverage dynamic inventory</a></li> <li><a href="#_modern_ansible_features">Modern Ansible features</a></li> <li><a href="#storing-sensitive-files">Off-topic: storing sensitive files</a></li> <li><a href="#_conclusion">Conclusion</a></li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="_why_ansible_over_other_frameworks">Why Ansible over other frameworks?</h2> <div class="sectionbody"> <div class="ulist"> <ul> <li> <p>Honestly, I did not compare many alternatives because the Ansible environment at work already existed when I joined, and I soon came to believe Ansible was the best option. The usual suspects Chef and Puppet did not really please me because the recipes do not really look like "infrastructure as code", but are too declarative and hard to understand in detail without looking at many files&#8201;&#8212;&#8201;while in a typical Ansible playbook, the actions taken can be read top-down like code.</p> </li> <li> <p>Many years ago, I built my own solution to deploy my personal web applications (<a href="">"Site Deploy"</a>; UI-based). As a hobby project, it never became popular or sophisticated enough, and eventually I learned that it suffers from the aforementioned feature matrix problem. Essentially it only supported the features relevant to me 🙄, without providing a framework to support anything on any server. Nevertheless, <em>Site Deploy</em> already had support for configuring hosts with their connection data and services, with the help of variable substitution in most places. Or in other words: the very basic concepts of Ansible.</p> </li> <li> <p>The size of the user base says a lot (cf. 
<a href="">their 2016 recap</a>)</p> </li> <li> <p>Ansible aims at simple design, and becomes powerful by all the open-source modules to support services, applications, hardware, network, connections, etc.</p> </li> <li> <p>No server-side, persistent component required. Only Python needed to execute modules. Usual connection type is SSH, but custom modules are available for other types.</p> </li> <li> <p>Flat learning curve: once you understand the basic concepts (define hosts in inventory, set variables on different levels, write tasks in playbooks) and you know the commands/steps to configure a host manually, it&#8217;s easy to get started writing the same steps down in Ansible&#8217;s YAML format.</p> </li> <li> <p>Put simply, Ansible combines a set of hosts (inventory) with a list of applicable tasks (playbooks &amp; roles), customizable with variables (at different places), allowing you to use pre-defined or own task modules and plugins (connection, value lookup, etc.). If you rolled your own, generic configuration management, you probably could not implement its principles much simpler. Since the concepts are so clearly separated, the source code (Python) is easy enough to read, if ever needed. Usually you will only have 2 situations to look into Ansible source code: learning how modules should be implemented and finding out about changed behavior when upgrading Ansible. The latter is not common and only occurred to me when switching from Ansible 1.8/1.9.x to 2.2.x which was quite a big step both in features, deprecations and also Ansible source code architecture itself.</p> </li> <li> <p>Change detection and idempotency. Whenever a task is run, there may be distinct outcomes: successfully changed, failed, skipped, unchanged. After running a playbook, you will have an overview of which tasks actually made changes on the target hosts. 
Usually, one would design playbooks in a way that running them a second time only gives "unchanged" outcomes, and Ansible&#8217;s modules support this idea of idempotency&#8201;&#8212;&#8201;for example, a <code>command</code> task can be marked as "already done that before, no changes required" by specifying <code>creates: /file/created/by/command</code> → once the file was successfully created, a repeated execution of the task module will not run the command again.</p> </li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="_choose_your_type_of_environment">Choose your type of environment</h2> <div class="sectionbody"> <div class="paragraph"> <p>Before we jump into practice, we must first consider what kind of Ansible-based setup we want to achieve, which greatly depends on the environment: work/personal, production/staging/testing, a mixture of those&#8230;&#8203;</p> </div> <div class="sect2"> <h3 id="_testing">Testing</h3> <div class="paragraph"> <p>A test environment can take many forms: for instance, at my company we manage a separate Git repo for the test environment, unrelated to any production configuration and therefore very quick for developers to modify without lengthy code reviews or approval by devops, as no production system can be affected. Ansible is used to fully configure the system and our software within a virtual machine.</p> </div> <div class="paragraph"> <p>To spin up a VM, many solutions exist already&#8201;&#8212;&#8201;for instance <a href="">Vagrant</a> with a small provisioning script that installs everything required for Ansible (only Python 😉) in the VM. We use a small Fabric script to bootstrap a FreeBSD VM and networking before continuing with Ansible.</p> </div> </div> <div class="sect2"> <h3 id="_staging_production">Staging/production</h3> <div class="paragraph"> <p>You should keep separate inventories for staging and production. 
If you don&#8217;t have staging, you should probably aim at automating staging setup with Ansible, since you already develop the production configuration in playbooks. But if you have both, the below recommendations apply.</p> </div> </div> <div class="sect2"> <h3 id="_both_non_production_and_production_with_one_ansible_setup">Both non-production and production with one Ansible setup</h3> <div class="ulist"> <ul> <li> <p>When deploying both non-production and production environments from the same roles/playbooks, you must take care they don&#8217;t interfere with each other. For instance, you don&#8217;t want to send real e-mails to customers from staging, use different domain names, etc. The main way to decide on applying non-production vs. production properties should be your use of inventories and variables. An example will be discussed below (<a href="#dynamic-inventory">dynamic inventory</a>).</p> </li> <li> <p>Careful&#8201;&#8212;&#8201;developers should not have live credentials such as SSH access to a production server, but probably be able to manage testing/staging systems?!</p> </li> <li> <p>GPG encryption of sensitive files or other protection to disallow unprivileged people from accessing production machines at all (mentioned in section <a href="#storing-sensitive-files">Storing sensitive files</a>)</p> </li> <li> <p>A safe default choice for inventories is required, and the default should most probably <em>not</em> be production. 
This is described below in the section <a href="#inventory-safe-default">Ansible configuration</a>.</p> </li> </ul> </div> </div> </div> </div> <div class="sect1"> <h2 id="_careful_when_mixing_manual_and_automated_configuration">Careful when mixing manual and automated configuration</h2> <div class="sectionbody"> <div class="paragraph"> <p>If you already have a production system manually set up&#8201;&#8212;&#8201;which is almost always the case, at least for initial OS installation steps which cannot be done via Ansible on physical servers&#8201;&#8212;&#8201;making the switch to fully automated configuration via Ansible is not easy. You may want to introduce automation step-by-step.</p> </div> <div class="paragraph"> <p>There are many imaginable ways to achieve that migration. I want to propose what I would do, admittedly without any real-world experience because I do not manage any production systems as developer.</p> </div> <div class="ulist"> <ul> <li> <p>Develop playbooks and maintain <a href="">check mode and the <code>--diff</code> option</a>. This is not always easy and sometimes unnerving because you have to think both in normal mode (read-write) and check mode (read-only) when writing tasks, and apply appropriate options for modules that can&#8217;t handle it themselves (like <code>command</code>):</p> <div class="ulist"> <ul> <li> <p><code>check_mode: no</code> (previously called <code>always_run: yes</code>)</p> </li> <li> <p><code>changed_when</code></p> </li> <li> <p>If you use tags: apply <code>tags: [ always ]</code> to tasks that e.g. provide results for subsequent tasks</p> </li> </ul> </div> </li> <li> <p>Take care when making manual changes to servers. While often okay and necessary to react quickly, ensure the responsible people (e.g. 
devops team) can reproduce the setup with playbooks sooner rather than later.</p> </li> <li> <p>Use <a href=""><code>{{ ansible_managed }}</code></a> to mark auto-generated files as such, so nobody unknowingly edits them manually</p> </li> <li> <p>Automate as much setup as you can, but only the parts that you are able to implement via Ansible without risk. For example, if you fear that an automatic database setup could go horribly wrong (like overwriting the existing production database), then trust your distrust and do those steps manually.</p> </li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="_directory_structure">Directory structure</h2> <div class="sectionbody"> <div class="paragraph"> <p>Some <a href="">common directory layouts</a> are already part of the official documentation. In addition, you may want to separate your playbooks into subdirectories of <code>playbooks/</code> once your content grows too large. Best practices cannot really prescribe this because the size and purpose of each project vary, so I leave it to you to decide when the time comes to "clean up". Note that if you use several playbook (sub-)directories and files relative to them (such as a custom <code>library</code> folder), you may have to symlink those files into each directory containing playbooks.</p> </div> </div> </div> <div class="sect1"> <h2 id="_basic_setup">Basic setup</h2> <div class="sectionbody"> <div class="ulist"> <ul> <li> <p>It should be clear that an Ansible setup consists of text files and should therefore be versioned in a VCS like Git. Make sure you ignore files that should not be committed (for example in .gitignore: <code>*.retry</code>).</p> </li> <li> <p>Add something like <code>alias apl=ansible-playbook</code> in your shell. Or do you want to type <code>ansible-playbook</code> all the time?</p> </li> <li> <p>Require users to use at least a certain Ansible version, e.g. the latest version available in OS package managers at the time you start your endeavors. 
You could have a little role <code>check-preconditions</code> doing this:</p> </li> </ul> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-yaml" data-lang="yaml"># Check and require certain Ansible version. You should document why that # version is required, for instance: # # We require Ansible 2.2.1 or newer, see changelog # ( # &gt; Fixes a bug where undefined variables in with_* loops would cause a task # &gt; failure even if the when condition would cause the task to be skipped. - name: Check Ansible version assert: that: '(ansible_version.major, ansible_version.minor, ansible_version.revision) &gt;= (2, 2, 1)' msg: 'Please install the recommended version 2.2.1+. You have Ansible {{ ansible_version.string }}.' run_once: yes</code></pre> </div> </div> </div> </div> <div class="sect1"> <h2 id="inventory-safe-default">Ansible configuration</h2> <div class="sectionbody"> <div class="paragraph"> <p><a href=""><code>ansible.cfg</code></a> allows you to tweak many settings to be a little saner than the defaults.</p> </div> <div class="paragraph"> <p>I recommend the following:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-ini" data-lang="ini">[defaults] # Default to no fact gathering because it's slow and "explicit is better # than implicit". Depending how you use variables, you may rather explicitly # define variables instead of relying on facts. You can enable this on # a per-playbook basis with `gather_facts: yes`. gathering = explicit # You should default either 1) to a non-risky inventory (not production) # or 2) point to a nonexistent one so that the person explicitly needs to # specify which one to use. I find the alternative 1) the least risky, # because 2) may lead to people creating shortcuts to deploy to live machines # which defeats the purpose of having a safer default here. 
inventory = inventories/test # Cows are scared of playbook developers nocows = 1 # Point to your local collection of extras, e.g. roles roles_path = ./roles [ssh_connection] # Enable SSH multiplexing to increase performance pipelining = True control_path = /tmp/ansible-ssh-%%h-%%p-%%r</code></pre> </div> </div> <div class="paragraph"> <p>Choosing a safe default for the inventory is obviously important, thinking about recent catastrophic events like the <a href="">Amazon S3 outage</a> that originated from a typo. Inventory names should not be confusable with each other, e.g. avoid using a prefix (<code>inv_live</code>, <code>inv_test</code>) because people hastily using tab completion may quickly introduce a typo.</p> </div> <div class="paragraph"> <p>If you are annoyed by <code>*.retry</code> files being created next to playbooks which hinders filename tab completion, an environment variable <code>ANSIBLE_RETRY_FILES_SAVE_PATH</code> lets you put them in a different place. For myself, I never use them as I&#8217;m not working with hundreds of hosts matching per playbook, so I just disable them with <code>ANSIBLE_RETRY_FILES_ENABLED=no</code>. Since that is a per-person decision, it should be an environment variable and not go into <code>ansible.cfg</code>.</p> </div> </div> </div> <div class="sect1"> <h2 id="_name_tasks">Name tasks</h2> <div class="sectionbody"> <div class="paragraph"> <p>While already outlined in the <a href="">mentioned best practices article</a>, I&#8217;d like to stress this point: names, comments and readability enable you and others to understand playbooks and roles later on. Ansible output on its own is too concise to really tell you the exact spot which is currently executing, and sometimes in large setups you will be searching that spot where you canceled (Ctrl+C) or a task failed fatally. Naming even the single tasks comes in handy here. Or tooling like <a href="">ARA</a> which I personally did not try yet (overkill for me). 
After all we&#8217;re doing programming, and no reasonable language would allow you to make public functions unnamed/anonymous.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-yaml" data-lang="yaml">- name: 'Create directories for service {{ daemontools_service_name }}' file: state: directory dest: '{{ item }}' owner: '{{ daemontools_service_user }}' with_items: '{{ daemontools_service_directories }}'</code></pre> </div> </div> <div class="paragraph"> <p>In recent versions of Ansible, variables in the task <code>name</code> will be correctly substituted by their value in the console output, giving you visual feedback which part of the play is executing. That will be especially important once your configuration management project is growing and you run large collections of playbooks that execute a certain role (this example: <code>daemontools_service</code>) multiple times, for example to create a couple of permanent services.</p> </div> <div class="paragraph"> <p>Another advantage of this technique is that you can start where a play canceled/failed previously using the <code>--start-at-task="Task name"</code> option. That might not always work, e.g. if a task depends on a previously <code>register:</code>-ed variable, but is often helpful to save time by skipping all previously succeeded tasks. If you use static task names like "Install packages", then <code>--start-at-task="Install packages"</code> will start at the first occurrence of that task name in the play instead of a specific one ("Install dependencies for service XYZ").</p> </div> </div> </div> <div class="sect1"> <h2 id="avoid-skipping-items">Avoid skipping items</h2> <div class="sectionbody"> <div class="paragraph"> <p>&#8230;&#8203;because it might hurt idempotency. What if your Ansible playbook adds a cronjob based on a boolean variable, and later you change the value to false? 
Using <code>when: my_bool</code> (value now changed to <code>no</code>) will skip the task, leaving the cronjob intact even though you expected it to be removed or disabled.</p> </div> <div class="paragraph"> <p>Here&#8217;s a slightly more complicated example: I had to set up a service that should be disabled by default until the developer enables it (because it would log error messages all the time unless the developer had established a required, manual SSH tunnel). Considerations:</p> </div> <div class="ulist"> <ul> <li> <p>When configuring that service (let&#8217;s call the role <code>daemontools_service</code>; <a href="">daemontools</a> are great to set up and manage services on *nix), we cannot simply enable/disable the service conditionally: the service should only be disabled initially (first playbook run = service created for the first time on remote machine) and on boot, but its state should be untouched if the developer had already enabled the service manually. Or in other words (since that fact is not easy to find out), leave state untouched if the service was already configured by a previous playbook run (= idempotency).</p> </li> <li> <p>You might also want an option to toggle enabling/disabling the service by default, so I&#8217;ll show that as well</p> </li> </ul> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-yaml" data-lang="yaml">- hosts: xyz vars: xyz_service_name: xyz-daemon # Knob to enable/disable service by default (on reboot, and after # initial configuration) xyz_always_enabled: yes roles: - role: daemontools_service daemontools_service_name: '{{ xyz_service_name }}' # Contrived variable, leaving state untouched should be the default # behavior unless you want to risk in production that services are # unintentionally enabled or disabled by a playbook run. daemontools_service_enabled: 'do_not_change_state' daemontools_service_other_variables: ... 
tasks: - name: Disable XYZ service on boot cron: # We know that the role will symlink into /var/service, # as usual for daemontools job: "svc -d /var/service/{{ xyz_service_name }}" name: "xyz_default_disabled" special_time: "reboot" disabled: "{{ xyz_always_enabled }}" # ...or... # state: "{{ 'absent' if xyz_always_enabled else 'present' }}" tags: [ cron ] - name: Disable XYZ service initially # After *all* initial configuration steps succeeded, take the service # down (`svc -d`) and mark the service as created so we... shell: "svc -d /var/service/{{ xyz_service_name }} &amp;&amp; touch /var/service/{{ xyz_service_name }}/.created" args: # ...don't disable the service again if playbook is run again # (as someone may have enabled the service manually in the meantime). creates: "/var/service/{{ xyz_service_name }}/.created" when: not xyz_always_enabled tags: [ cron ]</code></pre> </div> </div> </div> </div> <div class="sect1"> <h2 id="_use_and_abuse_of_variables">Use and abuse of variables</h2> <div class="sectionbody"> <div class="paragraph"> <p>The most important principle for variables is that you should know which variables are used when looking at a portion of "Ansible code" (YAML). As an Ansible beginner, you might have 1) wondered a few times, or looked up, in which <a href="">order of precedence</a> variables are taken into account. Or 2) you might have just given up and asked the author what is happening there. Like in software development, both 1) and 2) are fatal mistakes that hamper productivity&#8201;&#8212;&#8201;code must be readable (hopefully top-down or by looking within the surrounding 100 lines) and understandable by colleagues and other contributors. The case that you even <em>had</em> to check the precedence shows the problem in the first place! 
<strong>Variables should be specified at exactly one place</strong> (or two places if a variable has a reasonable, overridable default value), <strong>as close as possible to their usage</strong> while still being at the relevant location and <strong>most variables should be ultimately mandatory</strong> so that Ansible loudly complains if a variable is missing. Let us look at a few examples to see what these basic rules mean.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-ini" data-lang="ini">[exampleservers] [all:vars] # Global helper variables. # # I tend to use these specific ones because when inside a role, Ansible 1.9.x # did not correctly find files/templates in some cases (if called from playbook # or dependency of other role). Not sure if that is still required for 2.x, # so don't copy-paste without understanding the need! These are really # just examples. my_playbooks_dir={{ inventory_dir + "/../playbooks" }} my_roles_dir={{ inventory_dir + "/../roles" }} # With dynamic inventories, you can structure your per-host and per-group # variables in a nicer way than this INI file top-down format. If you use # INI files, at least try to create some structure, like alphabetical sorting # for hosts and groups. [exampleservers:vars] # Here, put only variables that belong to matching servers in general, # not to a functional component ansible_ssh_user=dog</code></pre> </div> </div> <div class="paragraph"> <p>Let&#8217;s look at an example role "mysql" which installs a MySQL server, optionally creates a database and then optionally gives privileges to the database (also allows value <code>*</code> for all databases) to a user:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-yaml" data-lang="yaml"># ...contrived excerpt... 
- name: Ensure database {{ database_name }} exists mysql_db: name: 'ourprefix_{{ database_name }}' when: database_name is defined and database_name != "*" - name: Ensure database user {{ database_user }} exists and has access to {{ database_name }} mysql_user: name: '{{ database_user }}' password: '{{ database_password }}' priv: '{{ database_name }}.*:ALL' host: '%' when: database_user is defined and database_user # ...</code></pre> </div> </div> <div class="paragraph"> <p>The good parts first:</p> </div> <div class="ulist"> <ul> <li> <p>Once <code>database_user</code> is given, the required variable <code>database_password</code> is mandatory, i.e. not checked with another <code>database_password is defined</code>.</p> </li> <li> <p>Variables used in task names, so that Ansible output clearly tells you what <em>exactly</em> is currently happening</p> </li> </ul> </div> <div class="paragraph"> <p>But many things should be fixed here:</p> </div> <div class="ulist"> <ul> <li> <p>Role (I called this example role "mysql") is doing way too many things at once without having a proper name. It should be split up into several roles: MySQL server installation, database creation, user &amp; privilege setup. If you really find yourself doing these three things together repeatedly, you can still create an uber-role "mysql" that depends on the others.</p> </li> <li> <p>Role variables should be prefixed with the role name (e.g. <code>mysql_database_name</code>) because Ansible has no concept of namespaces or scoping these variables only to the role. This helps finding out quickly where a variable comes from. In contrast, host groups in Ansible are a way to scope variables so they are only available to a certain set of hosts.</p> </li> <li> <p>The database name prefix <code>ourprefix_</code> seems to be a hardcoded string. First of all, this led to a bug&#8201;&#8212;&#8201;privileges are not correctly applied to the user in the second task because the prefix was forgotten. 
The hardcoded string could be an internal variable (mark those with an underscore!) defined in the defaults file <code>roles/mysql/defaults/main.yml</code>: <code>_database_name_prefix: 'ourprefix_' # comment describing why it&#8217;s hardcoded</code>, and must be used wherever applicable. Whenever the value needs changing, you only need to touch one location.</p> </li> <li> <p>The special value <code>database_name: '*'</code> must be considered. Because the role has more than one responsibility (remember software engineering best practices?!), the variables have too many meanings. As said, there had better be a role "mysql_user" that only handles user creation and privileges&#8201;&#8212;&#8201;inside such a scoped role, using <em>one</em> special value turns out to be less bug-prone.</p> </li> <li> <p><code>database_user is defined and database_user</code> is again only necessary because the role is doing too much. In general, you should almost never use such a conditional. For no real reason, an empty value is principally allowed, and the task skipped in that case, and also if the variable is not specified. Once you decide to rename the variable and forget to replace one occurrence, you suddenly always skip the task. Whenever you can, let Ansible complain loudly when a variable is undefined, instead of e.g. skipping a task conditionally. In this example, splitting up the role is the solution to immediately make the variables mandatory. In other cases, you could introduce a default value for a role variable and allow users to override that value.</p> </li> </ul> </div> <div class="paragraph"> <p>Other practices regarding variables and their values and inline templates:</p> </div> <div class="ulist"> <ul> <li> <p>Consistently name your variables. Just like code, Ansible plays should be grep-able. 
A simple text search through your Ansible setup repo should immediately find the source of a variable and other places where it is used.</p> </li> <li> <p>Avoid indirections like includes or <code>vars_files</code> if possible to keep relevant variables close to their use. In some cases, these helpers can shorten repeated code, but usually they just add one more level of having to jump around between files to grasp where a value comes from.</p> </li> <li> <p>Don&#8217;t use the special one-line dictionary syntax <code>mysql_db: name="{{ database_name }}" state="present" encoding="utf8mb4"</code>. YAML is very readable per se, so why use Ansible&#8217;s crippled syntax instead? It&#8217;s okay to use for single-variable tasks, though.</p> </li> <li> <p>On the same note, remove defaults which are obvious, such as the usual <code>state: present</code>. The "official" blog post on best practices recommends otherwise, but I like to keep code short and boilerplate-less.</p> </li> <li> <p>Decide on one quoting style and use it consistently: double quotes (<code>dest: "/etc/some.conf"</code>) or single quotes (<code>dest: '/etc/some.conf'</code>), and also decide whether you quote things that don&#8217;t need it (<code>dest: /etc/some.conf</code>). Keep in mind that <code>dest: {{ var }}</code> is not possible (must be quoted), and that <code>mode: 0755</code> (chmod) will give an unexpected result (no octal number support), so recommended practice is of course <code>mode: '0755'</code>.</p> </li> <li> <p>Also decide on one style for spacing and writing Jinja templates. I prefer <code>dest: '{{ var|int + 5 }}'</code> over <code>dest: '{{var | int + 5}}'</code>, but staying consistent is key, not the style you choose.</p> </li> <li> <p>You don&#8217;t need <code>---</code> at the top of YAML files. 
Just leave it out unless you know what it means.</p> </li> </ul> </div> <div class="paragraph"> <p>More rules can be shown best in a playbook example:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-yaml" data-lang="yaml">- hosts: web-analytics-database
  vars:
    # Under `vars`, only put variables that really must be available in several
    # roles and tasks below. They have high precedence and therefore are prone
    # to clash with other variables of the same name (if you didn't follow
    # the principle of only one definition), or may set a value in one of the
    # below roles that you didn't want to be set! Therefore the role name
    # prefix is so important (`mysql_user_name` instead of `username` because
    # the latter might also be used in many other places and is hard to grep
    # for if used all over the place).

    # When writing many playbooks, you probably don't want to hardcode your
    # DBA's username everywhere, but define a variable `database_admin_username`.
    # The rule of putting it as close as possible to its use tells you to
    # create a group "database-servers" containing all database hosts and put
    # the variable into `group_vars/database-servers.yml` so it's only available
    # in the limited scope.
    # Using variable name prefix `wa_` for "web analytics" as example.
    wa_mysql_user_name_prefix: '{{ database_admin_username }}'

  roles:
    - role: mysql_server
      # [Comment describing why we chose MySQL 5.5...]
      # Alternatively (but more risky than requiring it to be defined explicitly),
      # this might have a default value in the role, stating the version you
      # normally use in production.
      mysql_server_version: '5.5'

    # Admin with full privileges
    - role: mysql_user
      mysql_user_name: '{{ wa_mysql_user_name_prefix }}_admin'
      # This should not have a default. Defaulting to `ALL` means that on a
      # playbook mistake, a new user may get all privileges!
      mysql_user_privileges: 'ALL'
      # Production passwords should not be committed to version control
      # in plaintext. See article section "Storing sensitive files".
      mysql_user_password: '{{ lookup("gpgfile", "secure/web-analytics-database.password") }}'

    # Read-only access
    - role: mysql_user
      mysql_user_name: '{{ wa_mysql_user_name_prefix }}_readonly'
      mysql_user_privileges: 'SELECT'
      mysql_user_password: '{{ lookup("gpgfile", "secure/web-analytics-database.readonly.password") }}'

  tasks:
    # With well-developed roles, you don't need extra {pre_}tasks!</code></pre>
</div> </div> </div> </div> <div class="sect1"> <h2 id="_tags">Tags</h2> <div class="sectionbody"> <div class="paragraph"> <p>Use tags only for limiting to tasks for speed reasons, as in "only update config files". They should not be used to select a "function" of a playbook or perform regular tasks, or else one fine day you may forget to specify <code>-t only-do-xyz</code> and it will take down Amazon S3 or so 😜. It&#8217;s a debug and speed tool and not otherwise necessary. Better make your playbooks smaller and more task-focused if you use playbooks for repeated (maintenance) tasks.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-yaml" data-lang="yaml">- hosts: webservers
  pre_tasks:
    - name: Include some vars (not generally recommended, see rules for variables)
      include_vars:
        file: myvars.yml
      # This must be tagged `always` because otherwise the variables are not available below
      tags: [ always ]

  roles:
    - role: mysql
      # ...
    - role: mysql_user
      # ...

  tasks:
    - name: Insert test data into SQL database
      # Mark with a separate tag that allows you to quickly apply new test
      # data to the existing MySQL database without having to wait for the
      # `mysql*` roles to finish (which would probably finish without changes).
      tags: [ test-sql ]
      # ...the task...

    - name: Get system info
      # Contrived example command - in reality you should use `ansible_*` facts!
      command: 'uname -a'
      register: _uname_call
      # This needs tag `always` because the below task requires the result
      # `_uname_call`, and also has tags.
      tags: [ always ]
      check_mode: no
      # Just assume this task to be "unchanged"; instead tasks that depend
      # on the result will detect changes.
      changed_when: no

    - name: Write system info
      copy:
        content: 'System: {{ _uname_call.stdout }}'
        dest: '/the/destination/path'
      tags: [ info ]</code></pre>
</div> </div> </div> </div> <div class="sect1"> <h2 id="_sudo_only_where_necessary">sudo only where necessary</h2> <div class="sectionbody"> <div class="quoteblock"> <blockquote> <div class="paragraph"> <p>The command failed, so I used <code>sudo command</code> and it worked fine. I&#8217;m now doing that everywhere because it&#8217;s easier.</p> </div> </blockquote> </div> <div class="paragraph"> <p>It should be obvious to devops people, and hopefully also software developers, how very wrong this is. Just like you would not do that for manual commands, you also should not use <code>become: yes</code> globally for a whole playbook. Better only use it for tasks that actually need root rights. The <code>become</code> flag can be assigned to task blocks, avoiding repetition.</p> </div> <div class="paragraph"> <p>Another downside of "sudo everywhere" is that you have to take care of owner/group membership of directories and files you create, instead of defaulting to creating files owned by the connecting user.</p> </div> </div> </div> <div class="sect1"> <h2 id="_assertions">Assertions</h2> <div class="sectionbody"> <div class="paragraph"> <p>If you ever had to debug a case where a YAML dictionary was missing a key, you will know how bad Ansible is at telling you where an error came from (it does not even tell you the dictionary variable name). I have found my own way to deal with that: assert a condition before actually running into the default error message. Only a very simple plugin is required. 
I opened a <a href="">pull request</a> already but the maintainers did not like the approach. Still I will recommend it here because of practical experience.</p> </div> <div class="paragraph"> <p>In <code>ansible.cfg</code>, ensure you have:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-ini" data-lang="ini">filter_plugins = ./plugins/filter</code></pre> </div> </div> <div class="paragraph"> <p>Then add the plugin <code>plugins/filter/</code>:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-python" data-lang="python">from ansible import errors


def _assert(value, msg=''):
    # You can leave this condition out if you think it's too strict.
    # It's supposed to help find typos and type mistakes in assertion conditions.
    if not isinstance(value, bool):
        raise errors.AnsibleFilterError('assert filter requires boolean as input, got %s' % type(value))

    if not value:
        raise errors.AnsibleFilterError('assertion failed: %s' % (msg or '&lt;no message given&gt;',))

    return ''


class FilterModule(object):
    filter_map = {
        'assert': _assert,
    }

    def filters(self):
        return self.filter_map</code></pre>
</div> </div> <div class="paragraph"> <p>And use it like so:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-yaml" data-lang="yaml">- name: My task
  command: 'somecommand {{ (somevar|int &gt; 5)|assert("somevar must be number &gt; 5") }}{{ somevar }}'</code></pre>
</div> </div> <div class="paragraph"> <p>This will only be able to test Jinja expressions, which are mostly but not 100% Python, but that should be enough.</p> </div> </div> </div> <div class="sect1"> <h2 id="_less_code_by_using_repetition_primitives">Less code by using repetition primitives</h2> <div class="sectionbody"> <div class="paragraph"> <p>Ever written something like this?</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code 
class="language-yaml" data-lang="yaml">- name: Do something with A
  command: dosomething A
  args:
    creates: /etc/somethingA
  when: '{{ is_admin_user["A"] }}'

- name: Do something with B
  command: dosomething --a-little-different B
  args:
    creates: /etc/somethingB
  when: '{{ is_admin_user["B"] }}'</code></pre>
</div> </div> <div class="paragraph"> <p>A little exaggerated, but chances are that you suffered from copy-pasting too much Ansible code a few times in your configuration management career, and had the usual share of copy-paste mistakes and typos. Use <a href=""><code>with_items</code> and friends</a> to your advantage:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-yaml" data-lang="yaml">- name: Do something with {{ item.name }}
  # At a task-level scope, it's totally okay to use non-mandatory variables
  # because you have to read only these few lines to understand what it's
  # doing. Use quoting if you want to support e.g. whitespace in values - just
  # saying, of course it's unusual on *nix...
  command: 'dosomething {{ item.args|default("") }} "{{ item.name }}"'
  args:
    creates: '/etc/something{{ item.name }}'
  # This is again following the rule of mandatory variables: making dictionary
  # keys mandatory protects you from typos and, in this case, from forgetting
  # to add people to a list. Get a good error message instead of just
  # `KeyError: B` by using the aforementioned assert module.
  when: '{{ (item.name in is_admin_user)|assert("User " + item.name + " missing in is_admin_user") }}{{ is_admin_user[item.name] }}'
  with_items:
    - name: A
    - name: B
      args: '--a-little-different'</code></pre>
</div> </div> <div class="paragraph"> <p>More readable (once it gets bigger than my contrived example), and still does the same thing without being prone to copy-paste mistakes and complexity.</p> </div> </div> </div> <div class="sect1"> <h2 id="_idempotency_done_right">Idempotency done right</h2> <div class="sectionbody"> <div class="paragraph"> <p>This term was already mentioned a few times above. I want to give more hints on how to achieve repeatable playbook runs. "Idempotent" effectively means that on the second run, everything is green and no actual changes happened, which Ansible calls "ok" but in a well-developed setup means "unchanged" or "read-only action was performed".</p> </div> <div class="paragraph"> <p>The advantages should be pretty clear: not only can you see the exact <code>--diff</code> of what would happen on remote servers but also it gives visual feedback of what has <em>really</em> changed (even if you don&#8217;t use diff mode).</p> </div> <div class="paragraph"> <p>Only a few considerations are necessary when writing tasks and playbooks, and you can get perfect idempotency in most cases:</p> </div> <div class="ulist"> <ul> <li> <p>Avoid skipping items in certain cases (explained <a href="#avoid-skipping-items">above</a>)</p> </li> <li> <p>Often you need a <code>command</code> or <code>shell</code> task to perform very specific work. These tasks are always considered "changed" unless you define e.g. the <code>creates</code> argument or use <code>changed_when</code>.<br> Example: <code>changed_when: _previously_registered_process_result.stdout == ''</code><br> On the same note, you may want to use <code>failed_when</code> in special cases, like if a program exits with code 0 even on errors.</p> </li> <li> <p>Always use the same inputs. 
For example, don&#8217;t write a new timestamp into a file at every task run, but detect that the file is already up-to-date and does not need to be changed.</p> </li> <li> <p>Use built-in modules like <code>lineinfile</code>, <code>file</code>, <code>synchronize</code>, <code>copy</code> and <code>template</code> which support the relevant arguments to get idempotency if used right. They also typically fully support check mode and other features that are hard to achieve yourself. Avoid <code>command</code>/<code>shell</code> if built-ins can be used instead.</p> </li> <li> <p>The argument <code>force: no</code> can be used for some modules to ensure that a task is only run once. For instance, if you want a configuration template copied once if it does not exist yet, but afterwards manage it manually or with other tools, use <code>copy</code> with <code>force: no</code>: the file is only uploaded if it does not exist yet, and repeated runs don&#8217;t make any changes to the existing remote file. This is not exactly related to idempotency but sometimes a valid use case.</p> </li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="dynamic-inventory">Leverage dynamic inventory</h2> <div class="sectionbody"> <div class="paragraph"> <p>Who needs to fiddle around carefully in check mode every time you change a production system, if there&#8217;s a staging environment which can bear a downtime if something goes wrong? Dynamic inventories can help separate staging and production in the most readable and&#8201;&#8212;&#8201;you guessed it&#8201;&#8212;&#8201;dynamic way.</p> </div> <div class="paragraph"> <p>Separate environments like test, staging or production of course have different properties like</p> </div> <div class="ulist"> <ul> <li> <p>IP addresses and networks</p> </li> <li> <p>Host and domain names (FQDN)</p> </li> <li> <p>Set of hosts. 
Production software may be distributed to multiple servers, while your staging may simply be installed on one server or virtual machine.</p> </li> <li> <p>Other values</p> </li> </ul> </div> <div class="paragraph"> <p>Ideally, all of these should be specified in variables, so that you can use different values for each environment in the respective inventory, but with consistent variable names. In your roles and playbooks, you can then mostly ignore the fact that you have different environments&#8201;&#8212;&#8201;except for tasks that e.g. should not or only run in production, but that should also be decided by a variable (&#8594; <code>when: not is_production</code>).</p> </div> <div class="paragraph"> <p>Check the official introduction to <a href="">Dynamic Inventories</a> and <a href="">Developing Dynamic Inventory Sources</a> to understand my example inventory script. It forces the domain suffix <code>.test</code> for the "test" environment, and no suffix for the "live" environment.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-python" data-lang="python">#!/usr/bin/env python
from __future__ import print_function
import argparse
import json
import os
import sys

SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))

# One way to go "dynamic": decide inventory type (test, staging, production)
# based on inventory directory. Remember that Ansible calls the first file
# found if you specify a directory as inventory. Symlinking the same script
# into different directories allows you to use one inventory script
# for several environments.
IS_LIVE = {'live': True, 'test': False}[os.path.basename(SCRIPT_DIR)]
DOMAIN_SUFFIX = '' if IS_LIVE else '.test'

host_to_vars = {
    'first': {
        'public_ip': '',
        'public_hostname': '',
    },
    'second': {
        'public_ip': '',
        'public_hostname': '',
    },
}

groups = {
    'webservers': ['first', 'second'],
}

# Avoid human mistakes by applying test settings everywhere at once (instead
# of inline per-variable)
for host, variables in host_to_vars.items():
    if 'public_hostname' in variables:
        # Just an example. Realistically you may want to change `public_ip`
        # as well, plus other variables that differ between test and production.
        variables['public_hostname'] += DOMAIN_SUFFIX

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--debug', action='store_true', default=False)
    parser.add_argument('--host')
    parser.add_argument('--list', action='store_true', default=False)
    args = parser.parse_args()

    def printJson(v):
        print(json.dumps(v, sort_keys=True, indent=4 if args.debug else None, separators=(',', ': ' if args.debug else ':')))

    if args.host is not None:
        printJson(host_to_vars.get(args.host, {}))
    elif args.list:
        # Allow Ansible to only make one call to this script instead
        # of one per host.
        # See
        groups['_meta'] = {
            'hostvars': host_to_vars,
        }
        printJson(groups)
    else:
        parser.print_usage(sys.stderr)
        print('Use either --host or --list', file=sys.stderr)
        sys.exit(1)</code></pre>
</div> </div> <div class="paragraph"> <p>Much more customization is possible with dynamic inventories. Another example: in my company, we use FreeBSD servers with our software installed and managed in jails. For developer testing, we have an Ansible setup to roughly resemble the production configuration. Unfortunately, at the time of writing, Ansible does not directly support configuration of jails or a concept of "child hosts". Therefore, we simply created an SSH connection plugin to connect to jails. Each jail looks like a regular host to Ansible, with the special naming pattern <code>jailname@servername</code>. 
Our dynamic inventory allows us to easily configure the hierarchy of groups &gt; servers &gt; jails and all their variables.</p> </div> <div class="paragraph"> <p>For personal and simple setups, in which only a few servers are involved, you might as well just use the INI-style inventory file format that Ansible uses by default. For the above example inventory, that would mean to split into two files <code>test.ini</code> and <code>live.ini</code> and managing them separately.</p> </div> <div class="paragraph"> <p>Dynamic inventories have one major downside compared to INI files: they don&#8217;t allow text diffs. Or in other words, you see the script change when looking at your VCS history, not the inventory diff. If you want a more explicit history, you may want a different setup: auto-generate INI inventory files with some script or template, then commit the INI files whenever you change something. Of course you will have to make sure to actually re-generate the files (potential for human mistakes!). I will leave this as exercise to you to decide.</p> </div> </div> </div> <div class="sect1"> <h2 id="_modern_ansible_features">Modern Ansible features</h2> <div class="sectionbody"> <div class="paragraph"> <p>While you may have introduced Ansible years back when it was still in v1.x or earlier stages, the framework is in very active development both by Red Hat and the community. <a href="">Ansible 2.0</a> introduced many powerful features and preparations for future improvements:</p> </div> <div class="ulist"> <ul> <li> <p><a href="">Task blocks (try-except-finally)</a>: useful to perform cleanups if a block of tasks should be applied "either all or none of the tasks". Also can reduce repeated code because you can apply <code>when</code>, <code>become</code> and other flags to a block.</p> </li> <li> <p><a href="">Dynamic includes</a>: you can now use variables in includes, e.g. 
<code>- include: 'server-setup-{{ environment_name }}.yml'</code></p> </li> <li> <p><a href="">Conditional roles</a> are nothing new. I had some trouble with related bugs in 1.8.x, but those are obviously resolved and <code>role: [&#8230;&#8203;] when: somecondition</code> can help in some use cases to make code cleaner (similar to task blocks).</p> </li> <li> <p>Plugins were refactored to cater for clean, more maintainable APIs, and more changes will come in 2.x updates (like the persistent connections framework). Migrating your own library to 2.x should be simple in most cases.</p> </li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="storing-sensitive-files">Off-topic: storing sensitive files</h2> <div class="sectionbody"> <div class="paragraph"> <p>For this special use case, I don&#8217;t have a recommendation since I never compared different approaches.</p> </div> <div class="paragraph"> <p><a href="">Vault support</a> seems to be a good start but seems to only support protection by a single password&#8201;&#8212;&#8201;a password which you then have to share among the team.</p> </div> <div class="paragraph"> <p>Several <a href="">built-in lookups</a> exist for password retrieval and storage, such as "password" (only supports plaintext) and Ansible 2.3&#8217;s "passwordstore".</p> </div> <div class="paragraph"> <p>In my company, we store somewhat sensitive files (such as passwords for financial test systems) in our developers' Ansible test environment repository, but in GPG-encrypted form. A script contains a list of files and people and encrypts the files. The encrypted .gpg files are committed, while original files should be in <code>.gitignore</code>. Within playbooks, we use a lookup plugin to decrypt the respective files. That way, access can be limited to a "need to know" group of people. 
While this is not tested for production use, it may be an idea to try and incorporate this extra level of security if you are dealing with sensitive information.</p> </div> </div> </div> <div class="sect1"> <h2 id="_conclusion">Conclusion</h2> <div class="sectionbody"> <div class="paragraph"> <p>Ansible can be complex and overwhelming after developing playbooks in a wrong way for a long time. Just like for source code, readability, simplicity and common practices do not come naturally and yet are important to keep your Ansible code base lean and understandable. I&#8217;ve shown basic and advanced principles and some examples to structure your setup. Many things are left out of this general article, because either I have no experience with it yet (like Ansible Galaxy) or it would just be too much for an introductory article.</p> </div> <div class="paragraph"> <p>Happy automation!</p> </div> </div> </div> Sun, 21 Dec 2016 00:00:00 -0000 Today I learned — episode 4 (numbers in JavaScript considered useless) <div id="preamble"> <div class="sectionbody"> <div class="paragraph dog-blog-hidden-in-overview"> <p>This blog series is supposed to cover short topics in software development, learnings from working in software companies, tooling, etc.</p> </div> </div> </div> <div class="sect1"> <h2 id="_numbers_in_javascript_considered_useless">Numbers in JavaScript considered useless</h2> <div class="sectionbody"> <div class="paragraph"> <p>For my hobby web application project, I wanted to implement a simple use case: my music player application needs to know the playback status including some other fields, and retrieves that status using AJAX calls to the local server. While that should be pretty fast in theory, every network request will slow down your (JavaScript) application, especially if we assume that the web server might not always be on localhost. An easy way to circumvent this are bidirectional WebSocket messages (here: server pushes status). 
However, I&#8217;m playing with Rust and the <a href=""> web framework</a> so I just wanted a quick solution without having to add WebSocket support.</p> </div> <div class="paragraph"> <p>My idea was to just have the server sleep during the request until the playback status has actually changed. This way, the client makes a request which simply takes longer if the status remains unchanged, resulting in fewer connections being made. I added a GET parameter <code>previous_hash</code> to the URL so the server could check if the status had changed from what the client stored earlier. Using Rust&#8217;s <a href=""><code>Hash</code> trait</a>, it was very simple to create a <code>u64</code> hash of my struct and send the new hash back to the client.</p> </div> <div class="paragraph dog-blog-breakpoint"> <p>In Rust pseudo-code:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-rust" data-lang="rust">router.get("/app/status", {
    middleware! { |request, mut response|
        let previous_hash : Option&lt;u64&gt; = request.query().get("previous_hash")
            .map(|s| s.parse::&lt;u64&gt;().expect("previous_hash not an integer"));

        response.set(MediaType::Json);

        // Delay response for some time if nothing changes, to help client make fewer calls
        let mut ret = None;
        for _ in 0..100 {
            let status_response = get_status_response(); // unnecessary detail
            if previous_hash != Some(status_response.status_hash) {
                ret = Some(json::encode(&amp;status_response).unwrap());
                break;
            }

            sleep_ms(50);
        }

        // If nothing changed while we slept ~100*50 milliseconds, just send latest status
        if ret.is_none() {
            let status_response = get_status_response();
            ret = Some(json::encode(&amp;status_response).unwrap())
        }

        ret.unwrap()
    }
});</code></pre>
</div> </div> <div class="paragraph"> <p>A change so simple should have just worked, but even though the playback status of my music player remained the same, my requests kept taking 1 millisecond without any sleep calls. 
The web developer tools in Firefox quickly showed me the potential problem:</p> </div> <div class="imageblock"> <div class="content"> <img src="/blog/2016-12-21-today-i-learned-4-numbers-in-javascript-considered-useless/json-response-view.png" alt="JSON response view"> </div> </div> <div class="imageblock"> <div class="content"> <img src="/blog/2016-12-21-today-i-learned-4-numbers-in-javascript-considered-useless/raw-response-view.png" alt="Raw response view"> </div> </div> <div class="paragraph"> <p>The JSON response view and the raw response from the server showed different values. OMFG this must be a browser bug showing big numbers&#8201;&#8212;&#8201;let&#8217;s file a bug on Firefox! Just joking, this was not the real problem, but my first suspect was Firefox simply because I&#8217;m using the nightly version.</p> </div> <div class="paragraph"> <p>Long story short: I wasted some nerves and time just to stumble over the same old JavaScript problem again. <strong>Numbers in JS are all IEEE 754 floating point</strong>. Firefox was showing me the correct thing. My Rust-based web server could easily output the exact <code>u64</code> integer value while JavaScript converts to floating point, losing precision and making comparisons and any other use of (big) numbers for my hashing use case totally useless. That means I have to switch to using a string representation of the number instead.</p> </div> <div class="paragraph"> <p>While this is just another <a href="">WAT</a> moment, I am hoping that WebAssembly (supposed to include 64-bit types at some point) and languages that compile to that target can alleviate such problems for the sake of a better future of web development.</p> </div> </div> </div> Sun, 17 Dec 2016 00:00:00 -0000 Giving technical talks — tips to make your listeners happy <div id="preamble"> <div class="sectionbody"> <div class="paragraph"> <p>I&#8217;m not a speaker. 
Since finishing my Master studies, I never held a technical presentation in front of many people, except for doing lots of company-internal presentations related to tooling, security training and induction. In the last years, I&#8217;ve visited conferences, meetups and smaller presentations and am seeing the same mistakes over and over again. You might ask&#8201;&#8212;&#8201;who am I to give you advice? Obviously I&#8217;m not a well-known speaker, so what <em>do I know?</em> Well, the important point is that I am a <em>good listener</em>, and the quality of a talk is only defined by the perception/reception of its listeners&#8201;&#8212;&#8201;you can believe you&#8217;re the best speaker in the world, but if people don&#8217;t like it, they will 1) typically not give you helpful feedback and thereby not allow you to improve and 2) not come back to your next year&#8217;s talk (or even vote it out of the program). I observed many speakers to learn how to present own topics at a future conference or local meetup, and would like to share my experiences with you.</p> </div> <div class="paragraph"> <p>Here&#8217;s a list of the most common observations of what is going wrong, how to improve, and other helpful tips to just be a better presenter and get a better conversion and perception from your audience.</p> </div> </div> </div> <div class="sect1 dog-blog-breakpoint"> <h2 id="_common_problems_and_hints_in_one_list">Common problems and hints in one list</h2> <div class="sectionbody"> <div class="sect2"> <h3 id="_readability">Readability</h3> <div class="paragraph"> <p>The rhetorical question "Can you all read this, yes?" is almost always answered with a silent mumbling of the audience, which actually expresses "Oh not another dude who cannot create readable slides". 
And even if we were taught since university, and some even since school (PowerPoint started slowly being allowed in my school era, while it is already "the thing" nowadays), that you should not put too many bullets on a slide and keep the text large and readable, speakers still fail to see their presentation from the eyes of people watching.</p> </div> <div class="ulist"> <ul> <li> <p><strong>Font size and amount of content</strong>: This is the most crucial setting for your slides. It doesn&#8217;t depend as much on the room size as you think, as larger venues are often equipped with large canvases or even mirrored ones for the people in the back. Large font sizes are equally important for <em>any</em> room and <em>any</em> audience. If you don&#8217;t set a reasonable size when starting to work on your slides, you will 1) later have to reorganize your slides on font size increase because the content will not fit anymore, or 2) get the resentment of the audience when having to change it during the talk. The latter case is much worse, and I have seen many speakers use <a href="">reveal.js</a> and other web-based presentation frameworks without understanding how to use them. I even saw a presenter who understood at second zero of his talk that font size was way too small, asked the rhetorical question, tried to use the browser zoom feature, but failed at the attempt because the framework generated HTML that only zoomed controls, not font size. In such a stressful situation, you probably wouldn&#8217;t think of hacking it using Web Inspector to enforce the size change.<br> I can understand that PowerPoint and foes are not very helpful when it comes to syntax highlighting or embedding source code from a file, but yet you have to know what you use and come prepared. 
For static content, <a href="">LaTeX presentations</a> are a good starting point.<br> In summary: <strong>know your room and presentation target</strong> as you would know your deployment target when developing software. Just because it looks good on your screen does not mean people can read it with a projector (which by the way are usually <strong>4:3</strong> or seldom 16:9). Think of <strong>font size, foreground and background colors, contrast, limit font family and size variations and keep examples and text readable</strong>. That applies to slides, examples and also applications you switch to (terminal, IDE &#8594; zoom feature).</p> </li> <li> <p><strong>Colors</strong>: mind the color blind. I admit to have little knowledge around this, but if you tend to distinguish meaning by color, consider using something else instead (bold/italic/underlined text, side-by-side comparison table, multiple slides&hellip; depends heavily on content).</p> </li> </ul> </div> </div> <div class="sect2"> <h3 id="_content">Content</h3> <div class="paragraph"> <p>Quoting Hadi Hariri&#8217;s great talk <a href="">The Silver Bullet Syndrome</a>: a talk should be <em>informative</em>, <em>thought-provoking</em>, <em>entertaining</em> and <em>inspirational</em>. Please have a look (at least) at the first few minutes of the video to understand the terms. I want to give some related advice with real examples, again in no particular order of importance (you decide!):</p> </div> <div class="ulist"> <ul> <li> <p><strong>Hobby projects</strong>: At developer conferences, I noticed that speakers are often a mixture of: 1) experienced speakers who prepare well, probably even held their talk before in a smaller group and chose their topic based on <em>either</em> strong interest for a programming language, technology or standardization, <em>or</em> out of a real (business) use case/issue they encountered. 
2) People whose name you didn&#8217;t hear before&#8201;&#8212;&#8201;often those basing their topic and slides around a personal problem statement or hobby project.<br> While a personal topic can be very interesting (I&#8217;m a big fan of lightning talks which have a lot of such topics), some topics are also very boring or useless for an audience that paid to learn about new standards, technologies and practices instead of a hobby project with questionable future, public interest (e.g. GitHub stars) or substantiated problem statement. Before even starting to work on slides or publishing something, check if it may be interesting for others. <strong>Key deliverables</strong> in that kind of presentation could be: real use case, description of other public projects which face the same issue, (your) library/framework to solve the problem statement, proposals for improvement and&#8201;&#8212;&#8201;often forgotten&#8201;&#8212;&#8201;<em>public</em> source code.<br> For one bad example: the latest hype in C++ conferences was functional programming (immutable data structures, monad-like chaining, etc.), and I saw talks centered around the guys' home projects which were advertised to the fullest on their blog with code excerpts, but none of it was ever published. Then on the other hand, functional programming libraries like <a href="">brigand</a> became popular also because they were made public immediately with a request for trying it out, including some good examples.</p> </li> <li> <p><strong>Real examples</strong>: This directly continues on the problems of hobby projects, but applies to all presentations. Without actual examples that people can apply to their own work or personal projects, a talk may not be <em>informative</em> (depends on topic, of course). For myself, I dislike variables named <code>foo</code>/<code>a</code>/<code>b</code>/<code>whatever</code> in examples. Many speakers present a problem that came from a real work problem. 
In my previous blog posts, I used examples from the financial sector in which I work, for instance. Try to put your real problem statement into a minimal (source code) example, removing all the confidential, over-detailed and useless stuff. You will even find out that if you do so, you may be able to reuse that problem statement as a job interview question for software developers!<br> And please, for the love of all we honor as modern software developers, stop using <code>Monkey</code>, <code>Giraffe</code> and <code>Animal</code> as class names. Not even a zoo&#8217;s source code would have such a thing! The only exceptions may be study classes on object orientation and I even admit to having used those names myself on a <a href="">covariance question</a> when I was younger, but please, keep those and other nonsense examples out of technical talks. Show real use cases.</p> </li> <li> <p><strong>Number and size of slides</strong>: It feels sad that people still get this wrong even though it&#8217;s common sense and by just practicing your talk once (even mumbling it to yourself in silence), you can find out that you have too many. I never saw the case of too few slides&#8201;&#8212;&#8201;never! But I saw the opposite&#8201;&#8212;&#8201;a guy presenting way more than a hundred slides in a 60-minute slot, constantly skipping content that he said was not relevant for the audience. Remember that the <strong>slides exist to guide the audience, not you</strong>, and your <strong>voice and highlighting are there to complement and explain the slides</strong>. There&#8217;s no silver bullet for the ratio of slides per minute, but quite clearly if you find you have to present 2 slides per minute with lots of content or code examples, that simply is not comprehensible at such high speed and your listeners will hate you. In school and university, I learned to put an agenda at the beginning. 
Even if that is not always very helpful for listeners, it is one way to provide a common thread to <strong>guide through your topic in a reasonable order</strong>.<br> There are many ways to reduce complexity by removing and shrinking content, and thus improve understandability:</p> <div class="ulist"> <ul> <li> <p><strong>Remove uninteresting clutter and images</strong>: Every so often I see people trying to be entertaining with funny images and memes.<br> <span class="image"><img src="/blog/2016-12-17-giving-technical-talks-tips-to-make-listeners-happy/how-about-no.jpg" alt="How about no?" width="300px"></span><br> That&#8217;s okay if your talk is supposed to be funny as (part of) its selling argument (like <a href="">WAT</a> or <a href="">The Silver Bullet Syndrome</a>), but I would recommend not to overdo it. This applies to all kinds of images: photos of famous persons of the 1x-th century who no one can recognize and you kept unlabeled for people to guess (hint: <em>lame!</em>), complicated flow graphs like your manager&#8217;s manager would put in a PowerPoint slide (keep it simple so people can understand!), unrelated side stories if not exactly interesting or amusing (distracts from the common thread).</p> </li> <li> <p><strong>Avoid copy-pasting external resources</strong>: if you have to paste a whole StackOverflow question or answer into your slides, something is wrong with the way you are explaining the problem or solution. Most often, the title or small summary is enough.</p> </li> <li> <p><strong>Short examples</strong>: Code snippets must be to the point, i.e. concisely show the use case, problem or solution. For longer examples, shortly indicate which lines you are going to explain next, e.g. 
by selecting them or using highlighting features of your presentation software.</p> </li> <li> <p><strong>Inline explanation in your code snippets</strong>: Instead of talking several slides about "how you&#8217;re going to do it" and then show the code which does what you just explained, sometimes you can simply put the relevant explanation into the code or on the same slide. Example: algorithm that transforms matrices for which viewers can relate the single steps with the respective line of the code snippet (<a href="">great example slide of Kris Jusiak</a>, video wasn&#8217;t available yet at time of writing).</p> </li> </ul> </div> </li> <li> <p><strong>Take care of details</strong>: Similar to typos in your examples which make them not compile, other small mistakes such as text typos, half-truths and incomplete explanations may lead to annoyances among the attentive people of your audience. You should be confident that your presentation is good and exact instead of creating it in a hurry.</p> </li> </ul> </div> </div> <div class="sect2"> <h3 id="_listening_comprehension">Listening comprehension</h3> <div class="paragraph"> <p>About making the audience understand what you&#8217;re saying (literally).</p> </div> <div class="ulist"> <ul> <li> <p><strong>Accent in English speaking</strong>: German speakers are a lucky few, because even if they have the typical, horrible accent when speaking English, it&#8217;s one of those which everyone can still understand. However the Germans also have something called "Denglisch", a bad mixture of German and English words, which can lead to misunderstanding. For instance, the German word "Handy" means "mobile phone", and if you mix that into an English sentence, people will be justifiably confused since in English, "handy" means "practical" or "useful". You should <strong>be aware of your own accent and such language traps</strong> and avoid them. 
Clearly, speaking perfect English is harder for people of certain cultures, but everyone should try their best when talking to an international audience.<br> The typical stereotypes and prejudices about how certain peoples speak English very often hold true, especially if speakers are unaware of how they speak English, so you may be able to just find out something about your culture/language/people by researching (insulting) comments about it. Seriously! To give you a personal example: I googled "german speakers accent horrible" and found e.g. <a href="">Why are Germans among the worst speakers of English?</a>, which felt somehow insulting, but since as an adult I don&#8217;t care much, I read on and found quite good reasoning by the author and others.<br> The difference of languages and cultures is a big source of trouble in listening comprehension and a topic of its own. I might give more hints on that in a future article, but here it is too much.<br> Summary: be <em>aware</em> of how you speak! If you know of your accent, and hear that you cannot avoid it completely, at least try to speak slowly.</p> </li> <li> <p><strong>Do not turn around to your slides on the wall</strong>: you should have a mirrored display (or presentation mode) on your laptop or second screen. You are turning around because you are feeling unsafe&#8201;&#8212;&#8201;similar to putting a hand in your pocket. Knowing your content or at least the order of chapters helps not needing to look up the slide content all the time. <strong>Presentation or mirror mode</strong> can show you the current and/or next slide if you need to see it.</p> </li> <li> <p><strong>Practice</strong>: Many people don&#8217;t like rehearsing their presentation in front of a mirror, with family, colleagues or friends or even by themselves. That&#8217;s natural and often even unnecessary. Regarding stress level: you are not in a job interview (if you are by any chance, ignore this hint), no need to haste or feel unsafe. 
Think less and keep the presentation style simple and your <strong>voice calm but controlled</strong>. The slower and calmer you are, the lower the risk of increased stress levels. Watch a few minutes of <a href="">Louis Dionne&#8217;s Meeting C++ 2016 keynote</a> to see the meaning of that hint. He is the best example&#8201;&#8212;&#8201;exaggeratedly calm in some people&#8217;s opinion, yet giving a perfect presentation with close to zero glitches or mistakes 👍.<br> To prepare properly for a bigger event, you can <strong>try a local <a href="">meetup group</a>, present at your company or in another small setting</strong>. You could extract a small part of your slides into a lightning talk to check if people like the topic at all. <strong>Request and collect feedback</strong> from the rehearsal/practice session audience.</p> </li> <li> <p><strong>Speak loud and clear</strong>: Some rooms simply have bad acoustics and you can make up for it with your voice. It also makes listeners think that you are feeling confident about your topic. But do not speak too fast&#8201;&#8212;&#8201;as mentioned above, try to keep your pace and style calm.</p> </li> <li> <p><strong>Know your own quirks</strong>: You should be aware of your own behavior and speaking. Practice and feedback can help find out what you didn&#8217;t know yourself yet or didn&#8217;t want to realize. Example: many people say the same words or phrases all the time. It&#8217;s a pity that humanity is so fearful that we do not tell each other, but there is no way to change that at large. I know there are people who have that as <a href="">a disability or incurable illness</a>, but for the ones who can control themselves&#8201;&#8212;&#8201;just listen to yourself to find such tics. It even happens to keynote speakers (<a href="">"you know"</a>). 
Other popular words to accidentally repeat are "I", "so", "amazing" (or similar), "like" and of course "ummmm".</p> </li> <li> <p><strong>Do not read full slides aloud, separate "inputs" for listeners</strong>: Your audience can mostly only concentrate on one input at a time&#8201;&#8212;&#8201;voice, code, slides, seatmates whispering, other distractions. This is the very reason why your <strong>slides' content should be complementary to your voice</strong>, and why you should avoid too much text on a slide. Summarize in very short sentences if the slide has bullet points, or use emphasized text. If it&#8217;s about code, complement the code on the slide with your spoken explanations. From an "input" point of view, listeners should switch between 1) having the time to look at and understand your code and 2) you explaining it. Obviously not all brains work at the same effective speed and not all listeners are experts in your topic, so there must be appropriate "thinking breaks" in between (remember to talk calmly!). <strong>Provide good, slow example descriptions, and mind beginners and people who are not deep inside the topic.</strong><br> On the conclusion slide, it&#8217;s fine to briefly list things that were already mentioned, to repeat them verbally and/or in writing.</p> </li> </ul> </div> </div> <div class="sect2"> <h3 id="_miscellaneous_technical_hints">Miscellaneous technical hints</h3> <div class="ulist"> <ul> <li> <p><strong>Use latest tools</strong>: In one particular presentation at Meeting C++ 2016, I saw a C++11 implementation of what was already available in C++17 as a built-in feature. 
While all other speakers were already showing off trunk compiler features which would soon be shipped as implementation of the new 2017 standard, I had to read slides which proved not only that it can already be done now, but also that I have to use ridiculous templating tricks and go through 50 copy-pasted lines of code on one slide just to get a point across that did not help the actual example, but which would also become obsolete in only a few months.<br> If you don&#8217;t want to make the effort of building/installing the latest compiler, just use online tools like <a href="">compiler explorer</a> (but mind that you may be offline during the presentation).</p> </li> <li> <p><strong>Consistent examples</strong>: Try to keep your examples centered around one topic. Be it one use case, one other programming language to compare to, and so on. Try to keep the variation low to guide watchers, not confuse them. One real-life example I experienced was a talk loosely related to functional programming which pointed out some wildly mixed examples in both Python and Haskell, a combination which was obviously quite unfamiliar to most of the (C++) audience, especially given Haskell&#8217;s syntax which is not immediately comprehensible.</p> </li> <li> <p><strong>Working examples</strong>: Make sure your examples actually compile. A great way to do so is to add a <code>Makefile</code> to your slides repo, and embed the source files into your slides, instead of copy-pasting examples into slides directly. Provide a precompiled header to remove all the clutter (e.g. for C++: includes, <code>using namespace</code> statements, repeated example types), so that only the relevant part is imported from the source file into the slide. 
Having your examples compile allows you to demo and adapt your use cases if someone has a question.</p> </li> <li> <p><strong>Display your keystrokes (topic-dependent)</strong>: If you are showing an IDE or anything else where shortcuts, pressed keys and clicks help the understanding, use a tool to display what you are typing.</p> </li> <li> <p><strong>Do not disturb</strong>: Nothing is more annoying than notifications and applications popping up while you present. You have to say "sorry" and probably it is even something embarrassing like a chat or e-mail notification including content. Take measures to avoid such a situation:</p> <div class="ulist"> <ul> <li> <p>Close or mute browser tabs which could show notifications (such as WhatsApp Web or Facebook), or temporarily disable them (Firefox: open <code>about:config</code> and search for <code>dom.webnotifications.enabled</code>).</p> </li> <li> <p>Enable "do not disturb" mode in your presentation tool or operating system. For macOS, open the notification center (right-most menu button), scroll up and enable "Do not disturb". You can even auto-enable it by opening "System Preferences &gt; Notifications &gt; Turn on Do Not Disturb &gt; When mirroring to TVs and projectors". Windows 10 users have a setting "System &gt; Notifications &amp; actions &gt; Hide notifications while presenting" (which the software must support). And so on.</p> </li> <li> <p>Close unnecessary applications, especially the ones which show custom style notifications (and thus don&#8217;t react to no-distraction settings as mentioned above) or change the screen color. Popular examples are <a href="">f.lux</a> (adapts color temperature based on daytime) or <a href="">Time Out</a> (forces you to take a break).</p> </li> <li> <p>Do the same on your phone, or set it to silent or vibration mode. 
Tell your partner not to call you while presenting.</p> </li> </ul> </div> </li> </ul> </div> </div> <div class="sect2"> <h3 id="_hardware">Hardware</h3> <div class="ulist"> <ul> <li> <p><strong>Internet access</strong>: Hopefully, you&#8217;re fully prepared without requiring any online resources. Wi-Fi access is often a pain or sometimes even unavailable, so try to open websites beforehand or store resources on disk. The same applies to software and packages that you need to have installed. If Wi-Fi is available, connect early enough and consider a temporary hotspot using your phone&#8217;s mobile connection as backup.</p> </li> <li> <p><strong>Display/projector problems</strong>: Conference organizers can bring tons of cables and adapters, but still we can see problems in 2 out of 10 presentations. Before one particular talk, I observed 6 software engineers and 2 venue assistants trying to get one of two laptops to work with the projector&#8201;&#8212;&#8201;for over 10 minutes. Organizers should offer speakers the chance to try the setup beforehand, and if they do, take it.</p> <div class="ulist"> <ul> <li> <p><strong>Bring a USB backup of your slides</strong> so in case of unsolvable problems (such as output with wrong colors, green stripes, graphics chip or projector related), the organizers can give you an alternative laptop to present with. This means you should be prepared to hold the presentation on another machine, if possible. If you want to run code examples, for instance, try to have a portable environment ready, such as prebuilt binaries or a pre-installed Python virtualenv. <strong>Use common formats</strong> like PDF or PowerPoint because those are most likely to be openable on spare laptops if yours is not working.</p> </li> <li> <p><strong>Bring your own adapter</strong>: Especially for Apple machines, often a funny mixture of adapters is necessary. Some lonely projectors out there are still VGA-only. 
Chances are that young people will soon not even know what VGA or analog display means&hellip;</p> </li> </ul> </div> </li> <li> <p><strong>Try a presenter mouse</strong> if you are the type of person to walk around, or the room setup forces you to stand far from your laptop ("Can I have the next slide, please?"). The model should be easy to handle with a fixed grip and have all the features you need, e.g. next/previous slide, start presentation, laser pointer and probably extended features like media controls, right click, touchpad for cursor moves and whatnot. A regular mouse can serve as a simple backup if things do not work on the spot. Try out the presenter mouse before the presentation to get to know how to use it (and potentially install required drivers while you are still online). I&#8217;m using an older version of <a href=";qid=1481920224&amp;sr=8-1&amp;keywords=logitech+presenter">this Logitech model</a> which serves me well with the most basic features.</p> </li> <li> <p><strong>Laser pointers may not show up</strong> on mirrored canvases (for instance, talks with large audiences are sometimes held in a venue that has one main projector on stage and more projectors for people in the back). The small dot could also be invisible in the video recording. <strong>Alternatively, select text with your cursor</strong>, which is easy for PDF and HTML-based slides. Other tools like PowerPoint include markup tools to highlight text. <strong>Or use the old-school "one bullet point at a time" feature</strong> to reveal only the passage or code snippet you are currently talking about, to allow people to focus on the right spot of your slide.</p> </li> </ul> </div> </div> </div> </div> <div class="sect1"> <h2 id="_conclusion">Conclusion</h2> <div class="sectionbody"> <div class="paragraph"> <p>Naturally, the above list cannot be comprehensive. The items are the ones I found important for me as the listener of a talk. 
I know many of them might sound very meta or hard to achieve (like changing your voice, speed or stress level), but only by knowing about what could go wrong, you get a chance of improving (as in: "if nobody tells you about your bad breath, you can never change it"). Please comment if you have more important things to add or want to give feedback, no matter if from the perspective of speaker or listener!</p> </div> </div> </div> Sun, 14 Nov 2016 00:00:00 -0000 Names are important – improving use of terms in software engineering <div id="preamble"> <div class="sectionbody"> <div class="paragraph"> <p>In our field, few things are more important than reading code, which&#8201;&#8212;&#8201;except for one-man army companies&#8201;&#8212;&#8201;involves numerous developers reading and trying to understand the same code. One code base is read way more often than written or refactored (if not, you&#8217;re doing it wrong), hence the importance of a common understanding of the <em>terminology</em>. Herein I want to present challenges with examples, the different types of scope applying to a term and tips to improve on your use of terminology to foster better communication within companies and elsewhere.</p> </div> </div> </div> <div class="sect1 dog-blog-breakpoint"> <h2 id="_examples">Examples</h2> <div class="sectionbody"> <div class="paragraph"> <p>Terms can occur at various semantic locations in code. Here are some examples (some real, some contrived) from the financial sector in which I&#8217;m working. 
The language assumed in this article is C++, but the recommendations can be applied equally to other languages.</p> </div> <div class="ulist"> <ul> <li> <p><strong>Type names</strong> (class/struct/enum/interface depending on programming language): <code>Account</code></p> </li> <li> <p><strong>Function names</strong>: <code>validateIban</code></p> </li> <li> <p><strong>Method names</strong>: <code>getBankList</code></p> </li> <li> <p><strong>Module and namespace names</strong>: <code>Billing::Aggregation</code></p> </li> </ul> </div> </div> </div> <div class="sect1"> <h2 id="_terminology_scopes">Terminology scopes</h2> <div class="sectionbody"> <div class="paragraph"> <p>Even if you&#8217;re working in the same area, you might not immediately understand each of the example terms above in the way they are used in code and conversations in my company. The reason is that different <strong>levels and types of scope</strong> apply.</p> </div> <div class="sect2"> <h3 id="_global_scope">Global scope</h3> <div class="paragraph"> <p>Globally familiar words are often clear by themselves, and typically can be looked up in an English dictionary without having more information on the context. The dictionary entry is usually unambiguous. One example is <code>Billing</code>, which expresses that the topic is centered around bills in some sense. If all code had globally intuitive words only, our understanding would be perfect! In reality, there&#8217;s almost always a context which describes (required) details.</p> </div> <div class="paragraph"> <p>Exceptions are truly global names. Think of public trademarks and product names known worldwide. These don&#8217;t need explanations anymore, but still the companies behind them have to ensure that the understanding does not get altered through the years. Some people wish for the term "Java" to only relate to an island&hellip; A good example is "Microsoft Office". 
It is familiar all around the globe even for non-tech people (keep this hint in mind).</p> </div> </div> <div class="sect2"> <h3 id="_local_scope_field_of_expertise_company_team_project_module_etc">Local scope (field of expertise, company, team, project, module, etc.)</h3> <div class="paragraph"> <p>Even if <code>Billing</code> is easily comprehensible, the example namespace <code>Billing::Aggregation</code> is probably gibberish to someone who is not from the financial sector, or even new hires who don&#8217;t know yet how the company handles bills&#8201;&#8212;&#8201;namely, in this example, by aggregating some key figures. The word may therefore as well be very specific to the company, with a different understanding in other businesses.</p> </div> <div class="paragraph"> <p>A local context also applies to the method name <code>getBankList</code>. Without extra information like a documenting comment, the signature is not enough to find out 1) what kind of list this is, 2) if "get" means downloading, retrieving via remote call or parsing from file, and so on. At best, the surrounding class/module/project is clear enough to the reader of the code to understand the concept later, or provides a clarifying unit test or sample input.</p> </div> <div class="paragraph"> <p>Terms may even be team-specific (or evolve so over time) by various reasons: access to sensitive code may be restricted, the software company grows and splits up teams, topics of teams are not interrelated and there are no intersection points for shared code/guidelines/rules, &hellip;</p> </div> <div class="paragraph"> <p>The worst case is when the scope gets so narrow that, earnestly, "it&#8217;s all in the code" (only). 
Below, I will list some recommendations for choosing names, and related tips, to not get to this point (I call it "people lock-in", as in "vendor lock-in").</p> </div> </div> <div class="sect2"> <h3 id="_ambiguous_contexts">Ambiguous contexts</h3> <div class="paragraph"> <p>The aforementioned "local scope" is one context for a term. If one word can be understood in different ways, it is contextually ambiguous. You don&#8217;t want to collect many such terms in your code base, or else switching to another project will leave you confused why <code>accountName</code> suddenly means "human description of bank account" when elsewhere it meant "username of login credentials".</p> </div> <div class="paragraph"> <p>Likewise, even if terms have a unique meaning across the code base, they may be represented as different types and therefore again cause an inconsistent understanding. For example, let&#8217;s assume that "account" is always a bank account, but in one place there&#8217;s a variable <code>AccountInfo account</code> and in another spot <code>int64_t account</code>. "The latter is quite obviously the account ID in the database!"&#8201;&#8212;&#8201;oh no sorry, Mr. Original Code Author, it&#8217;s not obvious!</p> </div> <div class="paragraph"> <p>Another popular piece of information stored in data structures is <code>string address</code> (compatible with all international addresses! 🌍) being represented differently elsewhere: <code>struct Address { string address1; string houseNo; [&#8230;&#8203;] }</code>. These cannot even be converted into one another.</p> </div> <div class="paragraph"> <p>The key to overcoming ambiguities is to stay consistent in naming. 
In the majority of cases, the main term (here: "account"), which by itself is not explanatory, can be suffixed to give a meaningful variable name: <code>int64_t accountId</code> and <code>AccountInfo accountInfo</code> (still nicely readable if type is omitted: <code>const auto accountInfo = [&#8230;&#8203;];</code>).</p> </div> </div> <div class="sect2"> <h3 id="_implementation_detail_scope">Implementation detail scope</h3> <div class="paragraph"> <p>While <code>validateIban</code> very obviously validates IBANs, knowing only the function name doesn&#8217;t say anything about how the function works. It requires at least the function signature and possibly a documentation comment to grasp the semantics. A company or development team may have their concept of how all <code>validateXYZ</code> functions should work, e.g. throw specific exception or return false on invalid value, and even if that concept is "well-known", it&#8217;s a notion that must be transferred to new hires. Such an induction to the company&#8217;s development practices is of course necessary for new developers, but too much will overload those people, resulting in small details being forgotten. Let&#8217;s say you forgot what the function <code>validateIban</code> returns for an empty input string? It&#8217;s a very important detail, and the most sane way would be to consider an empty value as invalid, because then the caller can decide whether an empty/optional value is allowed depending on use case. Yet this detail is not found in the name (granted, it&#8217;s hard in this case without getting wildly over-verbose names).</p> </div> <div class="paragraph"> <p>Here are a few alternative function signatures (C++):</p> </div> <div class="ulist"> <ul> <li> <p><code>auto validateIban(const std::string& s) -> bool;</code>&#8201;&#8212;&#8201;this suggests to the reader that the function does not throw and returns whether the input is valid or not. 
It does <em>not</em> say what happens in the case of empty string, but as stated above, this could just be left off because there&#8217;s a sane default behavior. Nevertheless, following the "verbSubject" naming principle, a better signature would be <code>[[nodiscard]] auto isValidIban(const std::string& s) -> bool</code> which makes it even clearer that the function doesn&#8217;t throw but returns a boolean. Developers don&#8217;t even have to read the full signature to use it correctly, and are warned (starting with C++17) if the return value is unused by mistake.</p> </li> <li> <p><code>auto validateIbanOrThrow(const std::string& s) -> void;</code>&#8201;&#8212;&#8201;the void result type and "OrThrow" suffix in the name makes it totally clear that the function will throw on invalid input. Whether you include the type of exception in the signature or name is a question for your style guide (e.g. <code>template&lt;typename TExc&gt; &#8230;&#8203;</code> to make it explicit). Personally, I&#8217;d just throw a standard exception here (<code>std::invalid_argument</code>), and stay consistent in similar functions.</p> </li> <li> <p>No function at all. Use <a href="/blog/2016-11-04-today-i-learned-3-strong-typing-in-cpp-vs-rust">strong typing</a> to ensure that at the relevant spots, only valid IBAN arguments can be passed in (i.e. <code>auto extractAccountNumberFromIban(const ValidIban& iban) -> std::string;</code>). Along the same line, introduce the practice to validate inputs at the input boundary (e.g. remote call), not just later where you could forget calling <code>validateIban</code> by accident. This will also improve your error handling because you will fail earlier, and can write functions that make assumptions about their inputs and thus may even become exception-free. 
As mentioned in the linked post, using strong types is probably overkill if done <em>throughout</em> your code, so this is also something for the style guide, or to decide per case.</p> </li> </ul> </div> </div> </div> </div> <div class="sect1"> <h2 id="_factors_other_than_context">Factors other than context</h2> <div class="sectionbody"> <div class="paragraph"> <p>Surely the context defines to which domain a term belongs. Nevertheless other influences can help determine whether a name or term makes sense to use.</p> </div> <div class="sect2"> <h3 id="_complexity_and_language">Complexity and language</h3> <div class="paragraph"> <p>The first influences I want to summarize here are complexity and (spoken) language. Internationally mixed development teams can be found in almost all companies. Language and cultural barriers are the most influential factors to be aware of when it comes to creating a common and mutual understanding of technical and personal themes. Hence it&#8217;s no wonder that the English language dominates software development both in coded and spoken words. To account for cultural differences, complex English vocabulary should be banned where reading or listening comprehension is important.</p> </div> <div class="paragraph"> <p>To give an example, I want to name the Unicode standard. After reading 20+ articles (including the famous one by <a href="">Joel Spolsky</a>) about Unicode in the last 15 years, and continuously learning about its updates, its terminology still is only partially burned into my brain. The sheer count of terms is high, but in my opinion not the issue, since the memory of a software developer is quite durable once a term is clear. Can you distinguish UTF-8, UCS-2, UCS-4, UTF-16{BE,LE}, UTF-32, (UTF-7), character, glyph, character set, code point, surrogate pair, BMP, BOM, U+1F4A9? 
I have no problem recalling their meaning when I see them, but what really makes my brain smoke are the non-technical things mentioned in that list: is a glyph a fully rendered code point, or a partial symbol? How did they define character again&#8201;&#8212;&#8201;was it the same as a code point? Just look for a minute at their <a href="">glossary</a> and you&#8217;re going to be overwhelmed as well. In summary, the standard, the related myriad of blog posts, true/half-assed/false answers on StackOverflow and other resources are simply an overload for the software industry and simplifying now takes a huge amount of effort. If we had used <a href="">UTF-8 everywhere</a> (great simplified glossary there!) 20 years ago already, there probably wouldn&#8217;t be crazy inventions like MySQL&#8217;s UTF-8 variants (yes, your UTF-8 enabled database probably cannot store all of Unicode!):</p> </div> <div class="quoteblock"> <blockquote> <div class="paragraph"> <p>For a supplementary character, <code>utf8</code> cannot store the character at all, whereas <code>utf8mb4</code> requires four bytes to store it.</p> </div> </blockquote> </div> <div class="paragraph"> <p>See, complexity and amount of terminology is like a growing company&#8201;&#8212;&#8201;the smart ones can handle growth easily by keeping things simple and stupid, while the typical response to growth is levels of management, performance reviews, more business, less "family" feeling, or in other words: complexity.</p> </div> </div> <div class="sect2"> <h3 id="_ambiguous_wording">Ambiguous wording</h3> <div class="paragraph"> <p>Imagine you&#8217;re in one well-defined context, have chosen simple English words that need no explanation in your opinion, developers you ask tell you they understand the meaning immediately&#8201;&#8212;&#8201;what could possibly go wrong? You&#8217;ve landed a set of terms to be carved in stone. They will call a dictionary after you! 
Well, probably not&hellip; In reality this is long before the finish line.</p> </div> <div class="paragraph"> <p>One area about which I really have a strong opinion is filesystem terms. Those have been around for ages but are still forcefully en-ambiguated (my opposite of disambiguated) or highly confused in code all the time, to the point where it&#8217;s not funny anymore. The problem is that even if the words are clear, and you were given Tanenbaum&#8217;s book on operating systems in class during your studies, the terms are still way too interchangeable. Find below some examples of ambiguous wording, including my proposals and what people also use as alternatives. I&#8217;m using lowerCamelCase examples here to also nitpick about spelling differences. This was the motivation to start writing this blog post, so sorry about the lengthy commentary! I&#8217;d like to hear comments on this admittedly very opinionated section:</p> </div> <div class="ulist"> <ul> <li> <p><strong>file, f, path, p, filepath, filePath, filename:</strong> In operating system terms, a "file" can be a regular file, symlink, hard link, socket, FIFO, other special types or a directory. Often it is perfectly fine to use the terms "file" and "directory" to tell (regular) files apart from directories. Just think of the famous error message "No such file or directory". Usage depends a little on the use case, but mostly readers of code will simply understand because it is clear that a file with content is being read, or a directory is listed, for instance.<br> <em>But:</em> <code>file != path != filePath != filename</code> ☝️. First of all, "filepath" is a spelling that nobody uses, so you also shouldn&#8217;t, while "filename" is funnily the typical spelling (not "fileName"), just like "filesystem" exists in some dictionaries alongside "file system" (I don&#8217;t have a preference there). 
A "file path" is a path that points to an (optionally existing) file, and is mostly used in code to mean a regular file (or transparently a regular file behind a symlink). The difference from a "path" is that the latter can point to any file type on the system, including a directory. Using the variable name <code>path</code> therefore is probably underspecified and not a good idea if the intention is specific. Using <code>p</code> alone as variable name is much worse than the familiar abbreviations <code>f</code> (to represent a file handle) or <code>i</code> (for loop indices).<br> Moreover, people don&#8217;t seem to get the difference between filename and file path. A "file name" is the name of a file entry (mostly within a directory, but without exposing that context), e.g. "Hello.cpp", while its path may be any path pointing to that file, e.g. "/tmp/Hello.cpp" or "C:\SuperSource\Hello.cpp" (absolute paths), or "../../private/tmp/Hello.cpp" or&#8201;&#8212;&#8201;equaling the filename&#8201;&#8212;&#8201;"Hello.cpp" (relative paths).<br> Last, if I were to see a variable called <code>file</code>, in C++ I&#8217;m most likely to guess that it&#8217;s a file input stream, while many people use that name in place of a file name or path, which is greatly misleading and semantically wrong. This is a case for a naming guideline, since different opinions exist, and it&#8217;s also slightly dependent on the programming language&#8201;&#8212;&#8201;in Python, I would use <code>f</code> for an input file stream and <code>out</code> or <code>out_file</code> for a writing stream, while in other languages such short variable names are unusual.</p> </li> <li> <p><strong>directory, dir, dirPath, folder:</strong> In my memory, it was mostly Microsoft coining the term "folder". Wikipedia <a href=";oldid=745531912#Folder_metaphor">explains</a> that a "folder" is just the graphical metaphor that represents a directory on the filesystem, and that e.g. 
Windows has special folders (like "Photo library") that don&#8217;t map directly to a directory on disk. Therefore in code, the correct term is almost always "directory" or an abbreviation (<code>dir</code>). Unlike <code>file</code>, the variable name <code>dir</code> by itself says even less about its meaning: unless you&#8217;re working with directory handles, you couldn&#8217;t infer what <code>dir</code> should stand for, and if it might represent an absolute directory path, or something else. So oftentimes, this had better be <code>dirPath</code>, or if the variable name includes the meaning (it should!), I&#8217;m tempted to omit the <code>*Path</code> suffix: <code>bankStatementsDownloadDir</code>.</p> </li> </ul> </div> </div> </div> </div> <div class="sect1"> <h2 id="_recommendations">Recommendations</h2> <div class="sectionbody"> <div class="paragraph"> <p>In no particular order:</p> </div> <div class="ulist"> <ul> <li> <p><strong>Simple English:</strong> Use vocabulary that is taught internationally and <strong>resolves to one clear meaning when looked up in a dictionary</strong>. You should not even have to look it up. It starts at easy terms like "replace" instead of "substitute", and continues to native level complexity (missing reasonable bad examples here, sorry), or even to words that are only understood in certain English-speaking countries.<br> Code that reads like English sentences is often the best choice for later comprehension.</p> </li> <li> <p><strong>No code names</strong>. Made up words and names, or acronyms, can be a nice memory or story behind a project, but should not leak into the writing of its source code. Stay with clear English wording that other people can grasp.<br> Also: prefer short names over abbreviations&#8201;&#8212;&#8201;please stay away from stupid acronyms and be smarter than governments, research institutions and armies who use letter abbreviations everywhere. 
Example: <a href="">STYLE</a> = "Strategic Transitions For Youth Labour in Europe"&#8201;&#8212;&#8201;you gotta be kidding me!</p> </li> <li> <p><strong>Comprehensible by non-techies:</strong> If terms are important and publicly visible for other departments or consumers, name them accordingly. "Billing aggregated information per merchant" is much better than "Merchant tx sums" (totally contrived 😉). "Microsoft Office" is much better than "Humble Write Bundle".<br> I could write a whole book about this item done wrong in public-facing user interfaces and applications. Assume Google sent you an e-mail "Login from unknown IP abcd:beef:1234:::1 with device supermario". Now <strong>estimate how many of the people in your neighborhood would react</strong> to such a mail, or even know what an "IP" is (or IPv6)? In reality, Google is much smarter, and the title for an unknown login alert currently reads "Someone has your password". While this could also be a spam subject, the average tech user is much more likely to react to clickbait titles warning about a virus or stolen password than to titles they don&#8217;t understand. <strong>No technical details</strong> like IP, device name or location are shared by Google&#8217;s alert (only after the click), but instead there&#8217;s a <strong>single, fat button</strong> "REVIEW YOUR DEVICES NOW". So far it&#8217;s wonderful naming and perfectly smart design to attract people to security measures&#8201;&#8212;&#8201;an outstanding example.</p> </li> <li> <p><strong>Maintain a technical glossary page or a good practices project</strong>: Create a Wiki or intranet page for developers to look up commonly used terms. You could even add the recommended variable name(s) in there for important concepts. 
Don&#8217;t pack too many words in there and <strong>don&#8217;t grant other (non-technical) departments write access</strong> because otherwise they might quickly pile up half-true or unrelated descriptions of things that developers don&#8217;t even need to know, or must have a much deeper technical understanding of. If you&#8217;re one of those "our Wiki is always outdated" or "our Wiki is write-only" companies, you could instead "appoint" a best practices code project, so to speak a <strong>flagship project that does most things (including naming) right and consistently</strong>. Newcomers should learn good practices from that project. In my team at work, for example, we develop implementations for many payment methods (e.g. Credit Card or PayPal are payment methods) based on the same module interface, so implementations only (need to) vary slightly in their overall logic and naming concepts. We implicitly know which projects are the ones we wrote this year, and as such are the ones where we applied the most modern practices and conventions to stay consistent or introduce better terminology. These latest projects can be seen as starting point for any new project. In addition, we have a Wiki page outlining important points to consider for these similar implementations&#8201;&#8212;&#8201;much like a checklist (not related to terminology per se, just as general hint).</p> </li> <li> <p><strong>Provide examples:</strong> If there is a core spot where a term stems from or which defines the main usage, for instance a module that parses the important company report called "monthly aggregated Blobby Volley results and player of the month", that code repository probably should contain a <strong>relevant unit test and sample file/input</strong> where reviewers can later look up what makes up this report (can be anonymized data), what its output would look like, and probably a short explanation of its meaning for the company. 
Alternatively, I imagine <strong>explanatory articles on the company Wiki</strong>, <strong>structured</strong> in reasonable order/hierarchy of topics, and <strong>linked in the glossary</strong>.</p> </li> <li> <p><strong>Ambiguous meanings:</strong> In many cases, code and terminology grew historically and you can&#8217;t easily change names anymore&#8201;&#8212;&#8201;accept the fact and try to disambiguate as far as possible. If a term "account" is ambiguous between two projects, let&#8217;s say project A ("LoginService", where it stands for login credentials) and B ("BankAccountService", here it represents bank account information), then <strong>ensure the ambiguous term doesn&#8217;t slip from project B into A</strong>, and vice versa.<br> <strong>If both meanings really need to be mixed</strong> within one code repository, use namespaces, type and variable name prefixes or suffixes to overcome the ambiguity: <code>loginAccountInfo</code> and <code>bankAccountInfo</code>. Before introducing terms, check which ones already exist, or else you won&#8217;t be able to disambiguate easily&#8201;&#8212;&#8201;for instance, Rust&#8217;s package manager cargo uses the word "target" publicly for both "build target" (as in <code>make &lt;targetname&gt;</code>) and for "target architecture" (alias platform triple, e.g. "x86_64-unknown-linux-gnu"), which is mostly clear in the code because internally it&#8217;s most often called "platform", but the slight annoyance remains because the public configuration key <a href="">is still called <code>target</code></a> and will remain so for a long time to stay backward-compatible.</p> </li> <li> <p><strong>Use one consistent name and stop the typos</strong> already to make code grep-able. This allows you to search through the whole code base and see where a term or type/variable name is actually in use. 
If you consistently used <code>accountInfo</code> for all places where you store a local variable about bank account information, you can more easily rename all places to the new desired name <code>bankAccountInfo</code>. Side note: in reality, renames tend to be a bit less trivial, though. The same applies to sentences such as public error messages: if they are all identical, or even in one shared linked library, it&#8217;s easy to fix/amend/replace/gettext-translate them.</p> </li> <li> <p><strong>Ensure a given name is clear within the desired scope:</strong> If you have a method <code>getBankList</code>, you should make sure that the parent class describes what it is about&#8201;&#8212;&#8201;e.g. <code>DeutscheBundesbankXmlBankListParser</code> is a little exaggerated but clearly says it parses the XML bank list of the German federal bank. The bigger the scope is, the more important good naming is for types and items that lie within. Imagine this class was part of a shared library that you&#8217;re selling to customers!</p> </li> <li> <p><strong>Function names should be verb-followed-by-subject</strong> where a reader should be able to infer the output from the verb ("validate" in our example was not helpful).</p> </li> </ul> </div> <div class="paragraph"> <p>I hope this list proves helpful to see terminology from a different perspective and allows you to take action enhancing your practices and sweeping out old, nonsense names from your code.</p> </div> </div> </div> Sun, 04 Nov 2016 00:00:00 -0000 Today I learned — episode 3 (strong typing in C++ vs. 
Rust) <div id="preamble"> <div class="sectionbody"> <div class="paragraph dog-blog-hidden-in-overview"> <p>This blog series is supposed to cover topics in software development, learnings from working in software companies, tooling, but also private matters (family, baby, hobbies).</p> </div> </div> </div> <div class="sect1"> <h2 id="_strong_typing_in_rust_and_comparison_to_c">Strong typing in Rust and comparison to C++</h2> <div class="sectionbody"> <div class="paragraph"> <p>C++ enthusiast Arne Mertz recently wrote a post <a href="">Use Stronger Types!</a>, a title which immediately sounded like an appealing idea to me. Take a look at his article, or for a tl;dr, I recommend looking at the <a href="">suggestion of strong typedefs</a> and links to libraries implementing such constructs/macros.</p> </div> <div class="paragraph"> <p>My inclination towards the Rust programming language and own expertise in related C++ constructs (and attempts to use stronger typing in work projects) commanded me to research how the languages compare and what other simple options exist. Matter of fact, I&#8217;m going to present below some findings that I already had prepared for a draft presentation which compares C++ with Rust (with the goal of finding out where C++ could improve). This article explains possible alternatives in C++, a suggested solution that is very explicit, and how one can achieve something similar in Rust.</p> </div> <div class="sect2 dog-blog-breakpoint"> <h3 id="_terminology_for_code_samples">Terminology for code samples</h3> <div class="paragraph"> <p>As I&#8217;m working for payment service provider <a href="">PPRO</a>, my examples come from the financial sector. 
Let me quickly introduce a few relevant terms.</p> </div> <div class="paragraph"> <p>The term <strong>PAN</strong> essentially means a credit card number, where the <strong>full PAN</strong> may never be stored on disk ("at rest") without encryption (such as in a log file) or leave the protected environment, and has many more <a href="">security restrictions</a> demanded by the <a href="">PCI-DSS data-security standard</a> (PCI = Payment Card Industry). <strong>Masked PANs</strong> are the ones that can be displayed outside of a PCI environment. For example, if you purchase a product with your credit card (number 1234569988771234), that number may be stored in an encrypted form within a PCI-compliant environment. However it may only leave that environment in <strong>masked PAN</strong> form, that is, at most the first six digits (BIN = bank identification number) and the last four digits (non-identifying fraction of the customer&#8217;s card number). At your next purchase at the merchant, they can offer you to pay with the same card again (displaying the masked number 123456XXXX1234).</p> </div> </div> <div class="sect2"> <h3 id="_c_code_typedef_code_is_em_not_em_strong_typing">C++ <code>typedef</code> is <em>not</em> strong typing</h3> <div class="paragraph"> <p>We want to ensure that full PANs cannot be converted directly to masked PANs (among other reasonable restrictions). 
Look at a beginner attempt to define separate types:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;iostream&gt;
#include &lt;string&gt;

typedef std::string MaskedPan;
typedef std::string FullPan;

// Assuming PANs never start with a zero, we can stuff them into an integer type
typedef uint64_t MaskedPanU;
typedef uint64_t FullPanU;

int main() {
    {
        FullPan full = "1234569988771234";
        MaskedPan masked = full;
        std::cout &lt;&lt; "Masked (string): " &lt;&lt; masked &lt;&lt; std::endl;
    }

    {
        FullPanU full = 1234569988771234;
        MaskedPanU masked = full;
        masked += full; // even this works
        std::cout &lt;&lt; "Masked (integer): " &lt;&lt; masked &lt;&lt; std::endl;
    }
}</code></pre> </div> </div> <div class="paragraph"> <p>Well, that compiled just fine and led to a fatal bug&#8201;&#8212;&#8201;we just logged a full PAN to stdout. That action shouldn&#8217;t have been possible. You don&#8217;t want to sit through and pay for those two extra weeks in the next credit card audit, not to mention the cleanup to get the sensitive data out of the way!</p> </div> <div class="paragraph"> <p>Using an <code>enum</code> wrapper is also ugly, and not really readable like English prose&#8201;&#8212;&#8201;so probably not a good idea in general.</p> </div> <div class="paragraph"> <p>The C++ standard gives a simple explanation:</p> </div> <div class="quoteblock"> <blockquote> A typedef-name is thus a synonym for another type. A typedef-name does not introduce a new type the way a class declaration (9.1) or enum declaration does. </blockquote> </div> <div class="paragraph"> <p>Or in other words, <code>typedef A B;</code> seems to be no different in this use case from <code>using B = A;</code>&#8201;&#8212;&#8201;if there&#8217;s a difference at all?! 
Fortunately right in that quotation we have a proposed solution: declare a new type with <code>struct/class/enum</code>.</p> </div> </div> <div class="sect2"> <h3 id="_c_strong_typing_with_code_enum_code">C++ strong typing with <code>enum</code></h3> <div class="paragraph"> <p>While I wouldn&#8217;t recommend using an <code>enum</code> for this scenario, it apparently has its strong typing benefits:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;iostream&gt;
#include &lt;string&gt;

enum class MaskedPanE : uint64_t {};
enum class FullPanE : uint64_t {};

auto maskPan(FullPanE full) -&gt; MaskedPanE {
    // Take first six and last four digits of full (optionally test that full
    // is at least 10 digits if not guaranteed by other code)
    const auto fullPan = std::to_string(static_cast&lt;uint64_t&gt;(full));
    const auto maskedPan = std::stoull(fullPan.substr(0, 6) + fullPan.substr(fullPan.size() - 4));
    return static_cast&lt;MaskedPanE&gt;(maskedPan);
}

int main() {
    FullPanE full = static_cast&lt;FullPanE&gt;(1234569988771234);

    // Now we have strong typing :) This gives
    // error: cannot convert 'FullPanE' to 'MaskedPanE' in initialization
    // MaskedPanE masked = full;

    MaskedPanE masked = maskPan(full);

    // Additional benefit: outputting only possible with explicit cast
    std::cout &lt;&lt; "Masked (enum): " &lt;&lt; static_cast&lt;uint64_t&gt;(masked) &lt;&lt; std::endl;
}</code></pre> </div> </div> </div> <div class="sect2"> <h3 id="_c_strong_typing_of_strings">C++ strong typing of strings</h3> <div class="paragraph"> <p>The <code>std::string</code> case actually is a no-brainer: strings are ubiquitous in business logic of most companies. 
They are used for</p> </div> <div class="ulist"> <ul> <li> <p>money amounts (different format, decimal and thousand separator, rounding, precision)</p> </li> <li> <p>file paths, filenames</p> </li> <li> <p>numbers</p> </li> <li> <p>binary data, but also text of varying encodings (C++ and Unicode is a different story altogether 😜)</p> </li> <li> <p>data types which are incompatible or should be semantically distinct (such as full and masked PANs in our scenario)</p> </li> <li> <p><em>maaaaaany</em> more use and abuse cases all around the globe</p> </li> </ul> </div> <div class="paragraph"> <p>We have to create a new <code>struct</code> or <code>class</code> to have disjoint string-based data types.</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;stdexcept&gt;
#include &lt;iostream&gt;
#include &lt;string&gt;

class StringBasedType {
    const std::string _s;
public:
    explicit StringBasedType(const std::string&amp; s): _s(s) {}
    auto str() const -&gt; const std::string&amp; { return _s; }
};

// Randomly using `struct` keyword here, could as well be `class X: public StringBasedType`
struct MaskedPan: StringBasedType {
    explicit MaskedPan(const std::string&amp; s): StringBasedType(s) {
        // Check input value: require 123456XXXX1234 format
        if (s.size() != 14 ||
            s.substr(0, 6).find_first_not_of("0123456789") != std::string::npos ||
            s.substr(6, 4) != "XXXX" ||
            s.substr(10).find_first_not_of("0123456789") != std::string::npos) {
            throw std::invalid_argument{"Invalid masked PAN"};
        }
    }
};

struct FullPan: StringBasedType {
    explicit FullPan(const std::string&amp; s): StringBasedType(s) {
        // Check input value based on assumptions
        if (s.size() &lt; 13 || s.find_first_not_of("0123456789") != std::string::npos)
            throw std::invalid_argument{"Invalid full PAN"};
    }

    auto getMasked() const -&gt; MaskedPan {
        const auto&amp; s = str();
        // Use assumptions about string size and content from `MaskedPan` constructor
        return MaskedPan{s.substr(0, 6) + "XXXX" + s.substr(s.size() - 4)};
    }
};

int main() {
    try {
        FullPan full = FullPan("1234569988771234");

        // This fails to compile because no such converting constructor exists
        // error: conversion from 'FullPan' to non-scalar type 'MaskedPan' requested
        // MaskedPan masked = full;

        MaskedPan masked = full.getMasked();

        // Outputting only possible with explicit `str()` - more visible in a code review!
        // If you're calling `str()` all the time, you're probably misusing strong typing.
        std::cout &lt;&lt; "Masked (string-based): " &lt;&lt; masked.str() &lt;&lt; std::endl;
        return 0;
    } catch (const std::exception&amp; e) {
        std::cerr &lt;&lt; "Exception: " &lt;&lt; e.what() &lt;&lt; std::endl;
    }
}</code></pre> </div> </div> <div class="paragraph"> <p>Summarizing this solution (one of many):</p> </div> <div class="ulist"> <ul> <li> <p>Everything is explicit:</p> <div class="ulist"> <ul> <li> <p>conversion to raw string value (<code>str()</code>) and thus the ability to output or compare the value</p> </li> <li> <p>conversion to disjoint type (must add constructor or method like <code>getMasked</code>)</p> </li> <li> <p>construction of specific type (use <code>explicit</code> constructors)</p> </li> </ul> </div> </li> <li> <p>Exactly one place for input validation/assertion</p> </li> <li> <p>Can be adapted to other base types as well, not only strings</p> </li> <li> <p>Operators must be defined manually. This <em>can</em> be an advantage, for instance, if the base type (here: string) can be compared for equality/order, but ordering does not make sense for the specific type (here: PAN). <a href=""><code>BOOST_STRONG_TYPEDEF(BaseType, SpecificType)</code></a> is an example implementation which defines operators for you.</p> </li> </ul> </div> <div class="paragraph"> <p>Strong typing wrappers are not&#8201;&#8212;&#8201;and will presumably never be&#8201;&#8212;&#8201;an inherent part of C++. 
Instead, the above solutions proved simple enough for the mentioned use cases. It&#8217;s up to developers to decide whether they write a few lines of wrapper code to be very explicit, or choose a library which does the same thing.</p> </div> </div> <div class="sect2"> <h3 id="_comparison_with_rust">Comparison with Rust</h3> <div class="paragraph"> <p>Rust has the same notion as C++'s aliasing <code>typedef</code>:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-rust" data-lang="rust">type Num = i32;</code></pre> </div> </div> <div class="paragraph"> <p>which has the same problems, so no need to repeat that topic.</p> </div> <div class="paragraph"> <p>Syntactically, Rust provides a very lightweight way of creating new types in order to achieve strong typing&#8201;&#8212;&#8201;<a href="">tuple structs</a>:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-rust" data-lang="rust">struct FullPan(String);
struct MaskedPan(String);

fn main() {
    let full = FullPan("1234569988771234".to_string());

    // Fails to build with
    // error[E0308]: mismatched types
    // expected struct `MaskedPan`, found struct `FullPan`
    // let masked: MaskedPan = full;

    let masked = MaskedPan("123456XXXX1234".to_string());
    println!("Masked (tuple struct): {}", masked.0);

    // Oops, no input validation: we can pass a full PAN value without getting an error
    let masked2 = MaskedPan("1234569988771234".to_string());
    println!("Masked2 (tuple struct): {}", masked2.0);
}</code></pre> </div> </div> <div class="paragraph"> <p>Well, that didn&#8217;t help much&hellip; Admittedly, tuple structs are more helpful for other use cases, such as <code>struct Point(f32, f32)</code> where it&#8217;s clear that X and Y coordinates are meant. A rule of thumb is: if you have to give the tuple fields a name to understand them, or you require input validation at construction time, don&#8217;t use a tuple struct. 
Remember that Rust uses an error model that is different from throwing exceptions, and in the above example there&#8217;s not even a constructor involved that could return an error (or panic) on invalid input.</p> </div> <div class="paragraph"> <p>Let&#8217;s replicate what we did in C++:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-rust" data-lang="rust">#[derive(Debug)]
struct FormatError { /* ... */ }

// Rust doesn't have object orientation i.e. we cannot "derive" from a base type
struct MaskedPan {
    value: String,
}

impl MaskedPan {
    pub fn new(value: &amp;str) -&gt; Result&lt;Self, FormatError&gt; {
        if value.len() != 14 ||
           value[..6].find(|c: char| !c.is_digit(10)).is_some() ||
           value[6..10] != *"XXXX" ||
           value[10..].find(|c: char| !c.is_digit(10)).is_some() {
            Err(FormatError {})
        } else {
            Ok(MaskedPan { value: value.to_string() })
        }
    }

    pub fn as_str(&amp;self) -&gt; &amp;str {
        &amp;self.value
    }
}

struct FullPan {
    value: String,
}

impl FullPan {
    pub fn new(value: &amp;str) -&gt; Result&lt;Self, FormatError&gt; {
        if value.len() &lt; 13 || value.find(|c: char| !c.is_digit(10)).is_some() {
            Err(FormatError {})
        } else {
            Ok(FullPan { value: value.to_string() })
        }
    }

    pub fn get_masked(&amp;self) -&gt; MaskedPan {
        // Since we already checked the `FullPan` value assumptions, we can call
        // `unwrap` here because, knowing the `MaskedPan` implementation, we can
        // be sure `new` will not fail.
        MaskedPan::new(&amp;format!("{}XXXX{}",
                                &amp;self.value[..6],
                                &amp;self.value[self.value.len() - 4..]))
            .unwrap()
    }
}

fn main() {
    match FullPan::new("1234569988771234") {
        Ok(full) =&gt; {
            let masked = full.get_masked();
            println!("Masked (string-based): {}", masked.as_str())
        }
        Err(_) =&gt; println!("Invalid full PAN"),
    }
}</code></pre> </div> </div> </div> <div class="sect2"> <h3 id="_should_i_use_strong_typing_everywhere">Should I use strong typing everywhere?</h3> <div class="paragraph"> <p>This question seems to be mostly language-independent, and a matter of taste to some extent. In my experience, there are ups and downs:</p> </div> <div class="paragraph"> <p>Yay:</p> </div> <div class="ulist"> <ul> <li> <p>Safety from mistakes, especially if they can lead to horrific problems like in the credit card scenario, where full PANs could be leaked to the outside or written to disk if types are confused.</p> </li> <li> <p>Code <em>using</em> the strong types may become more readable (as in: reading English prose) as things get spelled out explicitly</p> </li> <li> <p><a href="">User-defined literals</a> can make code even more concise, but that only applies to code which uses a lot of constants. To be honest, I&#8217;ve never had a project where those literals would be worthwhile.</p> </li> </ul> </div> <div class="paragraph"> <p>Nay:</p> </div> <div class="ulist"> <ul> <li> <p>Much extra typing and explicit definition of operators/actions</p> </li> <li> <p>Avoid using strings all over the place and you will have fewer problems from the start. For example, there&#8217;s <code>boost::filesystem::path</code>.</p> </li> <li> <p>No real benefit for structures which probably never change and have well-named fields. To prevent mistakes in the order of constructor arguments, use POD structs and <a href="">C++ designated initialization (syntax extension)</a>. Rust also has such a syntax, and additionally gives build errors if you forgot to initialize a field. 
The builder pattern is a similar alternative (however not really beautiful). Stupid example:</p> </li> </ul> </div> <div class="listingblock dog-blog-code-indented"> <div class="content"> <pre class="highlight"><code class="language-cpp" data-lang="cpp">// C++
struct CarAttribs {
    float maxSpeedKmh; // kilometers per hour
    float powerHp; // horsepower
};

class Car {
public:
    explicit Car(const CarAttribs&amp; a) { /* ... */ }
};

int main() {
    auto car = Car{{.maxSpeedKmh = 220, .powerHp = 180}};

    // Unfortunately that syntax doesn't prevent unspecified fields (no compiler warning)
    auto car2 = Car{{.maxSpeedKmh = 220}};
}</code></pre> </div> </div> <div class="listingblock dog-blog-code-indented"> <div class="content"> <pre class="highlight"><code class="language-rust" data-lang="rust">// Rust
struct CarAttribs {
    max_speed_kmh: f32, // kilometers per hour
    power_hp: f32, // horsepower
}

struct Car { /* ... */ }

impl Car {
    fn new(attribs: &amp;CarAttribs) -&gt; Self {
        Car{ /* ... */ }
    }
}

fn main() {
    let car = Car::new(&amp;CarAttribs {
        max_speed_kmh: 220.0,
        power_hp: 180.0,
    });

    // This fails to build with
    // error[E0063]: missing field `power_hp` in initializer of `CarAttribs`
    // let car2 = Car::new(&amp;CarAttribs { max_speed_kmh: 220.0 });
}</code></pre> </div> </div> <div class="paragraph"> <p>In the end, you must decide per case. 
Often, the declaration of functions or types itself invites human error, so before switching to strong typing, you should first consider whether the order of parameters, names of fields, choice of constructor(s), et cetera are sane, consistent in their meaning (a money amount shouldn&#8217;t be <code>123</code> cents in one place, but the decimal number string <code>1.23</code> elsewhere) and follow the principle of least surprise and smallest risk of mistakes.</p> </div> <div class="paragraph"> <p>There&#8217;s also no clear winner between the languages&#8201;&#8212;&#8201;since strong typing is not a built-in feature in either language, you must roll your own or use a library, and that isn&#8217;t exactly elegant, but still readable.</p> </div> </div> </div> </div> Sun, 28 Oct 2016 00:00:00 -0000 Today I learned — episode 2 (hacking on Rust language) <div id="preamble"> <div class="sectionbody"> <div class="paragraph dog-blog-hidden-in-overview"> <p>This blog series (in short: <em>TIL</em>) is supposed to cover topics in software development, learnings from working in software companies, tooling, but also private matters (family, baby, hobbies).</p> </div> </div> </div> <div class="sect1"> <h2 id="_hacking_on_the_rust_language_error_messages">Hacking on the Rust language&#8201;&#8212;&#8201;error messages</h2> <div class="sectionbody"> <div class="paragraph"> <p>Right now I&#8217;m totally digging <a href="">Rust</a>, a modern systems programming language which covers for instance thread-safety and memory access checks at compile time to guarantee code safety&#8201;&#8212;&#8201;while still being close to hardware like C/C++&#8201;&#8212;&#8201;and has many more benefits such as a well-designed standard library, a fast-paced community and release cycle, etc.</p> </div> <div class="paragraph"> <p>Since I&#8217;m professionally working in C++, I am currently drafting a presentation that compares C++ with Rust with the goal of finding out where C++ could improve. 
The plan is to first present the slides at the <a href="">Munich C++ meetup</a> when completed.</p> </div> <div class="paragraph"> <p>One topic where C++ lags behind is macros&#8201;&#8212;&#8201;in Rust (<a href="">macro documentation</a>), one can match language elements, instead of doing direct text preprocessing (pre = before the compiler even parses the code).</p> </div> <div class="listingblock dog-blog-breakpoint"> <div class="content"> <pre class="highlight"><code class="language-rust" data-lang="rust">// Simple macro to autogenerate an enum-to-int function
macro_rules! enum_mapper {
    // `ident` means identifier, `expr` is an expression. `,*` is comma-separated repetition (a trailing comma is not accepted)
    ( $enum_name:ident, $( ($enum_variant:ident, $int_value:expr) ),* ) =&gt; {
        impl $enum_name {
            fn to_int(&amp;self) -&gt; i32 {
                match *self {
                    // I can put comments in a macro, and don't need to have backslashes everywhere!
                    $( $enum_name::$enum_variant =&gt; $int_value ),* // repetition consumes matches in lockstep
                }
            }
        }
    };
}

// Totally stupid enum
#[derive(Debug)]
enum State {
    Succeeded,
    Failed(String), // error message
    Timeout,
}

enum_mapper!(
    State,
    (Succeeded, 1),
    (Failed, 2),
    (Timeout, 3)
);

fn main() {
    let st = State::Failed("myerror".to_string());
    println!("{:?} maps to int {}", st, st.to_int());
}</code></pre> </div> </div> <div class="paragraph"> <p>(<a href="">play with this code</a>)</p> </div> <div class="paragraph"> <p>That code snippet produces the error <code>error[E0532]: expected unit struct/variant or constant, found tuple variant <code>State::Failed</code></code> with the nightly compiler. To me, reading such a verbose error was like learning C++ in my childhood&#8201;&#8212;&#8201;I just had no idea of the terminology used with the language, so "unit struct/variant" and "tuple variant" were totally unclear to me and not immediately intuitive. 
The displayed error location also wasn&#8217;t helpful: it showed me neither the expanded macro nor the failing code line. In this sense, the error messages are on par with the C++ preprocessor (just as bad 😂). Normally, Rust provides error explanations with examples, displayed by <code>rustc --explain E0532</code>. But in this case: <code>error: no extended information for E0532</code>.</p> </div> <div class="paragraph"> <p>So I found out myself&#8201;&#8212;&#8201;after removing the variant parameter (<code>String</code>) from <code>State::Failed(String)</code> (so the enum only has simple variants), my macro was working fine, and after some thinking it was clear that I had previously commented out the consideration of variant parameters (that&#8217;s what I call them at the moment). Here&#8217;s how I could match <code>State::Failed(String)</code>:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code class="language-rust" data-lang="rust">$enum_name::$enum_variant(..) =&gt; $int_value</code></pre> </div> </div> <div class="paragraph"> <p>Note that this is not a solution because now it won&#8217;t match <code>State::Succeeded</code> and <code>State::Timeout</code> anymore (maybe it <a href="">used to work</a> earlier), but this article is more about getting to understand the problem from the error message.</p> </div> <div class="paragraph"> <p>Having found the problem, I still didn&#8217;t feel happy because that debug session cost me time and might happen again to me, and surely to others as well. Hence, let&#8217;s hack Rust!</p> </div> <div class="paragraph"> <p>Getting started with <a href="">hacking Rust</a> is elegantly simple: clone, <code>./configure</code>, <code>make</code>. That builds the whole compiler and rustdoc toolchain but not the cargo build tool, which is already plenty of hard disk space and download/build time spent. 
On my slow connection, the <code>configure</code> script was still cloning LLVM after 15 minutes 💨&#8230;&#8203;</p> </div> <div class="paragraph"> <p><code>make tips</code> hints at different targets to limit what is built, for example if you&#8217;re only working on the compiler (&#8594; <code>make rustc-stage1</code>, <code>make &lt;target-triple&gt;/stage1/bin/rustc</code>).</p> </div> <div class="paragraph"> <p>In the meantime I searched the code for existing error messages (I simply grepped for the message of E0003), and immediately found the source file <code>src/librustc_const_eval/</code>. I found it strange that the list of diagnostic error messages was so short, so I did <code>ag -l '^\s*E[0-9]{4}:'</code> and discovered that the error messages belong to the respective crate. In the case of the example error I grepped (E0003), it&#8217;s the crate for "constant evaluation on the HIR and code to validate patterns/matches" (<em>HIR</em> stands for <em>High-level Intermediate Representation</em>). My desired error explanation should therefore go into the right crate, which turned out to be <code>librustc_resolve</code>.</p> </div> <div class="paragraph"> <p>Finally the compiler build completed, but to my surprise, <code>x86_64-apple-darwin/stage1/bin/rustc --explain E0003</code> could not find an explanation. That was peculiar, as stage 1 should already give me a working compiler (according to the writeup in <code>make tips</code> and the great summary <a href="">Contributing to the Rust compiler</a>). The solution to the riddle was easy: E0003 had vanished with the following commit:</p> </div> <div class="listingblock"> <div class="content"> <pre class="highlight"><code>commit 76fb7d90ecde3659021341779fea598a6daab013 Author: Ariel Ben-Yehuda &lt;; Date: Mon Oct 3 21:39:21 2016 +0300 remove StaticInliner and NaN checking NaN checking was a lint for a deprecated feature. 
It can go away.</code></pre> </div> </div> <div class="paragraph"> <p>With another error code, the explanation displays just fine, e.g. <code>x86_64-apple-darwin/stage1/bin/rustc --explain E0004</code>. The only missing piece was to add my desired explanation and example for E0532, and test it in the same way. This part is too detailed for a blog post, but I ultimately ended up with a <a href="">pull request</a> (still pending at the time of writing).</p> </div> <div class="paragraph"> <p>Now I&#8217;m happy to have started my first contribution. There will surely be more blog posts following about my experiences with Rust!</p> </div> <div class="paragraph"> <p>P.S. Only later did I find that the stable rustc 1.12.1 would&#8217;ve given a slightly better error for the initial problem (<code><code>State::Failed</code> does not name a unit variant, unit struct or a constant</code>). Remember you can play around with Rust versions <a href="">online</a> or with <a href="">rustup</a>.</p> </div> </div> </div> Sun, 22 Oct 2016 00:00:00 -0000 Today I learned — episode 1 (introduction to blog series, Ansible) <div id="preamble"> <div class="sectionbody"> <div class="paragraph"> <p>This new blog series (in short: <em>TIL</em>) is supposed to cover topics in software development, learnings from working in software companies, tooling, but also private matters (family, baby, hobbies). No idea where I&#8217;m heading with it, though 😉</p> </div> <div class="paragraph"> <p>I would like to start with the reason for creating <em>TIL</em>.</p> </div> </div> </div> <div class="sect1"> <h2 id="_nginx_replaces_apache_introducing_ansible">nginx replaces Apache, introducing Ansible</h2> <div class="sectionbody"> <div class="paragraph"> <p>The setup on my server got too complicated, especially with Apache configs that I&#8217;ve been maintaining since Apache 2.0.x was installed. 
From experience at my company, I know that nginx is much easier and more concise in its configuration.</p> </div> <div class="paragraph dog-blog-breakpoint"> <p>Around 2011, I created <a href="">Site Deploy</a>, a UI for deploying web sites to web servers via SSH. The set of supported web site types was limited (the ones I used, e.g. Django, Play!, static files), as were the supported servers (Apache, nginx, lighttpd).</p> </div> <div class="paragraph"> <p>Only in 2015, while working on an automated test environment setup for my company, did I learn about <a href="">Ansible</a> and realize that my software was basically the same invention on a smaller scale, just earlier and created for private purposes. <em>Site Deploy</em> has similar concepts such as hosts, variable resolution and SSH connections, but never became as elegant as Ansible.</p> </div> <div class="paragraph"> <p>My previous setup was manual configuration of Apache virtual hosts, plus configuration files automatically created by <em>Site Deploy</em> per site. This was all replaced by an Ansible-based setup, which makes it easier to</p> </div> <div class="ulist"> <ul> <li> <p>Run tasks and updates from one place</p> </li> <li> <p>Create a test environment. I&#8217;m using a VM with the same OS as my real server, plus a suffix <code>.testdomain</code> so that I only have to add e.g. <code>&lt;IP of test VM&gt;</code> to <code>/etc/hosts</code>.</p> </li> <li> <p>Test repeatability (and idempotency) of the tasks. I tried <a href="">check mode</a>, which only performs a dry run, but it&#8217;s much harder to maintain playbooks that are compatible with it. Having a test machine (or VM) is a much better idea. 
Seeing things fail (even if on a test system) makes you a better developer / devop / sysadmin.</p> </li> <li> <p>Get the same setup every time, therefore having a full configuration backup in one place, without the need to back up anything from the server (except for live data like databases).</p> </li> <li> <p>Reuse, reuse, reuse. <a href="">Ansible roles</a> are one way to apply the same changes multiple times, e.g. creating a web site configuration. Another way is to use nginx/Apache&#8217;s include operation to add common configuration directives to other configs.</p> </li> <li> <p>Readability and ability to share playbooks/roles with others.</p> </li> </ul> </div> <div class="paragraph"> <p>Ansible is definitely the way to go for me. It&#8217;s available as a package on *nix systems (not on Windows) and in principle only needs Python installed on the to-be-configured server. It works by syncing modules to the server and running them there, with the inputs you provide. Read their <a href="">introduction article</a> to get a grasp of the concepts - the learning curve is gentle and you will quickly get good results.</p> </div> <div class="paragraph"> <p>Besides my server, I also maintain my home server - a <a href="">Raspberry Pi</a> with a hard drive, exposed through dynamic DNS on the Internet so that I can, for example, access my music from work. At least that&#8217;s the theory. In reality, my neighborhood only gets DSL 6000 speed with a horrible upload rate, which makes my remote MP3 listening experience very hiccupy&hellip; Back to topic: I&#8217;ve even used Ansible to set up/restore my work laptop before.</p> </div> <div class="paragraph"> <p><strong>Conclusion:</strong> <a href="">Ansible</a> is the simplest, most readable and, in my opinion, architecturally best (e.g. no server-side component, only Python required) way to set up your private server or other machine. Learning it should be really quick, while mastering takes the usual time. 
Even though the tool is one of the industry standards alongside its rivals Chef, Puppet, etc., it&#8217;s hard to find (consistent) best practices. The sharing portal <a href="">Ansible Galaxy</a> has playbooks/roles of mixed quality, which I haven&#8217;t checked out very much, and software maintainers typically don&#8217;t provide direct Ansible support in their upstream project (for instance, wouldn&#8217;t that be nice for <em>nginx</em>?!). However, there are books on it already (I read a bit of an <a href="">Ansible: Up and Running</a> excerpt a while ago, and it looks promising), and meetup groups around the world (for me: <a href="">Munich group</a>). The company behind the open source project also seems to be quite good at communicating and documenting everything, while keeping a good balance between making money and maintaining the open source part. Their developers and evangelists whom I&#8217;ve met at the meetups ranged from very competent to brilliant, so the future looks bright for this project.</p> </div> </div> </div>