A Single Repo for Statically Generated Sites on Github Pages

There’s a lot of talk about how easy it is to set up a statically generated blog (such as this one!) using a static site generator and hosting it on Github Pages.

In this post, we’ll take a look at a simple git workflow that can be used to manage such a site.

Github Pages usually publishes content from the master branch of a git repo, and while some tips found online suggest keeping the generated content and the source material in separate repos, it’s easier to keep both in a single project that can be managed as one.

For the rest of this post, we’ll use the examplepage repo as an example, and assume we’re using the Hugo site generator.

Basic Setup

We’ll first set up a new repo on Github to host the site. In this example, we’re using https://github.com/ruiwen/examplepage.github.io.

We’ll also need to ensure that we have Hugo installed.

With that done, we can begin to set up our project directory. hugo provides a quick way to get started:

$ hugo new site -f yaml examplepage.github.io
Congratulations! Your new Hugo site is created in /home/ruiwen/projects/examplepage.github.io.

Just a few more steps and you're ready to go:

1. Download a theme into the same-named folder.
   Choose a theme from https://themes.gohugo.io/, or
   create your own with the "hugo new theme <THEMENAME>" command.
2. Perhaps you want to add some content. You can add single files
   with "hugo new <SECTIONNAME>/<FILENAME>.<FORMAT>".
3. Start the built-in live server via "hugo server".

Visit https://gohugo.io/ for quickstart guide and full documentation.

$ ls examplepage.github.io/
archetypes  config.yaml  content  data  layouts  static  themes

We use the -f yaml option to specify that we’d like YAML as the configuration format; hugo defaults to TOML.

To get started, you’ll need to configure hugo, setting values such as the baseURL and title of the site, though a full configuration walkthrough is beyond the scope of this blog post.
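That said, a minimal config.yaml for our example might look something like this (the values here are illustrative, not prescriptive):

```yaml
# Minimal illustrative config.yaml for the example site
baseURL: "https://examplepage.github.io/"
languageCode: "en-us"
title: "Example Page"
```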

For now, let’s also create a simple blog post.

$ hugo new post/a-sample-post.md
/home/ruiwen/projects/examplepage.github.io/content/post/a-sample-post.md created

You’ll notice that new content is rooted at the content/ directory; all we need to do is specify the path under it, e.g. post/a-sample-post.md.

We’ll also add some content to the new post.

$ echo "Here's a sample post! Welcome to our sample page!" >> content/post/a-sample-post.md
$ cat content/post/a-sample-post.md
---
title: "A Sample Post"
date: 2018-05-29T17:57:14+08:00
draft: true
---

Here's a sample post! Welcome to our sample page!

Before we publish the post, we’ll need to configure a theme for the site. Check out themes.gohugo.io for themes. In our example, we’re using hyde-hyde.

$ git clone https://github.com/htr3n/hyde-hyde.git themes/hyde-hyde
Cloning into 'themes/hyde-hyde'...
remote: Counting objects: 597, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 597 (delta 0), reused 2 (delta 0), pack-reused 592
Receiving objects: 100% (597/597), 1.89 MiB | 809.00 KiB/s, done.
Resolving deltas: 100% (310/310), done.
Checking connectivity... done.

Next, edit config.yaml at the root of the project, and include theme: hyde-hyde at the bottom.

At this point, we’re almost ready to publish the blog! We just need to remove the draft: true line in the post, and we’re good to go.
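If you’d rather script that step than open an editor, a quick sed will drop the line. Here’s a sketch, demonstrated on a throwaway copy so nothing in the repo is touched; in the real project you’d point it at content/post/a-sample-post.md instead:

```shell
# Demonstrate on a temporary copy of the post
cp_dir=$(mktemp -d)
printf 'title: "A Sample Post"\ndate: 2018-05-29T17:57:14+08:00\ndraft: true\n' > "$cp_dir/a-sample-post.md"

# Delete the 'draft: true' line in place
sed -i '/^draft: true$/d' "$cp_dir/a-sample-post.md"

cat "$cp_dir/a-sample-post.md"
```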

To see what your site would look like, run the dev server, and navigate to http://localhost:1313

$ hugo server

Committing to git

Next up, we’ll want to set up our git repo.

$ git init
Initialized empty Git repository in /home/ruiwen/projects/examplepage.github.io/.git/
$ git remote add github https://github.com/ruiwen/examplepage.github.io

First, we want to create an initial empty commit. This will be our repo’s starting point, and allow us to manage two different aspects of the site in the same repo.

$ git commit --allow-empty -m "Initial empty commit"
[master (root-commit) d9c0248] Initial empty commit

Next, we’ll create a new branch that is not master. For this example, we’ll call it source.

$ git checkout -b source
Switched to a new branch 'source'

Then we add all the files

$ git add .
$ git commit -m "New post: A sample post"
[source 2f14616] New post: A sample post
4 files changed, 21 insertions(+)
create mode 100644 archetypes/default.md
create mode 100644 config.yaml
create mode 100644 content/post/a-sample-post.md
create mode 160000 themes/hyde-hyde
$ git log --oneline --graph --decorate --all
* 2f14616 New post: A sample post  (HEAD -> source) [Ruiwen Chua 1 second ago]
* d9c0248 Initial empty commit  (master) [Ruiwen Chua 60 seconds ago]

We see that the source files have been committed to the source branch.

Now, because Github Pages deploys content from the master branch, we want a) the fully rendered static files to reside in the master branch, and b) none of the source files to reside in the master branch. Hugo renders to public/ by default, so we’ll want to make sure that our public/ directory represents our master branch.

Setting public/ as our master branch

We’ll first make public/ the home of our master branch. git has a wonderful tool that allows us to check out a branch into its own directory that resides right alongside the rest of the repo. We’ll use git worktree to make this happen.

$ git worktree add public master
Preparing public (identifier public)
HEAD is now at d9c0248 Initial empty commit

The interesting thing about git worktree is that it allows us to check out a completely different branch of the repo and have it checked out alongside any other branch we happen to be working on.

$ cd public
$ ls
$ git branch
* master

In the public/ directory, we don’t see any files in the listing, because the master branch is currently pointing at our original empty commit. However, we do see that git recognises that we are, in fact, in the master branch, and not the source branch, where we were previously.
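If you’d like to convince yourself of this behaviour before trying it on a real project, the whole dance can be replayed in a throwaway repo. A sketch (the names here are arbitrary):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git symbolic-ref HEAD refs/heads/master   # ensure the first commit lands on 'master'
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial empty commit"
git checkout -q -b source                  # sources will live on this branch
git worktree add public master             # master checked out into public/
git -C public rev-parse --abbrev-ref HEAD  # the worktree is on master
git rev-parse --abbrev-ref HEAD            # while we stay on source
```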

Now that public/ represents our master branch, we’re going to get hugo to render our site into it.

$ cd ..
$ hugo
                   | EN
  Pages            |  7
  Paginator pages  |  0
  Non-page files   |  0
  Static files     |  8
  Processed images |  0
  Aliases          |  0
  Sitemaps         |  1
  Cleaned          |  0

Total in 239 ms
$ ls public/
404.html  apple-touch-icon-144-precomposed.png  categories  css  favicon.png  img  index.html  index.xml  sitemap.xml  tags
$ cd public
$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)


nothing added to commit but untracked files present (use "git add" to track)

We’ll just commit the files in public/, and they should be added to our master branch

$ git add .
$ git commit -m "Publish: 20180529"
[master ff17216] Publish: 20180529
 16 files changed, 1288 insertions(+)
 create mode 100644 404.html
 create mode 100644 apple-touch-icon-144-precomposed.png
 create mode 100644 categories/index.html
 create mode 100644 categories/index.xml
 create mode 100644 css/custom.css
 create mode 100644 css/hyde.css
 create mode 100644 css/poole.css
 create mode 100644 css/print.css
 create mode 100644 css/syntax.css
 create mode 100644 favicon.png
 create mode 100644 img/hugo.png
 create mode 100644 index.html
 create mode 100644 index.xml
 create mode 100644 sitemap.xml
 create mode 100644 tags/index.html
 create mode 100644 tags/index.xml
$ git log --oneline --graph --decorate --all
* ff17216 Publish: 20180529  (HEAD -> master) [Ruiwen Chua 2 seconds ago]
| * 2f14616 New post: A sample post  (source) [Ruiwen Chua 54 seconds ago]
* d9c0248 Initial empty commit  [Ruiwen Chua 2 minutes ago]

Now we see that the rendered site under public/ has been committed to the master branch, while the source material still remains on the source branch. Both branches stem from the initial empty commit we made, and both can be checked out simultaneously in the same working environment, thanks to git worktree. This suits hugo’s publishing workflow well, allowing us to render straight into our master branch for committing. From there, a git push github master (using the remote we added earlier) publishes the rendered site to Github Pages, while git push github source backs up the source material.

Monitoring of a Kubernetes Cluster from the Outside with Prometheus

When setting up a Kubernetes cluster, Prometheus and Grafana are a great way to start monitoring your cluster’s health: CPU and RAM stats, filesystem usage, even the number and type of requests that your cluster is serving. One catch, however, is that most setups assume you’ll be deploying Prometheus within the cluster itself. To be sure, this is probably the easier way of setting things up. Within the cluster, your Prometheus setup will have no trouble finding and monitoring the configured pods, endpoints and nodes.

The Problem with an out-cluster Prometheus setup

That said, it’s sometimes not entirely feasible to deploy Prometheus within a particular cluster — for example, if you want Prometheus to monitor multiple Kubernetes clusters across multiple service providers, like AWS and Azure.

In this scenario, I’ve found it more convenient to host Prometheus separately, outside the Kubernetes clusters, then set it up to monitor the clusters and their services and pods.

However, I ran into yet another problem: properly exposing the services so they can be scraped by Prometheus — particularly in the case of Deployments that span more than one Pod. While I could include sidecar containers in each Pod exposing /metrics on a given port, I struggled to find a way to properly expose them to an out-cluster Prometheus. I could expose an Ingress and have Prometheus access the endpoint through the cluster’s main IP, but since Kubernetes Services deliver requests to their backing Pods in a round-robin fashion, each successive scrape could end up hitting a different Pod. This would lead to confusing metrics, to say the least. Ideally, we want Prometheus to be able to distinguish between the metrics scraped from each individual Pod (with a different label for each); that way, we could tell if, say, one Pod ended up serving more traffic than the others in the Deployment.

One option to address each Pod individually would perhaps be to expose one Ingress per Pod. Of course, this would have to be automated in some form, perhaps having yet another Service watch new pods being spun up and creating the necessary exporters and Ingresses automatically, but this approach quickly gets very unwieldy, without even considering what happens once we start scaling in any form.

So now we have two problems. We want to

  • have Prometheus be able to access the metrics endpoints for all necessary pods, without the need to have to define them beforehand,
  • and do so from outside the cluster, without having to manually (or otherwise) open up Ingresses

Abusing the API Server

As it turns out, the APIServer does in fact allow us to communicate directly with Pods, without creating an Ingress or a Service beforehand. All that’s needed are the proper credentials and the authorisation to make HTTP requests through the APIServer.

It’s not terribly obvious from the outset, but even the very familiar kubectl does in fact communicate with the APIServer (and manages the cluster) via simple HTTP calls. (Run any kubectl command with the -v 10 option to see the HTTP calls that are being made in the background — eg. kubectl -v 10 version)

Access Control

In order to communicate with the APIServer through its API though, we’ll first need to set up some form of access control.

If you’re on Kubernetes before 1.6, you’ll have to use Attribute-based Access Control (ABAC); on Kubernetes 1.6 or later, you’ll be able to use the more convenient Role-based Access Control (RBAC). A discussion of how to effectively use ABAC vs RBAC is beyond the scope of this post, but essentially you’ll want to end up with an access token (e.g. a ServiceAccount’s Secret token value) that allows you to make authenticated and authorised requests to the APIServer.

If you’d just like to try it out, you could run kubectl -v 10 version, watch the HTTP calls, and simply use the values kubectl is sending in the Authorization HTTP header. For production setups however, I’d recommend setting up a proper ServiceAccount with appropriately scoped permissions.

Accessing Pods through the APIServer

It’s not commonly mentioned in the general Kubernetes documentation, but the APIServer does allow you to make requests directly to the pods within the cluster.

It's hardly clear what you're supposed to do with this, and the lack of documented examples doesn't help either. (The relevant operations appear in the API reference under names like "Get Connect Proxy Path".)

However, with the handy documentation on this page, we can make HTTP calls directly to each Pod through the Kubernetes APIServer, without needing to create a specific Ingress for each Pod we’d like Prometheus to scrape. This means we can expose metrics pages only on cluster-local IP addresses, without worrying about those pages leaking out to the public Internet.

From the Kubernetes API documentation, we can refer to the sections on Proxy operations for the various Kubernetes objects. For example, the Pod proxy operations show us how to reach out to a specific Pod through the Kubernetes API.

Assume we have a Prometheus exporter pod, prom-exporter, in the namespace monitoring, exposing metrics at the path /metrics, that we’d like to scrape.

The general pattern of the request looks like

GET /api/v1/proxy/namespaces/{namespace}/pods/{name}/{path}

We can make a request to the Pod via the call below

GET /api/v1/proxy/namespaces/monitoring/pods/prom-exporter/metrics
# As a curl command, it should look something like
# $ curl -H "Authorization: Bearer $KUBERNETES_TOKEN" "https://<api_server:port>/api/v1/proxy/namespaces/monitoring/pods/prom-exporter/metrics"

which should give us our exported metrics.
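As a sketch of what that request looks like in code, here’s a small Python example using only the standard library. The APIServer host and token below are placeholders carried over from the examples in this post, not real values:

```python
from urllib.request import Request, urlopen  # urlopen unused in the demo, shown for completeness

API_SERVER = "https://host.kubernetes.apiserver.com"  # placeholder APIServer address
TOKEN = "$KUBERNETES_TOKEN"                           # your ServiceAccount token

def pod_proxy_request(namespace: str, pod: str, path: str = "metrics") -> Request:
    # GET /api/v1/proxy/namespaces/{namespace}/pods/{name}/{path}
    url = f"{API_SERVER}/api/v1/proxy/namespaces/{namespace}/pods/{pod}/{path}"
    return Request(url, headers={"Authorization": f"Bearer {TOKEN}"})

req = pod_proxy_request("monitoring", "prom-exporter")
print(req.full_url)
# urlopen(req) would perform the actual scrape; it also needs the APIServer's
# CA certificate, since the cert is usually self-signed (see below).
```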


Naturally, when you’re setting up Prometheus to scrape through the proxy API in this manner, you’ll want to connect over HTTPS to ensure that your metrics aren’t leaked to third parties on the wire. However, since Kubernetes APIServer SSL certs are usually self-signed, you’ll also want to include your APIServer’s CA certificate in your Prometheus configuration so that Prometheus can authenticate the server.

In your prometheus.yml,

    - job_name: 'natsd'
      scheme: https
      bearer_token: "$KUBERNETES_TOKEN"
      tls_config:
        ca_file: /etc/prometheus/tls/certs/kubernetes.crt

      kubernetes_sd_configs:
      - api_server: 'host.kubernetes.apiserver.com'
        bearer_token: "$KUBERNETES_TOKEN"
        tls_config:
          ca_file: /etc/prometheus/tls/certs/kubernetes.crt
        role: pod

Extracting the CA certificate from the APIServer is a matter of running (with <api_server:port> standing in for your APIServer’s address, as before):

$ openssl s_client -connect <api_server:port> < /dev/null | openssl x509 -text
<... truncated ...>
    Signature Algorithm: sha256WithRSAEncryption

The important bit is everything between the BEGIN CERTIFICATE and END CERTIFICATE lines, inclusive. Save that to a file (e.g. the /etc/prometheus/tls/certs/kubernetes.crt referenced in the config above), and point the ca_file entries in your prometheus.yml at it.
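Rather than picking the block out by hand, sed can slice it out of the openssl output. Here’s a sketch run against dummy input so it’s self-contained; in practice you’d pipe the openssl command above into the same sed expression:

```shell
# sed -n with an address range prints only the lines between (and including)
# the BEGIN and END markers
printf 'garbage before\n-----BEGIN CERTIFICATE-----\nMIIB...base64...\n-----END CERTIFICATE-----\ngarbage after\n' \
  | sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
```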

Putting it together

Eventually, our configuration ends up looking something like this

    - job_name: 'natsd'
      scheme: https
      bearer_token: "$KUBERNETES_TOKEN"
      tls_config:
        ca_file: /etc/prometheus/tls/certs/kubernetes.crt

      kubernetes_sd_configs:
      - api_server: 'host.kubernetes.apiserver.com'
        bearer_token: "$KUBERNETES_TOKEN"
        tls_config:
          ca_file: /etc/prometheus/tls/certs/kubernetes.crt
        role: pod

      # Tells Prometheus to query the APIServer for all pods matching the target label (natsd-*) below
      # and, for each pod, generate a scrape target at the /metrics path via the proxy API
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name, __meta_kubernetes_pod_container_port_name]
        action: keep
        regex: default;natsd-.*;metrics  # Remember to use the right `container_port_name` as specified in the Deployment
      - target_label: __address__
        replacement: 'host.kubernetes.apiserver.com'  # API server address
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name, __meta_kubernetes_pod_container_port_number]
        regex: (.+);(.+);(.+)
        target_label: __metrics_path__
        replacement: /api/v1/namespaces/${1}/pods/http:${2}:${3}/proxy/metrics  # Path after /proxy/ is the pod's metrics path
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(service|tier|type)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod_name
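If the __metrics_path__ rule looks like magic, here is the same substitution replayed with Python's re module. The pod name and port are made up for illustration; Prometheus actually uses RE2 with ${1}-style references, but the capture-group behaviour is the same:

```python
import re

# namespace;pod;container-port, joined the way Prometheus joins source_labels
src = "default;natsd-0;8080"

# Prometheus writes ${1}, ${2}, ${3}; Python spells the same back-references \1, \2, \3
path = re.sub(r"^(.+);(.+);(.+)$",
              r"/api/v1/namespaces/\1/pods/http:\2:\3/proxy/metrics",
              src)
print(path)  # /api/v1/namespaces/default/pods/http:natsd-0:8080/proxy/metrics
```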

With this, we can scrape metrics from each individual natsd-* pod within the Kubernetes cluster, without needing to set up an Ingress or a Service for the sole purpose of allowing an off-cluster Prometheus setup to access said metrics.

Kubernetes for small Organisations

Kubernetes for small Organisations: Is it worth it?

I recently gave a talk at GeekcampSG 2016 about our ongoing experience with Kubernetes at Lomotif.

It wasn’t a technical deep-dive, so I didn’t talk much about the implementation itself, but more about the experience in general and the pitfalls we came across. If you’re planning a move to Kubernetes, this talk should give you a heads-up on a couple of gotchas you’ll want to pay attention to before jumping in.

Also great thanks and a shout-out to the wonderful folk from Engineers.SG who recorded and uploaded the talks from GeekcampSG! =)

Ubuntu 16.04 on Raspberry Pi 3

I recently wanted to unify my dev environments, including my (very handy!) Raspberry Pi 3.

I’d started out with Raspbian Jessie, but felt the new Ubuntu 16.04 was really more my speed. Thus began the journey to upgrade Pi OSes.


First up, finding the right image. The good thing about the Pi community is that there’s so much activity, and so many community images. The bad thing is, well, that there are so many community images that it’s sometimes hard to tell which is going to work.

Let me save you the trouble — this one does.

It’s a link to a .torrent file from Ubuntu Pi Flavour Maker. You’ll end up with a .img.xz file which you’ll have to unpack (depending on your OS) to get the original .img image.

After which, it’s a quick write to the prepared SD Card (I suggest formatting the card via your OS’s disk management tools, so you’ll want to back anything important up) with something similar to

sudo dd bs=1m if=path_of_your_image.img of=/dev/diskn

(The official raspberrypi.org has handy documentation on how you can write a Raspberry Pi .img to your SD card.)

Booting Up

Once you’re done writing the image to your SD card, pop it into your Pi, connect the peripherals, plug in the power, and… wait.

At this stage, unless you have a configured ethernet connection all set up for your Pi and are confident of SSH’ing in, you’ll probably want a monitor and keyboard attached as well so you can work directly on the console after it boots.

If you don’t have a ready ethernet connection, keep in mind that there’s a script that waits for an address to be assigned to eth0 via DHCP, and times out only after 5 mins. If your Pi isn’t responsive after plugging the power in, grab a coffee or something.

After a while, your Pi should be running, and if you’ve got a monitor connected to it, you should have a login prompt ready to go.

Just in case — the default username and password for RPi Ubuntu images is usually: ubuntu / ubuntu

Connecting to Wifi

The next thing you’ll want to do is probably to connect your Pi to your wifi network. If you don’t, and already have an ethernet connection good to go, then feel free to skip the rest of this section.

Wireless connections on the RPi are handled by a package called wpasupplicant. It’s not installed by default, so you’ll need to grab it with a familiar apt-get install. This is also why you need an ethernet connection to at least get started.

Pro-tip: If your router/switch is too far away from where you have the rest of your RPi set up, bridging your desktop/laptop’s wifi connection over an adapter to an ethernet connection is something I found really handy.

You’ll also notice, if you enter ifconfig at the prompt, that the wireless interface wlan0 (by default) has not automatically come up on boot, so we’ll need to configure it to do so.

Find and edit the file /etc/network/interfaces

$ sudo vim /etc/network/interfaces

Then add a block to configure the wlan0 interface with the requisite wifi credentials

auto wlan0
allow-hotplug wlan0
iface wlan0 inet dhcp
    wpa-ssid "YourWifiSSIDHere"
    wpa-psk "y0uRs3cretw1f1p@55w0rdHere"

Save the file, and you should be able to restart the networking service with sudo service networking restart and have the wlan0 interface come up.

wlan0     Link encap:Ethernet  HWaddr b8:27:eb:49:2d:dc
          inet addr:  Bcast:  Mask:
          inet6 addr: fe80::bf27:ebdf:fd41:369c/64 Scope:Link
          RX packets:165550 errors:0 dropped:4 overruns:0 frame:0
          TX packets:125635 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:131148832 (131.1 MB)  TX bytes:26205017 (26.2 MB)

Ensure that the interface is properly connected by looking out for the inet addr: line. This means that the RPi is connected and has successfully obtained an IP address on that interface.

To be extra certain, you might want to disconnect the ethernet connection at this stage and reboot the RPi and see if it comes up. Remember the IP address that was issued to wlan0 here, and use that to try to reconnect to the RPi after it reboots. Most routers will issue the same IP via DHCP to the same host if there is no contention, and on a home/small office network, this usually isn’t a problem. This Ubuntu image comes with OpenSSH installed and running, so you’ll just need to attempt to connect via SSH once it boots. Keep in mind that even though the ethernet cable has been disconnected, the script that waits for the ethernet interface to come up still times out at 5 minutes, so you have at least that long to wait.

With that running, you’re good to go!

Resizing the partition

Once your network is up and running, rejoice! That is, until you run df -h or similar and realise that your partition seems a lot smaller than you thought it’d be.

I’m running my Pi off a 64GB microSD card, but when I first looked, df was showing me a drive of only 3GB or thereabouts. Talk about a shock!

A quick google revealed that the Ubuntu image hadn’t eaten up the rest of my 60GB, though. It turns out the RPi image is based on a 4GB disk image, and when I dd’ed it over to my SD card, it wrote the exact same partition table as well.

Not to fear though, the Ubuntu MATE guys have a good writeup on how to resize your RPi’s partitions (Search for “Re-size file system”)

In summary, what you need to do is delete the smaller ~4GB partition and write a larger one back to the partition table.

The SD card is usually recognised on the RPi as /dev/mmcblk0, with the /boot partition on /dev/mmcblk0p1 and the root partition on /dev/mmcblk0p2. We’ll use fdisk to delete the partition /dev/mmcblk0p2 and rewrite it as a larger one.

$ sudo fdisk /dev/mmcblk0

This gets you into fdisk, where you can then print the partition table, and delete partition 2.

Then, create a new primary partition, accepting the defaults fdisk offers, which will use all the remaining space on the SD card. Remember to write the changes to disk before quitting fdisk, otherwise they won’t take effect.
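For reference, the keystroke sequence inside fdisk goes roughly like this (a sketch; double-check the partition numbers on your own card before writing anything, and note that the new partition must start at the same sector as the old one, which accepting the defaults normally ensures):

```
p       # print the current partition table
d       # delete a partition
2       # ... partition 2 (the root partition)
n       # create a new partition
p       # ... primary
2       # ... numbered 2
        # accept the default first and last sectors
w       # write the changes to disk and exit
```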

Once that’s done, reboot the system.

You’ll notice if you run df -h again once the Pi has booted, that it still says your data is sitting on a smaller ~4GB partition. What gives?

Turns out that even though your partition has been resized, the filesystem itself needs to be made aware of its comfy new environment and spread out accordingly.

With a simple

$ sudo resize2fs /dev/mmcblk0p2

the filesystem is resized, and df -h should finally show you the values you’ve been expecting all along.


Once all this is done, your RPi should be up and ready to use! Happy hacking!