Pages the fun way

Most people use GitHub Pages, some use GitLab Pages, and then there's me - using GitLab Pages, but…the fun way.

Terror

Last time, we did DevOps the fun way. And at the end, I ominously said “GitLab Pages is next.” I wasn’t bluffing. In this article, I’ll explain why I just moved this website off of GitHub Pages. We’ll also delve into how fucking horrifyingly cursed it was to set this all up.

Oh, yeah, by the way, this website did get moved. How can you tell? It no longer links to GitHub. Oh, and you’re reading this.

What’s wrong with GitHub Pages?

Nothing. Really. I have nothing against it. It’s free, easy to set up, does what I need it to do for this site, and it works fine.

The problem is that GitHub Pages is not something I run. This means I’m essentially freeloading off of their infrastructure, and keep in mind - GitHub Pages is free out of courtesy for open-source users. I think that’s wonderful, but it’s important to remember that these things can very quickly change out of nowhere. On the extremely-off-chance GitHub Pages suddenly becomes a paid hosting service, I’d at least like to know how to run an equivalent of it myself.

TL;DR: There’s nothing wrong with GitHub Pages, I just want to learn how these things work and…if I have to pay for static site hosting, I want to run it myself.

How my GitLab works

As a refresher from the last post, here’s how GitLab itself is hosted.

I took an old gaming computer. I loaded Ubuntu Server on it, set up GitLab EE on it, and joined it via Wireguard to a cheap Ubuntu VPS. I then pointed gitlab.acidiclight.dev at the VPS, and set up an nginx reverse proxy to expose GitLab to the Internet.

After the last article, this slightly changed. The TL;DR is that cr.acidiclight.dev is now a thing that exists, and it’s a Docker container registry. But it goes through the same reverse proxy and is set up the same way.

How GitHub Pages works

First, keep in mind that GitHub Pages and GitLab Pages (as provided by GitLab.com) work the same way at a high level.

When you create a GitHub Pages repo, you get assigned a github.io domain name by default. For example, when the Socially Distant website was hosted on there, it was given sociallydistantgame.github.io.

When you visit a github.io domain, and it actually has a website on it, you get full HTTPS support.

You also have the ability to add a custom domain, which is what I did with sociallydistantgame.com in this example. You simply add a CNAME for it in your DNS provider, and tell GitHub about it, they do some verification, and eventually your GitHub Pages site has a custom domain name on it. They also use Let’s Encrypt to issue a certificate on behalf of your domain, so after a while, you get full HTTPS support.
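In zone-file terms, the record is about as simple as it sounds. This sketch uses a www-style subdomain; apex domains are usually handled with A/ALIAS records instead:

www.sociallydistantgame.com.    CNAME    sociallydistantgame.github.io.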

The goal

You just read it. I want things set up so that I get assigned a unique GitLab Pages domain with full HTTPS support, while also being able to point a custom domain at it and eventually get a Let’s Encrypt cert. And if you’re reading this text at this very moment, that’s exactly how it’s set up now. Which means it worked.

Part 1: Setting up DNS and networking

Usually, when I need to expose something running at home to the Internet, I do so using an NGINX reverse proxy on my VPS. This allows me to let you guys have access to things like my GitLab, without you needing to ever know my home IP address.

This isn’t possible with GitLab Pages if I want custom domain names alongside auto-assigned domains, the way GitLab.com offers them. The short version is that GitLab Pages needs to terminate TLS itself, so it can serve the certificates it obtains for custom domains - a TLS-terminating proxy in front of it can’t do that for it. We must go deeper.

Proxying Things a Different Way

There are two kinds of reverse-proxying that I’m able to do with nginx.

  • TLS termination: this is where the reverse proxy handles HTTPS, and forwards requests to an upstream HTTP server, over HTTP. This is what I traditionally use for gitlab.acidiclight.dev.
  • Stream proxying: this is where the reverse proxy listens on a TCP port, and forwards traffic to an upstream TCP port, over TCP. This is what I use for my whitelisted Minecraft server.

The benefit of TLS termination is that TLS is a lot easier to set up. The drawback is that you can only proxy HTTP traffic. Stream proxying, on the other hand, can proxy traffic on any TCP port and behaves almost exactly like port-forwarding on your home router. The drawback to stream proxying is that it has no concept of vhosts like HTTP does, so you can’t route traffic arriving on outside port 80 to different upstreams based on which domain name the user requested.
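For contrast, here’s a sketch of what the TLS-termination style looks like for something like gitlab.acidiclight.dev - the upstream address, port, and certificate paths are placeholders, not my actual config:

server {
    listen 443 ssl;
    server_name gitlab.acidiclight.dev;

    # the proxy holds the certificate and speaks HTTPS to the outside world...
    ssl_certificate     /etc/letsencrypt/live/gitlab.acidiclight.dev/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/gitlab.acidiclight.dev/privkey.pem;

    location / {
        # ...and plain HTTP to the upstream over the Wireguard tunnel
        proxy_pass http://10.200.200.2:8080;
        proxy_set_header Host $host;
    }
}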

Per GitLab’s docs, for my use case, we must set up stream proxying of the outside ports 80 and 443, for HTTP and HTTPS respectively, and configure GitLab itself to handle TLS certificates.

TCP Nightmares

At this point, we have an extremely terrifying setup. We have a VPS that exposes several websites, and now it must also expose GitLab Pages via a stream proxy. The problem is that GitLab Pages needs to listen on ports 80 and 443, but the other exposed websites need those ports too. How could I possibly get this to work? You can’t just have two things listening on the same port, on Linux or any other operating system. Nginx’s HTTP proxy is listening on ports 80 and 443, but our stream proxy also needs to listen on ports 80 and 443, on the same device. This surely isn’t possible.

But remember, if you can read this, then I’ve already gotten all of this working. So clearly I did figure out how to listen on the same port twice on the same device. But…how the fuck?

…With money.

I use vultr as my VPS provider. I love them. When you deploy a VPS on vultr, you get a free static IPv4 address assigned to it as well as a static IPv6 one. You don’t get billed for these, they’re part of the VPS plan itself. So by default, you have one public IP address for IPv4 and one for IPv6. …But you can pay to have more! For an extra $2/month, my VPS now has two IPv4 addresses.

I won’t go over the absolute nucking futs process of actually configuring the VPS to use both addresses, because it is highly OS-dependent, but it’s annoyingly hard to do. If your VPS provider doesn’t teach you how to do it, congratulations - you now need to take an introductory networking course so you know how to connect to the Internet manually without the use of DHCP. I spent hours troubleshooting this only to realize vultr guides you through it. Argh. Oh well. Wallet slightly melted and brain slightly fried, networking is set up.

So now we have two IP addresses, one for GitLab and one for GitLab Pages.

GitLab: 1.2.3.4
GitLab Pages: 1.2.3.5

Now we need a domain name

When you use GitHub Pages, you get a github.io domain. But when you use GitHub…GitHub, you visit it via github.com. They’re two different domains. With GitLab Pages, it’s the same idea. We need an equivalent of github.com or gitlab.com, and an equivalent of github.io or gitlab.io. We already have our own gitlab.com, which you know as being gitlab.acidiclight.dev. That’s where you create your repositories, manage your account, etc.

What I didn’t have already, was an equivalent of gitlab.io. So a quick trip to Porkbun and $5 later, and say hello to https://acidiclight.page :)

Setting up DNS

DNS is the easy part. But…you’ll need to use a DNS provider that certbot works with. In my case, I always use Cloudflare. We’ll need this for Let’s Encrypt later, so it’s important you know what provider you’re using for your domain’s DNS.

Assuming we already have a Cloudflare account for the domain acidiclight.page, which I do, we need to add two DNS records:

Type     Name               Value
A        acidiclight.page   1.2.3.5
CNAME    *                  acidiclight.page

We’re pointing the root domain at my VPS’s secondary IP address, the one GitLab Pages will be exposed on. And we’re then creating a wildcard CNAME record, so that all traffic to all subdomains on the entire domain will be sent to the GitLab Pages server.

Make sure the cloud is gray (DNS only) and not orange (proxied), as Cloudflare’s HTTP proxying will break GitLab Pages.
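A quick way to check the records took effect - both of these should ultimately resolve to the Pages IP (1.2.3.5 in this example):

dig +short acidiclight.page
dig +short literally-anything.acidiclight.page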

Setting up Let’s Encrypt for the Pages domain

Because we have a wildcard DNS record *.acidiclight.page, and we want HTTPS to just automatically work for all auto-generated GitLab Pages domains, we need a wildcard TLS certificate for the domain name. Wildcard certs are a bit harder to obtain than single-domain certs, because they imply a lot more trust. We can get one from Let’s Encrypt for free though, provided certbot is set up correctly.

The way you do this depends on your DNS provider. These instructions assume Cloudflare, so if you’re using something else, you’re on your own.

Set up Cloudflare

  1. Go to your Cloudflare account
  2. Click your user icon -> “My Profile”
  3. Go to API Tokens on the side
  4. Create a new one
  5. Use the “Edit zone DNS” template
  6. Make sure the token has the ability to edit the DNS of your GitLab Pages domain, i.e. acidiclight.page.
  7. Note down the token securely. My fake one is RITCHIE.

Setting up Certbot

I’m setting Certbot up on the GitLab server in my home, and you should as well. I’m also assuming you’re using the Docker image of GitLab, because I am. That’ll become important later.

  1. Install Certbot and the Cloudflare plugin
sudo snap install certbot --classic
sudo snap set certbot trust-plugin-with-root=ok
sudo snap install certbot-dns-cloudflare --classic

Yes, I’m using Snap, and yes, I hate it.

  2. Tell Certbot what the API token is
sudo mkdir /etc/letsencrypt
sudo nano /etc/letsencrypt/cloudflare.ini

Inside it, place:

dns_cloudflare_api_token = RITCHIE

…replacing “RITCHIE” with your actual API token.

  3. Get the certs issued.
sudo certbot certonly --dns-cloudflare --dns-cloudflare-credentials /etc/letsencrypt/cloudflare.ini -d '*.acidiclight.page' -d acidiclight.page

Replace acidiclight.page with your own domain, in both places.

If all goes well, Certbot will have issued a wildcard TLS certificate for your Pages domain, and stored it in the directory /etc/letsencrypt/live/yourdomain.com/.
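The snap install also sets up automatic renewal for you. If you want to be confident the Cloudflare plugin will cooperate when renewal time rolls around, a dry run is a cheap sanity check:

sudo certbot renew --dry-run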

Getting the certs into GitLab

If you’re NOT using Docker, then you can just symlink the paths where GitLab expects to find the certificates to the ones where Let’s Encrypt saved them. You don’t want to manually copy the files, because the certs expire every three months and Certbot renews them in place.

For reference, assuming acidiclight.page is the Pages domain, Let’s Encrypt places the cert files at:

  • Certificate: /etc/letsencrypt/live/acidiclight.page/fullchain.pem
  • Private key: /etc/letsencrypt/live/acidiclight.page/privkey.pem

GitLab will expect to find them at:

  • Certificate: /etc/gitlab/ssl/acidiclight.page.crt
  • Private key: /etc/gitlab/ssl/acidiclight.page.key

Remember to change acidiclight.page in all paths to whatever your actual domain name is.

So, for a non-Docker install:

sudo mkdir -p /etc/gitlab/ssl
sudo ln -s /etc/letsencrypt/live/acidiclight.page/fullchain.pem /etc/gitlab/ssl/acidiclight.page.crt
sudo ln -s /etc/letsencrypt/live/acidiclight.page/privkey.pem /etc/gitlab/ssl/acidiclight.page.key

For a Docker install, use bind mounts. This assumes Certbot is on the host.
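Here’s a sketch of what that can look like - the /srv/gitlab paths are an assumption based on a typical GitLab-in-Docker setup, so adjust them to wherever your config volume actually lives:

# when creating the GitLab container, also mount the host's Let's Encrypt
# directory into it, read-only:
docker run ... \
  --volume /srv/gitlab/config:/etc/gitlab \
  --volume /etc/letsencrypt:/etc/letsencrypt:ro \
  ...

# then create the same symlinks as above, but inside the mounted config
# directory on the host; they resolve inside the container because
# /etc/letsencrypt is mounted there too
sudo mkdir -p /srv/gitlab/config/ssl
sudo ln -s /etc/letsencrypt/live/acidiclight.page/fullchain.pem /srv/gitlab/config/ssl/acidiclight.page.crt
sudo ln -s /etc/letsencrypt/live/acidiclight.page/privkey.pem /srv/gitlab/config/ssl/acidiclight.page.key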

It’s worth noting that if you don’t do this now, or you do it incorrectly, GitLab will refuse to boot when we turn on GitLab Pages in the next section.

Setting up the stream proxies

It’s now time to set up the proxies, so GitLab Pages is available to the world wide web.

Assumptions

I’m going to make a few networking assumptions.

Your VPS has two public IP addresses: 1.2.3.4 for GitLab and 1.2.3.5 for GitLab Pages.

Your home server is connected via Wireguard to your VPS. In the Wireguard network, your VPS has the local IP address 10.200.200.1 and your home server is 10.200.200.2.

GitLab is running on your home server.

Turning GitLab Pages on, on the home server

Edit your gitlab.rb configuration.

First, we turn GitLab Pages on as a feature.

gitlab_pages['enable'] = true

Next, tell GitLab what the default domain name for GitLab Pages sites will be. Make sure it uses https:// as well.

pages_external_url "https://acidiclight.page/"

Next, we force all Pages traffic to be secure:

gitlab_pages['redirect_http'] = true

Finally, we tell GitLab Pages what ports to listen on locally. I listen on 8970 for HTTP and 8971 for HTTPS.

If GitLab is running as a Docker container, then we bind to all interfaces on the container. You’d then expose the two ports to the host. I’m going to assume that you’ll expose the same outside ports as me, so 8970 for HTTP and 8971 for HTTPS.

gitlab_pages['external_http'] = ['0.0.0.0:8970']
gitlab_pages['external_https'] = ['0.0.0.0:8971']
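If you’re going the Docker route, the port publishing happens when you create the container. I’d bind the published ports to the Wireguard address so they’re only reachable over the tunnel - that part is my own preference, not a requirement:

# sketch: publish the Pages ports on the Wireguard interface only
docker run ... \
  --publish 10.200.200.2:8970:8970 \
  --publish 10.200.200.2:8971:8971 \
  ...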

If you are NOT using GitLab as a Docker container, then we will tell GitLab Pages to bind to the Wireguard interface on the home server.

gitlab_pages['external_http'] = ['10.200.200.2:8970']
gitlab_pages['external_https'] = ['10.200.200.2:8971']

In either case, once you restart GitLab, you will end up with a setup where GitLab Pages is listening on 10.200.200.2:8970 for HTTP and 10.200.200.2:8971 for HTTPS, and where GitLab Pages can only be accessed from within the Wireguard network. If GitLab doesn’t boot at all, check the logs - you missed a step.
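For reference, "restarting GitLab" here means re-running the Omnibus configuration, which looks roughly like this (the container name is an assumption):

# non-Docker (Omnibus) install:
sudo gitlab-ctl reconfigure

# Docker install - the container re-runs reconfigure when it starts:
sudo docker restart gitlab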

Exposing GitLab to the Internet

Now, on the VPS, set up two new stream proxies.

stream {
    # GitLab Pages over HTTP
    server {
        listen 1.2.3.5:80; # The Internet
        proxy_pass 10.200.200.2:8970; # Wireguard
    }

    # GitLab Pages over HTTPS
    server {
        listen 1.2.3.5:443; # The Internet
        proxy_pass 10.200.200.2:8971; # Wireguard
    }
}

Note the listen directives: instead of just specifying a port to listen on, we also give a specific public IP address. This makes nginx accept traffic on these stream proxies only when it comes in on that particular address.

You would then make sure ALL other nginx vhosts explicitly listen on your main IP address (1.2.3.4 in this example), so nothing else grabs the Pages address. Use nginx -t to test your config before reloading.
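Concretely, every existing vhost gets its listen directives pinned to the primary address, something like this (a sketch using the placeholder IPs from earlier):

server {
    # pinned to the primary IP so it never collides with the Pages
    # stream proxy listening on 1.2.3.5
    listen 1.2.3.4:80;
    listen 1.2.3.4:443 ssl;
    server_name gitlab.acidiclight.dev;
    # ...rest of the vhost unchanged...
}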

Let’s try it out!

If all goes well, you should be able to visit your GitLab Pages domain and get a 404 with a GitLab logo on it. That’s awesome - you’re done with the evil part.

You should now be able to create a new repository in GitLab, and:

# in your repository, locally:
mkdir public
echo "Hello world!" > public/index.html

Then, in .gitlab-ci.yml:

pages:
  script:
    - echo Static deployment
  artifacts:
    paths:
      - public

Commit these, and wait a bit, and you have a Pages site. Right? Well…probably not. You need a GitLab CI/CD instance runner for this to go smoothly. So go set up DevOps the fun way, and add your GitLab runner as an instance runner in the admin area. Then try doing GitLab Pages again.
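If you need a refresher, registering a runner against your own instance looks roughly like this - the token comes from the runner settings in the Admin Area, and older GitLab versions use --registration-token instead of --token:

sudo gitlab-runner register \
  --url https://gitlab.acidiclight.dev \
  --token <runner-token>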

Once you’ve set up the instance runner (or if you had one already), you should be able to go to Deploy -> Pages in the repository and see the Pages settings. This’ll give you a way to view the public site. If all goes well, it’ll be encrypted using your Let’s Encrypt wildcard cert.

Custom domains and their caveats

It’s one thing to have set up our equivalent of gitlab.io with HTTPS, like we have. But it’d be nice to have support for custom domains, like sociallydistantgame.com, for example.

If you’ve been following this article to the letter, your work is done as an instance admin. Your Pages server already handles custom domains just fine. But, as a user, here’s some things you need to consider.

  1. Just like GitLab.com, you need to own your custom domain and you need to verify it (the DNS records involved are sketched after this list). As an instance admin, you can turn this off, but that’s dumb. Don’t. Without domain verification, any user on your GitLab server can create a Pages site and take over your default domain (for example, by setting acidiclight.page as a custom domain).
  2. You may need to agree to the Let’s Encrypt TOS in the admin area. Go to Admin Area -> Settings -> Preferences -> Pages, and give Let’s Encrypt your email and agree to the legal shit, like you had to do with Certbot earlier. This allows GitLab Pages to issue certs for custom domains.
  3. Let’s Encrypt takes a WHILE to issue certs for custom domains. During this time, you will see scary SSL errors on the site with the custom domain, but give it half an hour or so and it’ll fix itself.
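To make point 1 concrete, here’s roughly what the DNS side looks like for a user pointing a custom domain at the Pages server. The verification record’s exact name and value come from the project’s Pages domain settings, so treat these as placeholders:

; point the custom domain at the Pages server's public IP
sociallydistantgame.com.    A    1.2.3.5

; ownership verification record, copied from GitLab's Pages domain settings
_gitlab-pages-verification-code.sociallydistantgame.com.    TXT    "gitlab-pages-verification-code=<code>"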

So that was fun.

Now…what am I to do about Socially Distant Windows builds…