
The First Iteration

It was the 26th of December 2019, that time of year when you sit back and think "I need a new coding project to write about". And thus I decided to start building the Viable Blog. In this project I am developing its platform from scratch, sharing the source code, and simultaneously blogging about its developments and discoveries.

What is the goal of the Viable Blog? Well, as the late Eliyahu M. Goldratt taught us, the goal of an endeavor is to make money. At least it should be, so if you find the Viable Blog helpful please do buy me a coffee, as this will help motivate me. Okay now that's out there, let's begin the journey.

In the endeavor of the Viable Blog, I want to build a publishing platform that exhibits modern best practices, for example in the discipline of DevOps. I want to share the experience of developing it with you, play by play, such that you can improve the projects you work on.

The objective of the first iteration of the Viable Blog is to have just enough code and infrastructure to ship this post. The minimum viable requirements to kick-start the first iteration are:

REQ1 Maintain a source code repository
REQ2 Serve web pages
REQ3 Deliver web pages in HTML

Also, I want to embrace infrastructure-as-code from the very start.

Let's Begin to Implement The Viable Blog

REQ1 Maintain a source code repository

This project began with the creation of a public GitHub repository. I then cloned that repository to my Ubuntu development environment, where I initiated the project as follows.

mkdir the-viable-blog
cd the-viable-blog
echo "# the-viable-blog" >> README.md
git add README.md
git commit -m "first commit"
git remote add origin [email protected]:GeoffHayward/the-viable-blog.git
git push -u origin master

In order to keep my IDE's configuration out of Git, I added a .gitignore file with two IntelliJ entries.

echo '.idea/' > .gitignore
echo '*.iml' >> .gitignore
git add .
git commit -m "Adds a root .gitignore file"
git push origin master

Then the real work on the first iteration started with the creation of a new branch named first-iteration.

git checkout -b first-iteration
git push --set-upstream origin first-iteration

A branch, even at this stage, creates a logical distinction between what is shippable (the master branch) and what is work in progress (the first-iteration branch).

Serving Web Pages with S3 via CloudFlare

REQ2 Serve web pages

Now the fun starts. For the first iteration of the Viable Blog I am using AWS S3 as the web server. For now the Viable Blog is very much an MVP. The first iteration consists of this post, an index page and a few basic files expected by Google, such as robots.txt.

I configured an S3 bucket to act as a web server. Below is a copy of the CloudFormation template I used to create a bucket named 'viable.blog'. For S3 to act as a web server, the bucket's name must match the domain. AWS CloudFormation is a free infrastructure-as-code service from AWS (note that the infrastructure itself is not free).

Resources:
  S3Bucket:
    Type: 'AWS::S3::Bucket'
    Properties:
      BucketName: 'viable.blog'
      PublicAccessBlockConfiguration:
        BlockPublicAcls: TRUE
        BlockPublicPolicy: TRUE
        IgnorePublicAcls: TRUE
        RestrictPublicBuckets: TRUE
      WebsiteConfiguration:
        IndexDocument: index
        ErrorDocument: error
  BucketPolicy:
    Type: 'AWS::S3::BucketPolicy'
    Properties:
      PolicyDocument:
        Id: AccessPolicy
        Version: 2012-10-17
        Statement:
          - Sid: PublicReadForGetBucketObjects
            Effect: Allow
            Principal: '*'
            Action: 's3:GetObject'
            Resource: !Join
              - ''
              - - 'arn:aws:s3:::'
                - !Ref S3Bucket
                - /*
            Condition:
              IpAddress:
                aws:SourceIp:
                  - '2400:cb00::/32'
                  - '2606:4700::/32'
                  - '2803:f800::/32'
                  - '2405:b500::/32'
                  - '2405:8100::/32'
                  - '2a06:98c0::/29'
                  - '2c0f:f248::/32'
                  - '173.245.48.0/20'
                  - '103.21.244.0/22'
                  - '103.22.200.0/22'
                  - '103.31.4.0/22'
                  - '141.101.64.0/18'
                  - '108.162.192.0/18'
                  - '190.93.240.0/20'
                  - '188.114.96.0/20'
                  - '197.234.240.0/22'
                  - '198.41.128.0/17'
                  - '162.158.0.0/15'
                  - '104.16.0.0/12'
                  - '172.64.0.0/13'
                  - '131.0.72.0/22'
      Bucket: !Ref S3Bucket

The bucket is configured to serve the content only when requested through a CloudFlare edge location (that's the long list of IP addresses). I also had to set the PublicAccessBlockConfiguration sub-elements explicitly to what are normally their default values, because the defaults do not seem to apply when a CloudFormation template has the WebsiteConfiguration element set on an S3 resource.
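
For completeness, a template like this can be applied with the AWS CLI. The file name and stack name below are placeholders of my own choosing; they are not dictated by anything above.

# Create or update the stack from the template (no IAM resources, so no extra capabilities flag is needed).
aws cloudformation deploy \
  --template-file s3-website.yaml \
  --stack-name the-viable-blog-s3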

Once the bucket was provisioned, I deployed the following index.html holding page.

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>The Viable Blog</title>
    </head>
    <body>
        <h1>The Viable Blog</h1>
        <p>Testing testing 1 2 3.</p>
    </body>
</html>

We are not quite ready for automated deployment yet; however, the holding page was deployed in a controlled, repeatable manner: enter Make. 'Make' can help a team move towards a controlled, repeatable process because the team can put project-specific commands inside a Makefile. A Makefile can, and should, then become part of the project's source repository.

'Make' was originally designed for managing the compilation of programs written in C. However, because of its versatility, Make has become a popular tool outside the C world. Once a project has its own Makefile, all a developer needs to do is type make [tab] to see a list of labeled, project-specific commands.

Here is a copy of the first entry added to the Viable Blog's Makefile.

deploy-distribution:
	aws s3 sync ./distribution/ s3://viable.blog/ --delete

When make deploy-distribution is run, the S3 bucket receives a copy of all the files and folders in the distribution folder. The --delete flag instructs the sync to remove any files from the bucket that are no longer in the distribution folder. In other words, this command makes the S3 bucket mirror the distribution folder.
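
Because --delete is destructive, it is worth previewing what a sync is about to do before letting it loose. The AWS CLI supports a dry run for exactly this.

# Preview the sync: lists what would be uploaded or deleted without touching the bucket.
aws s3 sync ./distribution/ s3://viable.blog/ --delete --dryrun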

Before running make deploy-distribution, the AWS CLI needed to be installed and configured with credentials that have the AmazonS3FullAccess policy attached. In my case I attached the policy to an IAM group called 'the-viable-blog-developer' and then created a user with programmatic access and added it to that group.

sudo apt install awscli
aws configure
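
For reference, the group and user arrangement described above can also be scripted with the AWS CLI. The sketch below is illustrative rather than the exact steps I took, and the user name 'the-viable-blog-deployer' is a made-up example.

# Create the group and attach the managed S3 policy to it.
aws iam create-group --group-name the-viable-blog-developer
aws iam attach-group-policy --group-name the-viable-blog-developer \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess

# Create a user with programmatic access and add it to the group.
aws iam create-user --user-name the-viable-blog-deployer
aws iam add-user-to-group --user-name the-viable-blog-deployer --group-name the-viable-blog-developer
aws iam create-access-key --user-name the-viable-blog-deployer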

With an S3 bucket that has a holding page deployed to it, it was time to configure the DNS. I am using CloudFlare's DNS. I am a big fan of CloudFlare, both as a DNS provider and as a very affordable (if not outright free) CDN. CloudFlare also provides free managed SSL certificates. Below is a copy of the Terraform I used to configure CloudFlare.

provider "cloudflare" {
  version = "~> 2.0"
}

variable "cloudflare_zone_id" {
  description = "The Domain's CloudFlare Zone ID"
  type = string
}

resource "cloudflare_record" "root_dns_settings" {
  zone_id = var.cloudflare_zone_id
  name = "viable.blog"
  value = "viable.blog.s3-website.eu-west-2.amazonaws.com"
  type = "CNAME"
  proxied = true
}

resource "cloudflare_zone_settings_override" "settings" {
  zone_id = var.cloudflare_zone_id
  settings {
    ssl = "flexible"
    always_use_https = "on"
    automatic_https_rewrites = "on"
  }
}

As you can see, I used the latest CloudFlare provider for Terraform (v2 at the time of writing). Next I declared the variable named 'cloudflare_zone_id' (a 'zone' is CloudFlare's term for a domain, and each zone has a Zone ID). The main advantage of having the Zone ID as a variable is that this semi-private value can be kept out of the committed Terraform files; more on that in the note on storing CloudFlare tokens safely below.

The 'root_dns_settings' block instructs CloudFlare to resolve the root of the domain to the S3 bucket (i.e. value = "viable.blog.s3-website.eu-west-2.amazonaws.com"). Setting proxied = true switches on CloudFlare's awesomeness for traffic served from the root. CloudFlare's awesomeness includes HTTP/2, asset caching, DDoS mitigation and so much more. Check out CloudFlare's features page for more information.

The 'cloudflare_zone_settings_override' block instructs CloudFlare to redirect all HTTP requests to HTTPS. This means the page you are reading was delivered to you from the CloudFlare service encrypted. However, S3's website endpoint does not support HTTPS, so I had to set ssl = "flexible" so that CloudFlare can make an unencrypted connection to the S3 bucket. Never transmit sensitive data unencrypted!

I then updated the Makefile with the following.

update-cloudflare:
	cd infrastructure/ && terraform apply -auto-approve

This command first moves the shell from the root of the project (where the Makefile lives) into the infrastructure folder. It then instructs Terraform to apply all the *.tf configurations within that folder. The -auto-approve flag should make the make update-cloudflare command more automation friendly within a Continuous Delivery pipeline in a later iteration.

Terraform needs to be on the $PATH before you can run make update-cloudflare and the following CloudFlare environment variables need to be configured.

export CLOUDFLARE_EMAIL="<REDACTED>"
export CLOUDFLARE_ACCOUNT_ID="<REDACTED>"
export CLOUDFLARE_API_TOKEN="<REDACTED>"
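
The Makefile target also assumes Terraform has already been initialised in the infrastructure folder, which is a one-off step.

# One-off: download the CloudFlare provider plugin and prepare the working directory.
cd infrastructure/ && terraform init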

The API token needs to have permission to edit the domain's DNS and DNS settings. Here is a screen grab of the configuration used.

A screen grab of CloudFlare's API Token creation web page that is set to 'edit DNS' and 'edit DNS settings' for the zone of 'viable.blog'.

A Note on Storing CloudFlare Tokens Safely

For security and auditing reasons, each entity on a software development team should have their own role-specific CLOUDFLARE_API_TOKEN. An entity could be a fellow human or an automation bot. In contrast, the value of cloudflare_zone_id sits in the middle of the privacy spectrum: it should be kept out of source control, but it can safely be shared between entities on the team. Therefore, in this project, the value given to cloudflare_zone_id is stored in settings.tfvars, and Git is set to ignore settings.tfvars.
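
Concretely, and assuming settings.tfvars sits alongside the other Terraform files in the infrastructure folder, that looks something like this.

# Store the Zone ID in a tfvars file that is shared out of band, never committed.
echo 'cloudflare_zone_id = "<REDACTED>"' > infrastructure/settings.tfvars

# Tell Git to ignore it.
echo 'settings.tfvars' >> .gitignore

One detail worth knowing: Terraform only loads terraform.tfvars and *.auto.tfvars files automatically, so a file with this name is either passed in with -var-file=settings.tfvars or Terraform will prompt for the value when apply runs.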

Delivering web pages: Markdown to HTML

REQ3 Deliver web pages in HTML

As the source of this page is written in Markdown, a Markdown-to-HTML tool was needed to ship the first iteration of the Viable Blog. Why Markdown? Experience has led me to believe HTML is not a great persistent source format for web-based publishing. Therefore, I decided the source of the content will be written to disk in Markdown.

Let's keep things simple for now and use cmark to convert the source Markdown to HTML. It's a command-line tool that implements the CommonMark specification.

To keep things extra simple, I installed 'cmark' via APT. The version on APT is a few years old, but will do for now.

sudo apt install cmark

Once 'cmark' was installed all that was needed to convert Markdown to HTML was:

cmark -t html --safe content/2020/01/30/first-iteration.md > distribution/2020/01/30/first-iteration.html

The -t html argument asks cmark to produce HTML output, and the --safe flag instructs cmark to suppress raw HTML and dangerous URLs so they cannot be injected into the generated pages.
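
Only a single file is converted above, but the same pattern scales to the whole content folder with a small shell loop. The sketch below is an illustration of where this is heading rather than part of the project yet.

# Convert every Markdown file under content/ into a matching HTML file under distribution/.
find content -name '*.md' | while read -r src; do
  out="distribution/${src#content/}"
  mkdir -p "$(dirname "$out")"
  cmark -t html --safe "$src" > "${out%.md}.html"
done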

Cmark does not add a boilerplate HTML skeleton to its own output. That means there is no head or body element and no stylesheets, scripts, menus, footers, etc.

In order to serve HTML-compliant web pages, I needed to wrap the content in a template. I decided to experiment with using a CloudFlare Worker to apply a template dynamically. CloudFlare Workers are JavaScript functions that run at CloudFlare's edge locations. In my case I created a Worker that proxied all requests to the S3 bucket and wrapped the HTML responses in a dynamic template. Here is a copy of the Worker script that dynamically templates the content.

addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request));
})

async function handleRequest(request) {
    // Pass the request through to the origin (the S3 bucket behind CloudFlare).
    const response = await fetch(request);

    // Only template pages and the error page; anything else (e.g. 304s) goes straight through.
    if(response.status != 200 && response.status != 404) {
        return new Response(response.body, response);
    }

    // Only template HTML responses; other content types go straight through.
    if(response.headers.has("Content-Type") && !response.headers.get("Content-Type").includes("text/html")) {
        return new Response(response.body, response);
    }

    // Read the page body produced by cmark and apply the Bootstrap tweaks.
    const text = modifyForBootstrap(await response.text());

    const year = new Date().getFullYear();

    const template = `
    <html lang="en">
    <head>
    <!-- Required meta tags -->
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <!-- Bootstrap CSS -->
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.4.1/css/bootstrap.min.css">
    <title>The Viable Blog</title>
    </head>
    <body>
    <div class="container">
    <header>
      <a href="/">Home</a>
    </header>
    <hr />
    ${text}
    <footer>
      &copy; ${year}
    </footer>
    </div>
    </body>
    </html>`;

    return new Response(template, response);
}

function modifyForBootstrap(text) {
    // Make every image responsive under Bootstrap.
    return text.replace(/<img/g, "<img class=\"img-fluid\"");
}

In order to deploy the worker, using infrastructure-as-code, I updated the Terraform. Here is the CloudFlare Worker Terraform I appended to the terraform given above.

resource "cloudflare_worker_script" "viable_blog_script" {
  name = "viable-blog-worker"
  content = file("../template/cloudflare-worker.js")
}

resource "cloudflare_worker_route" "viable_blog_worker_route" {
  zone_id = var.cloudflare_zone_id
  pattern = "viable.blog/*"
  script_name = cloudflare_worker_script.viable_blog_script.name
}

The resource named 'viable_blog_script' uploads the cloudflare-worker.js file to CloudFlare as a CloudFlare Worker, giving it the name 'viable-blog-worker'. The resource named 'viable_blog_worker_route' asks CloudFlare to run the Worker on all traffic that matches the pattern viable.blog/* (i.e. everything on the domain).

As you can see, the handleRequest function only processes responses from S3 which have a 200 or 404 response code. I discovered that I needed to be quite specific because the S3 bucket sends a lot of 304 Not Modified responses, which is great for reducing network usage, but was breaking the Worker. Also, as you can see, the handleRequest function only processes HTML pages. If the response is HTML and has a 200 (or 404) status, the Worker wraps the response body with the HTML template shown above.

Finally, I had to update the permissions associated with the API token to include the 'Worker Scripts' and 'Worker Routes' permissions in order to deploy the Worker to CloudFlare.

Let's Ship the Viable Blog

Okay, so at this point the configuration of the Viable Blog meets the MVP's requirements, and it's time to ship the first iteration of the Viable Blog along with this first post. I hope you have enjoyed reading this post as much as I have enjoyed putting it together for you. Please remember to buy me a coffee if you found this post helpful.
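
In Git terms, shipping means folding the first-iteration branch back into master and pushing the result, then running the deploy targets from the Makefile. Roughly:

# Merge the finished iteration into the shippable branch and publish it.
git checkout master
git merge first-iteration
git push origin master

# Deploy the content and the CloudFlare configuration.
make deploy-distribution
make update-cloudflare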

What's in the next post? Now that the first iteration of the Viable Blog has shipped, it's time to set up monitoring. Until then, I wish you happy and prosperous coding.