Why We Use CloudFront with Lambda

Krisztina Szilvasi, Istvan Szukacs · 2022/06/28


Introduction

There are many ways of using AWS Lambda functions. We found one that lets us leverage both the global distribution of CloudFront and the newer, simpler v2 version of API Gateway. Combined with Lambda, these result in a fast and secure serverless web application that is simple to deploy and maintain. To take simple infrastructure to another level, we also added Terraform to our toolkit, automating the deployment of the infrastructure.

Why Terraform

Terraform allows engineers to automate cloud infrastructure and configuration, and to manage it as code in one place. Managing hundreds of resources on the AWS console can be overwhelming and slow. The UI is inconsistent, and there is no way to get a logical overview of all resources. The situation gets even more complicated if a company uses resources from more than one cloud provider. Terraform provides a consistent view, and all resources are just one click away from each other even if they come from different providers. Navigation is made even easier by a folder structure that companies can plan according to their own best practices. Another advantage of Terraform is reusability. Terraform modules are a way to package and reuse configurations: a module contains multiple resources that are used together, allowing them to be grouped in a logical and practical way.

Implementation

Now, let us go through how we implemented the infrastructure of Depoxy, our ETL management application.

Structuring Code in Terraform

  • tf: Folder that contains all Terraform related code
    • ci: Folder that contains the CI/CD related resources
    • dev: Folder that contains the development stage resources
      • api: Folder that contains the api configuration; this is the backend for our app
      • www: Folder that contains the www configuration; this is the landing page
      • app: Folder that contains the app configuration; this is the JavaScript PWA
    • prod: Folder that contains the production stage resources
      • api: same as above
      • www: same as above
      • app: same as above

Each subfolder has its own backend and provider (a minimal sketch follows below). This means we can deploy changes to the stages independently from each other. There cannot be any interference between the stages, and the chance that somebody else is making changes to that particular environment is minimal. Since the environments are so small, terraform plan and terraform apply take less than a minute. This allows us to fix issues very quickly and promote changes from dev to prod. The resources in the ci folder are a bit different from dev or prod because CI/CD needs to do different things (update a Lambda function, for example) than the rest of the environments.
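
To illustrate, here is a minimal sketch of what the backend and provider configuration of a single stage (say, dev/api) could look like; the bucket name and key below are hypothetical placeholders, not our actual values:

  terraform {
    # Each stage keeps its own state, so plan and apply stay small and independent
    backend "s3" {
      bucket = "datadeft-tf-state"          # hypothetical state bucket
      key    = "dev/api/terraform.tfstate"  # hypothetical state key
      region = "eu-west-1"
    }
  }

  # Each stage also pins its own provider
  provider "aws" {
    region = "eu-west-1"
  }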

Having separate environments gives us the opportunity to apply the least privilege principle: the dev stage can access only dev resources, dev credentials, dev S3 buckets, etc. (see the sketch below).
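
As a sketch of that principle, a dev-only S3 permission attached to the dev Lambda role could look like this; the role name is hypothetical, and only the depoxy-dev bucket name comes from the configuration shown further below:

  # Grant the dev Lambda role access to the dev bucket only (illustrative, not our exact policy)
  data "aws_iam_policy_document" "dev-s3-access" {
    statement {
      actions   = ["s3:GetObject", "s3:PutObject"]
      resources = ["arn:aws:s3:::depoxy-dev/*"]
    }
  }

  resource "aws_iam_role_policy" "dev-s3-access" {
    name   = "dev-s3-access"
    role   = "depoxy-dev-api-role"   # hypothetical role name
    policy = data.aws_iam_policy_document.dev-s3-access.json
  }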

Modules

Modules (reusable resources) live in a separate repository, and we reference them using the S3 module source support of Terraform. A module can be viewed as a template of resources with variables that are assigned values in the actual API configuration files. We have three modules for the APIs; here is the AWS Lambda module as an example:

AWS Lambda Module

  resource "aws_lambda_function" "lambda-function" {

    function_name    = "${var.function-name}-function"
    description      = "${var.function-name}-function"
    s3_bucket        = var.s3-bucket
    s3_key           = var.s3-key
    source_code_hash = var.source-code-hash
    role             = var.role-arn
    handler          = var.handler
    layers           = var.layers
    memory_size      = var.memory-size
    runtime          = var.runtime
    timeout          = var.timeout

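    # Only render an environment block when environment variables are actually supplied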
    dynamic "environment" {
      for_each = length(var.environment-variables) > 0 ? [true] : []

      content {
        variables = var.environment-variables
      }
    }
  }

Using the AWS Lambda Module

The modules are version controlled and deployed to an S3 bucket using CI/CD. The configuration has two parts: how the AWS Lambda function itself should behave (the runtime is Python, for example) and how the actual Python process should be configured (which SecretId to use for JWT generation, etc.).

module "lambda-function" {
  source           = "s3::https://s3-eu-west-1.amazonaws.com/datadeft-tf/modules/lambda/0.0.3/lambda.zip"
  role-arn         = module.lambda-role.role-arn
  s3-bucket        = var.lambda-function-s3-bucket-name
  s3-key           = "api/${var.lambda-function-version}/api.zip"
  source-code-hash = chomp(data.aws_s3_object.lambda-function-hash.body)
  function-name    = replace(var.domain-name, ".", "-")
  runtime          = "python3.9"
  handler          = "app.handler"
  memory-size      = 2048
  timeout          = 10
  layers           = var.lambda-layers

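  # Runtime configuration for the Python process itself (cookies, CORS, JWT and other secrets, S3 location, etc.)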
  environment-variables = {
    DEPOXY_API_COOKIE_MAX_AGE_DAYS = "1"
    DEPOXY_API_COOKIE_SECURITY     = true
    DEPOXY_API_CORS_ALLOW_ORIGIN   = "https://app.dev.depoxy.dev"
    DEPOXY_API_HONEYCOMB_SECRET_ID = "dev/depoxy/api/honeycomb"
    DEPOXY_API_JWT_ALGORITHM       = "ES256"
    DEPOXY_API_JWT_AUDIENCE        = "DepoxyDev"
    DEPOXY_API_JWT_SECRET_ID       = "dev/depoxy/api"
    DEPOXY_API_S3_BUCKET           = "depoxy-dev"
    DEPOXY_API_S3_PREFIX           = "api"
    DEPOXY_API_SES_SENDER_ADDRESS  = "login@send.depoxy.dev"
    DEPOXY_API_STAGE               = "dev"
  }
}

The Complete Picture

We use Terraform as the means to deploy the cloud infrastructure. We need the following resources:

  • CDN: distributing static files, JavaScript, CSS, HTML, images etc.
  • HTTP load balancer
  • Code execution
  • Persistence layer

We leave out data warehousing and ETL for now; those will be covered in a separate article.

Implementation

We picked the following services:

  • CDN: AWS CloudFront
  • HTTP load balancer: AWS API Gateway (v2)
  • Code execution: AWS Lambda
  • Persistence: AWS S3

Putting It All Together

I think it is easier to understand how these services are put together when explained with a picture (and with the configuration sketch that follows).

[Figure: Infra — how CloudFront, API Gateway, Lambda, and S3 are connected]
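
In Terraform terms, the middle of that picture looks roughly like the following. This is a simplified sketch rather than our exact configuration; the resource names and the module outputs (invoke-arn, function-name) are illustrative:

  # HTTP API (API Gateway v2) that proxies every request to the Lambda function
  resource "aws_apigatewayv2_api" "api" {
    name          = "depoxy-api"
    protocol_type = "HTTP"
  }

  resource "aws_apigatewayv2_integration" "lambda" {
    api_id                 = aws_apigatewayv2_api.api.id
    integration_type       = "AWS_PROXY"
    integration_uri        = module.lambda-function.invoke-arn    # hypothetical module output
    payload_format_version = "2.0"
  }

  resource "aws_apigatewayv2_route" "default" {
    api_id    = aws_apigatewayv2_api.api.id
    route_key = "$default"
    target    = "integrations/${aws_apigatewayv2_integration.lambda.id}"
  }

  resource "aws_apigatewayv2_stage" "default" {
    api_id      = aws_apigatewayv2_api.api.id
    name        = "$default"
    auto_deploy = true
  }

  # Allow API Gateway to invoke the function
  resource "aws_lambda_permission" "api" {
    action        = "lambda:InvokeFunction"
    function_name = module.lambda-function.function-name          # hypothetical module output
    principal     = "apigateway.amazonaws.com"
    source_arn    = "${aws_apigatewayv2_api.api.execution_arn}/*/*"
  }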

So, Why Do We Use CloudFront With Lambda?

After a bit of a detour into how the infra is deployed, we can tackle the question of why we use Lambda this way. There are a few reasons.

First, we need to control where the data is stored because of GDPR and our European customers. We would like to store data in the EU. That means we have a single-region Lambda deployment, backed by single-region data persistence, so we can guarantee that the data never leaves the EU. It is also possible to deploy the stack to a different region when a customer prefers that. Running the Lambda in a single region, however, raises the question of latency for the API calls.

We can mitigate that by putting a CloudFront distribution in front of the Lambda and utilizing all the points of presence AWS has. The request enters AWS’s infrastructure at the earliest possible point and travels over the AWS backbone. This gives us decent latency distributions, which is our second reason to use them together. A sketch of such a distribution follows below.
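
Here is a hedged sketch of such a distribution, with the API Gateway v2 endpoint from the sketch above as a custom origin; caching is effectively disabled so CloudFront is used purely for its points of presence, and the values are illustrative rather than our production settings:

  resource "aws_cloudfront_distribution" "api" {
    enabled = true

    # The API Gateway v2 endpoint acts as a custom origin
    origin {
      origin_id   = "api-gw"
      domain_name = replace(aws_apigatewayv2_api.api.api_endpoint, "https://", "")

      custom_origin_config {
        http_port              = 80
        https_port             = 443
        origin_protocol_policy = "https-only"
        origin_ssl_protocols   = ["TLSv1.2"]
      }
    }

    default_cache_behavior {
      target_origin_id       = "api-gw"
      viewer_protocol_policy = "redirect-to-https"
      allowed_methods        = ["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"]
      cached_methods         = ["GET", "HEAD"]

      # API responses are not cached; CloudFront is here for the edge network
      min_ttl     = 0
      default_ttl = 0
      max_ttl     = 0

      forwarded_values {
        query_string = true
        headers      = ["Authorization"]

        cookies {
          forward = "all"
        }
      }
    }

    restrictions {
      geo_restriction {
        restriction_type = "none"
      }
    }

    viewer_certificate {
      cloudfront_default_certificate = true
    }
  }

With something like this in place, requests hit the nearest CloudFront edge location and then travel over the AWS backbone to the single region where the Lambda runs.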

We are aware that there are many different ways of using Lambda, and AWS has introduced new ones recently. However, our current setup has proved useful for us. Our goal was to create fast and secure applications for our customers at a sustainable price, but we did not intend to compromise on “developer-friendliness” either.

Deploying API Gateway with CloudFront reduces latency around the world and provides security options that API Gateway itself does not offer. The second version also enables authorizers, JWT configuration, CORS headers, and single-source API endpoints with minimal configuration (see the sketch below). In addition, AWS Lambda, acting as a serverless backend, handles capacity, scaling, and patching, enabling us to focus on development and deliver a great user experience.
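
For example, a JWT authorizer can be attached to the HTTP API from the sketch above with a handful of lines; the audience matches the DEPOXY_API_JWT_AUDIENCE variable shown earlier, while the issuer URL is a hypothetical placeholder:

  # CORS for the HTTP API is set with a cors_configuration block on the
  # aws_apigatewayv2_api resource sketched earlier, e.g.
  #   allow_origins = ["https://app.dev.depoxy.dev"]

  resource "aws_apigatewayv2_authorizer" "jwt" {
    api_id           = aws_apigatewayv2_api.api.id
    name             = "jwt-authorizer"
    authorizer_type  = "JWT"
    identity_sources = ["$request.header.Authorization"]

    jwt_configuration {
      audience = ["DepoxyDev"]             # matches DEPOXY_API_JWT_AUDIENCE above
      issuer   = "https://auth.depoxy.dev" # hypothetical issuer URL
    }
  }

  # Routes that should require the token reference the authorizer:
  #   authorization_type = "JWT"
  #   authorizer_id      = aws_apigatewayv2_authorizer.jwt.id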

About Us

We are a small consultancy hailing from Europe, working on projects mostly in the cloud and data engineering space. If you are interested in talking to us, reach out at hello at datadeft dot eu.
