Gabriel Sadaka

Deploying a .NET Core API to Google Cloud Run with Cloud SQL Postgres using Pulumi and Batect

Motivation

Having never worked with GCP or Pulumi, I thought it would be fun to explore both by building a .NET Core API that follows as many best practices as possible for production-ready cloud native services. I also wanted to explore Batect further, a tool I have used in past projects but never on a greenfield project. The intention was to develop an API that implemented every best practice I could think of for cloud native services and then share what I learnt throughout the process. Rather than wait until I have completed the journey, I have decided to share my learnings so far.

What is it?

The API is a simple .NET Core Web API that creates and retrieves weather forecasts. The forecasts are stored in a GCP Cloud SQL Postgres database while the API is deployed as a Docker container running on GCP Cloud Run. It has tests at every level of the test pyramid relevant to the component, starting with unit tests for the business logic, controllers and repository layer, and leading to integration tests that run against a Postgres Docker container.

GCP Secret Manager is used to store the database password, which is accessed by both the .NET Core API and the database schema migration tool. The schema migrations are applied using Flyway running in a Docker container, defined as a task in Batect. They are applied during deployment and every time the integration tests are run, to ensure a clean database is available for each test run.

The source code is hosted on GitHub, with GitHub Actions used to automatically build and test the application and then deploy it and its infrastructure to GCP using Pulumi. Batect defines the tasks that run in the CI/CD pipeline so that they can be run consistently both during local development and in GitHub Actions.

A few tools are used for static analysis and for security/dependency scanning to ensure the application has no known vulnerabilities. The ASP.NET Core API uses SonarAnalyzer.CSharp and Microsoft.CodeAnalysis.NetAnalyzers, both static analysis libraries for C# that detect code patterns introducing security vulnerabilities or other flaws such as memory leaks. GitHub CodeQL is another static analysis tool, run as a GitHub action. Dependabot scans the .NET dependencies to ensure they are up to date. Trivy scans the .NET Core Docker image for known vulnerabilities, while Dockle is a Dockerfile linter that detects security flaws.

What does it look like?

The Build Pipeline

Although the code is hosted on GitHub and uses GitHub Actions to automate the building and deployment, it makes use of very little GitHub Actions-specific functionality. Instead it uses Batect to define what happens at each step, enabling the developer to validate that it will work locally before pushing the code and waiting to see whether it succeeds. It also ensures that the tasks run consistently on any machine with little setup required, removing the classic "doesn't run on my machine" concern.

To build the application, the standard .NET Core SDK base image is defined as a container in Batect, along with volumes that map the code, the NuGet cache and the obj folders into the container to optimise subsequent builds by caching intermediate artefacts. A task is also defined that uses that container to build the API.

containers:
  build-env:
    image: mcr.microsoft.com/dotnet/core/sdk:3.1
    volumes:
      - local: .
        container: /code
        options: cached
      - type: cache
        name: nuget-cache
        container: /root/.nuget/packages
      - type: cache
        name: weatherApi-obj
        container: /code/src/WeatherApi/obj
      - type: cache
        name: weatherApi-tests-obj
        container: /code/src/WeatherApi.Tests/obj
    working_directory: /code

tasks:
  build:
    description: Build Weather API
    run:
      container: build-env
      command: dotnet build

To run the integration tests, the same build-env container is used along with a Postgres Docker container, which runs the schema migrations using Flyway as a setup command. A task is defined to run the tests; it depends on the Postgres container becoming healthy, which only happens once the migrations have run. The integration tests can be run locally in exactly the same way they are run in the build pipeline, so the developer can be confident that their changes won't result in a broken build.

containers:
  postgres:
    build_directory: db/tests
    ports:
      - local: 5432
        container: 5432
    setup_commands:
      - command: /scripts/run-migrations-integration-tests.sh
    volumes:
      - local: db/migrations
        container: /migrations
        options: cached
      - local: db/scripts
        container: /scripts
        options: cached
    environment:
      POSTGRES_DB: weather_db
      POSTGRES_USER: weather_user
      POSTGRES_PASSWORD: $POSTGRES_PASSWORD
      POSTGRES_HOST: postgres
      POSTGRES_PORT: "5432"
      POSTGRES_SCHEMA_NAME: public
      POSTGRES_MIGRATIONS_LOCATION: filesystem:/migrations

tasks:
  test:
    description: Test Weather API
    run:
      container: build-env
      command: dotnet test
      environment:
        ASPNETCORE_ENVIRONMENT: test
        WEATHERDB__PASSWORD: $POSTGRES_PASSWORD
    dependencies:
      - postgres

The GitHub Actions workflow now only needs to run the Batect task called test, which starts Postgres, runs the Flyway migrations and then runs the API unit and integration tests. It uses the default Ubuntu image, first checking out the code and then calling Batect to run the task.

jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Build and test application
        run: ./batect test
        env:
          POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}


The Infrastructure

Deploying the infrastructure is performed in two steps by two different GCP service accounts. There is an account named iam-svc that has the permissions to make IAM changes and enable GCP APIs. It is created manually using gcloud CLI commands prior to the first deployment of the application; the readme of the repository contains instructions for creating it. The IAM account then creates a CI account named ci-svc, with only the permissions it requires to deploy the infrastructure and deploy the application to GCP Cloud Run, along with an account named weather-api-cloud-run, which is the service account the Cloud Run app runs as. The Cloud Run service account has only the permissions it needs to run the application, access the database password secret and reach the database itself.
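
As a rough illustration of that manual bootstrap step, creating the IAM service account with the gcloud CLI could look something like the sketch below; the exact roles granted are assumptions here, and the repository readme remains the source of truth.

# Sketch only: create the bootstrap IAM service account manually.
# The role list is illustrative; see the repository readme for the real set.
gcloud iam service-accounts create iam-svc \
    --display-name "IAM Service Account" \
    --project "$GOOGLE_PROJECT"

gcloud projects add-iam-policy-binding "$GOOGLE_PROJECT" \
    --member "serviceAccount:iam-svc@$GOOGLE_PROJECT.iam.gserviceaccount.com" \
    --role "roles/resourcemanager.projectIamAdmin"

gcloud projects add-iam-policy-binding "$GOOGLE_PROJECT" \
    --member "serviceAccount:iam-svc@$GOOGLE_PROJECT.iam.gserviceaccount.com" \
    --role "roles/serviceusage.serviceUsageAdmin"

# Export a key so the account can be used as GOOGLE_CREDENTIALS in the pipeline.
gcloud iam service-accounts keys create iam-svc-key.json \
    --iam-account "iam-svc@$GOOGLE_PROJECT.iam.gserviceaccount.com"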

The deployment of the IAM changes, the infrastructure and the app itself are all performed by Pulumi. The tasks performed in GitHub Actions, like the build and test tasks, are all defined as containers/tasks in Batect.

containers:
  pulumi:
    image: pulumi/pulumi:v2.12.1
    working_directory: /app
    volumes:
      - local: infra
        container: /app
        options: cached
      - local: /var/run/docker.sock
        container: /var/run/docker.sock
    environment:
      APP_NAME: $APP_NAME
      GOOGLE_CREDENTIALS: $GOOGLE_CREDENTIALS
      GOOGLE_SERVICE_ACCOUNT: $GOOGLE_SERVICE_ACCOUNT
      GOOGLE_PROJECT: $GOOGLE_PROJECT
      GOOGLE_REGION: $GOOGLE_REGION
      PULUMI_ACCESS_TOKEN: $PULUMI_ACCESS_TOKEN
      GITHUB_SHA: $GITHUB_SHA
    entrypoint: bash

tasks:
  deploy-iam:
    description: Deploy IAM using Pulumi
    run:
      container: pulumi
      command: deploy-iam.sh
  deploy-infra:
    description: Deploy infra using Pulumi
    run:
      container: pulumi
      command: deploy-infra.sh
      environment:
        DB_NAME: $DB_NAME
        DB_USERNAME: $DB_USERNAME

The two GCP service accounts, the IAM changes and the enabling of the GCP APIs are declared using the GCP Pulumi library and deployed by the deploy-iam task. Like Terraform, Pulumi maintains the latest state of the deployment, compares it to the state declared in the latest commit and applies only the changes that were made.

...
const ciServiceAccount = new gcp.serviceaccount.Account(`ci-svc`, {
    accountId: `ci-svc`,
    description: `CI Service Account`,
    displayName: `CI Service Account`
}, {dependsOn: enableGcpApis.enableIamApi});

const storageAdminIamBinding = new gcp.projects.IAMBinding(`ci-svc-storage-admin`, {
    members: [ciServiceAccountEmail],
    role: "roles/storage.admin"
}, {parent: ciServiceAccount, dependsOn: enableGcpApis.enableIamApi});

...

export const enableCloudRunApi = new gcp.projects.Service("EnableCloudRunApi", {
    service: "run.googleapis.com",
});

The only infrastructure deployed for this application is the container registry that stores the built Docker images that GCP Cloud Run depends on, and the GCP Cloud SQL Postgres instance that the application writes weather forecasts to and reads them from. Both are defined using Pulumi and deployed by the deploy-infra task.

const registry = new gcp.container.Registry("weather-registry");

...

export const databaseInstance = new gcp.sql.DatabaseInstance(`${config.appName}-db`, {
    name: `${config.appName}-db`,
    databaseVersion: "POSTGRES_12",
    settings: {
        tier: "db-f1-micro",
        ipConfiguration: {
            ipv4Enabled: true,
            requireSsl: true
        }
    },
});

export const database = new gcp.sql.Database(`${config.appName}-db`, {
    name: config.dbName,
    instance: databaseInstance.id
});

The Database

The database schema migrations are performed by Flyway running in a Docker container, defined as a Batect container/task. The task first calls GCP Secret Manager to retrieve the database password, then connects to the Cloud SQL Postgres instance using the Cloud SQL Proxy and finally performs the schema migrations.

containers:
  flyway-migrator:
    build_directory: db
    volumes:
      - local: db/migrations
        container: /migrations
        options: cached
      - local: db/scripts
        container: /scripts
        options: cached
    environment:
      DB_PORT: "5432"
      DB_SCHEMA_NAME: public
      DB_MIGRATIONS_LOCATION: filesystem:/migrations

tasks:
  migrate-db:
    description: Run database migrations
    run:
      container: flyway-migrator
      entrypoint: /scripts/run-migrations-gcloud.sh
      environment:
        GOOGLE_PROJECT: $GOOGLE_PROJECT
        GOOGLE_REGION: $GOOGLE_REGION
        GOOGLE_CREDENTIALS: $GOOGLE_CREDENTIALS
        DB_INSTANCE: $DB_INSTANCE
        DB_HOST: localhost
        DB_NAME: $DB_NAME
        DB_USERNAME: $DB_USERNAME
        DB_PASSWORD_SECRET_ID: $DB_PASSWORD_SECRET_ID
        DB_PASSWORD_SECRET_VERSION: $DB_PASSWORD_SECRET_VERSION
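
The run-migrations-gcloud.sh script itself lives in the repository's db/scripts folder; a minimal sketch of what it does, assuming the Cloud SQL Proxy binary and the Flyway CLI are available in the flyway-migrator image, looks roughly like this:

#!/usr/bin/env bash
set -euo pipefail

# Sketch only: authenticate, fetch the DB password, proxy to Cloud SQL, migrate.
echo "$GOOGLE_CREDENTIALS" > /tmp/credentials.json
gcloud auth activate-service-account --key-file /tmp/credentials.json

# Retrieve the database password from Secret Manager.
DB_PASSWORD=$(gcloud secrets versions access "$DB_PASSWORD_SECRET_VERSION" \
    --secret "$DB_PASSWORD_SECRET_ID" --project "$GOOGLE_PROJECT")

# Start the Cloud SQL Proxy in the background and give it a moment to listen.
cloud_sql_proxy -instances="$DB_INSTANCE"=tcp:"$DB_PORT" \
    -credential_file=/tmp/credentials.json &
sleep 5

# Apply the migrations with Flyway over the proxied connection.
flyway -url="jdbc:postgresql://$DB_HOST:$DB_PORT/$DB_NAME" \
    -user="$DB_USERNAME" -password="$DB_PASSWORD" \
    -schemas="$DB_SCHEMA_NAME" -locations="$DB_MIGRATIONS_LOCATION" migrate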

The Deployment

Build, scan and publish image

To run the application in GCP Cloud Run, a Docker image needs to be built and published to a container registry that is accessible by the service. I chose GCP Container Registry since it was the easiest to get working with Cloud Run. Just like the previous tasks, the build, scan and publish image steps are defined as Batect containers/tasks using the Google Cloud SDK Alpine image.

containers:
  gcloud-sdk:
    image: google/cloud-sdk:314.0.0-alpine
    working_directory: /app
    volumes:
      - local: .
        container: /app
        options: cached
      - local: /var/run/docker.sock
        container: /var/run/docker.sock
    environment:
      APP_NAME: $APP_NAME
      GOOGLE_CREDENTIALS: $GOOGLE_CREDENTIALS
      GOOGLE_PROJECT: $GOOGLE_PROJECT
      GITHUB_SHA: $GITHUB_SHA

tasks:
  build-scan-image:
    description: Build and scan docker image for vulnerabilities
    run:
      container: gcloud-sdk
      command: infra/build-scan-image.sh
  push-image:
    description: Push docker image to GCR
    run:
      container: gcloud-sdk
      command: infra/push-image.sh

After the image is built, it is scanned for vulnerabilities using Trivy and Dockle to ensure that there are no dependencies with known vulnerabilities and that the Dockerfile follows security best practices. One of the issues Dockle identified for me was that I had forgotten to change the running user at the end of the Dockerfile, so the application was running as the root user.

IMAGE_TAG=gcr.io/"$GOOGLE_PROJECT"/"$APP_NAME":"$GITHUB_SHA"

docker build . -t "$IMAGE_TAG"

docker run --rm -v /var/run/docker.sock:/var/run/docker.sock aquasec/trivy:0.12.0 \
    --exit-code 1 --no-progress --severity CRITICAL "$IMAGE_TAG"

docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -i goodwithtech/dockle:v0.3.1 \
     --exit-code 1 --exit-level warn "$IMAGE_TAG"

The relevant part of the Dockerfile now creates and switches to a non-root user before setting the entrypoint:

FROM mcr.microsoft.com/dotnet/core/aspnet:3.1 AS runtime
WORKDIR /app

COPY --from=build /app/out ./

RUN useradd -m -s /bin/bash dotnet-user
USER dotnet-user

ENV ASPNETCORE_URLS=http://*:8080

ENTRYPOINT ["dotnet", "WeatherApi.dll"]
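
The push-image.sh script is not reproduced here; a minimal sketch, assuming it authenticates with the same GOOGLE_CREDENTIALS key used by the other scripts, could be:

#!/usr/bin/env bash
set -euo pipefail

# Sketch only: authenticate Docker against GCR and push the image built earlier.
echo "$GOOGLE_CREDENTIALS" > /tmp/credentials.json
gcloud auth activate-service-account --key-file /tmp/credentials.json
gcloud auth configure-docker --quiet

IMAGE_TAG=gcr.io/"$GOOGLE_PROJECT"/"$APP_NAME":"$GITHUB_SHA"
docker push "$IMAGE_TAG"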

Deploy app to GCP Cloud Run

Once the API is built and published to the Container Registry, it is deployed to GCP Cloud Run using Pulumi, again defined as a Batect container/task. It uses the same Pulumi container definition declared above for the IAM and infrastructure tasks.

tasks:
  deploy:
    description: Deploy Weather API to GCP Cloud Run using Pulumi
    run:
      container: pulumi
      command: deploy-app.sh
      environment:
        GOOGLE_RUN_SERVICE_ACCOUNT: $GOOGLE_RUN_SERVICE_ACCOUNT
        DB_INSTANCE: $DB_INSTANCE
        ENVIRONMENT: $ENVIRONMENT

The GCP Cloud Run service is defined to use container port 8080, as the default port of 80 cannot be bound by unprivileged users. It is also defined with a dependency on the GCP Cloud SQL Postgres instance, enabling it to access the database via the Cloud SQL Proxy. The service is set to be visible to all users and is therefore public by default; this can be changed to only allow authenticated GCP users if needed.

const weatherApi = new gcp.cloudrun.Service(appName, {
  location,
  name: appName,
  template: {
    spec: {
      containers: [
        {
          image: `gcr.io/${gcp.config.project}/${appName}:${gitSha}`,
          envs: [
            {
              name: "GOOGLE_PROJECT",
              value: gcp.config.project,
            },
            {
              name: "ASPNETCORE_ENVIRONMENT",
              value: environment,
            },
          ],
          ports: [
            {
              containerPort: 8080,
            },
          ],
        },
      ],
      serviceAccountName: googleCloudRunServiceAccount,
    },
    metadata: {
      annotations: {
        "autoscaling.knative.dev/maxScale": "2",
        "run.googleapis.com/cloudsql-instances": cloudSqlInstance,
      },
    },
  },
});

// Open the service to public unrestricted access
const iamWeatherApi = new gcp.cloudrun.IamMember(`${appName}-everyone`, {
  service: weatherApi.name,
  location,
  role: "roles/run.invoker",
  member: "allUsers",
});

The API

The API consists of a single controller that exposes two simple endpoints, one to create weather forecasts and another to retrieve them. It uses MediatR to implement a CQRS-style architecture, where each query and command has separate classes for its data transfer objects and handlers.

[HttpGet]
public async Task<IActionResult> Get([FromQuery] GetWeatherForecastQuery query, CancellationToken ct = default)
{
    var response = await _mediator.Send(query, ct);
    return Ok(response);
}

[HttpPost]
public async Task<IActionResult> Post(AddWeatherForecastCommand command, CancellationToken ct = default)
{
    var response = await _mediator.Send(command, ct);
    return Ok(response);
}

The add weather forecast command handler creates a new instance of the WeatherForecast entity and calls the repository to store it in the database.

public async Task<AddWeatherForecastResponse> Handle(AddWeatherForecastCommand request,
    CancellationToken cancellationToken)
{
    var id = Guid.NewGuid();

    await _weatherForecastsRepository.AddWeatherForecast(new WeatherForecast(id,
        request.City, request.ForecastDate, request.Forecast), cancellationToken);

    return new AddWeatherForecastResponse(id);
}

The get weather forecast query handler retrieves the forecast from the database by calling the repository, throwing a NotFoundException when the forecast doesn't exist.

public async Task<GetWeatherForecastResponse> Handle(GetWeatherForecastQuery request,
    CancellationToken cancellationToken)
{
    var weatherForecast = await _weatherForecastsRepository.GetWeatherForecast(request.City, request.ForecastDate, cancellationToken);

    if (weatherForecast == null) throw new NotFoundException();

    return new GetWeatherForecastResponse(weatherForecast.Id, weatherForecast.City, weatherForecast.ForecastDate.UtcDateTime, weatherForecast.Forecast);
}

The NotFoundException is mapped to a problem details HTTP 404 error using the .NET library Hellang.Middleware.ProblemDetails.

services.AddProblemDetails(opts =>
{
    var showExceptionDetails = Configuration["Settings:ShowExceptionDetails"].Equals("true", StringComparison.InvariantCultureIgnoreCase);
    opts.ShouldLogUnhandledException = (ctx, ex, pb) => showExceptionDetails;
    opts.IncludeExceptionDetails = (ctx, ex) => showExceptionDetails;
    opts.MapToStatusCode<NotFoundException>(StatusCodes.Status404NotFound);
});

Entity Framework Core is used for data access within the API to read and write data to the Postgres database. When running locally and in integration tests, a Postgres Docker container hosts the database, while GCP Cloud SQL is used when running in GCP Cloud Run. GCP Secret Manager stores the database password, which is therefore retrieved when the application starts if it is running in Cloud Run.

private static void ConfigureDbContext(IServiceCollection services, IConfiguration configuration)
{
    var dbSettings = configuration.GetSection("WeatherDb");
    var dbSocketDir = dbSettings["SocketPath"];
    var instanceConnectionName = dbSettings["InstanceConnectionName"];
    var databasePasswordSecret = GetDatabasePasswordSecret(dbSettings);
    var connectionString = new NpgsqlConnectionStringBuilder
    {
        Host = !string.IsNullOrEmpty(dbSocketDir)
            ? $"{dbSocketDir}/{instanceConnectionName}"
            : dbSettings["Host"],
        Username = dbSettings["User"],
        Password = databasePasswordSecret,
        Database = dbSettings["Name"],
        SslMode = SslMode.Disable,
        Pooling = true
    };

    services.AddDbContext<WeatherContext>(options =>
        options
            .UseNpgsql(connectionString.ToString())
            .UseSnakeCaseNamingConvention());
}

private static string GetDatabasePasswordSecret(IConfiguration dbSettings)
{
    var googleProject = Environment.GetEnvironmentVariable("GOOGLE_PROJECT");

    if (string.IsNullOrEmpty(googleProject)) return dbSettings["Password"];

    var dbPasswordSecretId = dbSettings["PasswordSecretId"];
    var dbPasswordSecretVersion = dbSettings["PasswordSecretVersion"];

    var client = SecretManagerServiceClient.Create();

    var secretVersionName = new SecretVersionName(googleProject, dbPasswordSecretId, dbPasswordSecretVersion);

    var result = client.AccessSecretVersion(secretVersionName);

    return result.Payload.Data.ToStringUtf8();
}

To test the API, all application code is covered by unit tests, with higher-level integration tests designed to ensure the API functions correctly when it is running. The integration tests use the ASP.NET Core testing library to run the API in an in-memory server that communicates with a Postgres Docker container to store and retrieve the weather forecasts.

protected override void ConfigureWebHost(IWebHostBuilder builder)
{
    if (builder == null) throw new ArgumentNullException(nameof(builder));
    builder.ConfigureServices(services =>
    {
        var sp = services.BuildServiceProvider();

        using var scope = sp.CreateScope();

        var scopedServices = scope.ServiceProvider;
        var db = scopedServices.GetRequiredService<WeatherContext>();
        var logger = scopedServices.GetRequiredService<ILogger<WeatherApiWebApplicationFactory<TStartup>>>();

        db.Database.EnsureCreated();

        InitializeDbForTests(db);
    });
}

private static void InitializeDbForTests(WeatherContext db)
{
    db.WeatherForecasts.RemoveRange(db.WeatherForecasts);

    db.SaveChanges();

    db.WeatherForecasts.Add(new WeatherForecast(Guid.NewGuid(), "Australia/Melbourne",
        new DateTimeOffset(2020, 01, 02, 0, 0, 0, TimeSpan.Zero), 23.35m));

    db.SaveChanges();
}

An example integration test then exercises the GET endpoint against the seeded forecast:

[Fact]
public async Task GetReturnsCorrectWeather()
{
    const string city = "Australia/Melbourne";
    const string forecastDate = "2020-01-02T00:00:00+00:00";
    var url = QueryHelpers.AddQueryString("/api/weather-forecasts", new Dictionary<string, string>
    {
        {"city", city},
        {"forecastDate", forecastDate}
    });

    using var client = _factory.CreateClient();
    using var response = await client.GetAsync(new Uri(url, UriKind.Relative));
    var responseContent = await response.Content.ReadAsStringAsync();
    var responseObj = JsonSerializer.Deserialize<object>(responseContent) as JsonElement?;

    Assert.Equal(city, responseObj?.GetProperty("city").ToString());
    Assert.Equal(forecastDate, responseObj?.GetProperty("forecastDate").ToString());
    Assert.Equal(23.35m,
        decimal.Parse(responseObj?.GetProperty("forecast").ToString()!, CultureInfo.InvariantCulture));
}

What could come next?

There are several other things I planned to implement that would generally be required to deploy an application to production safely and ensure it continues to run correctly.

Currently the application is only deployed to a single environment. Ideally there should be at least one other environment that it is deployed to first, ensuring it works there before it is deployed to production.
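
One way to get there with Pulumi is a stack per environment, deployed in sequence from the same commit; a sketch of that flow, with illustrative stack names, might look like:

# Sketch only: one Pulumi stack per environment, promoted in order.
pulumi stack init staging                                   # one-off setup
pulumi config set gcp:project "$GOOGLE_PROJECT" --stack staging

pulumi up --stack staging --yes     # deploy and verify in staging first
pulumi up --stack production --yes  # then promote the same commit to production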

While GCP Cloud Run aggregates the logs from the running container, it does not automatically enable distributed tracing and alerting, which will be needed to ensure the application is running as expected in production.

There is currently no automated linting tool for the C# code; instead ReSharper code cleanup was run manually for each commit to ensure consistent formatting. The ReSharper CLI or dotnet format could be used as a pre-commit hook to automatically lint the modified code.
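
For example, a simple pre-commit hook could run dotnet format before every commit; the exact flags differ between dotnet-format versions, so treat this as a sketch only:

#!/usr/bin/env bash
# Sketch only: .git/hooks/pre-commit that formats the code before each commit.
set -euo pipefail

dotnet format        # format the solution in the repository root
git add -u           # re-stage any tracked files the formatter rewrote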

Once the API is deployed, automated functional tests can be written to ensure that it continues to function correctly in a deployed environment communicating with a real database. Automated security scanning can also be performed against the running API using tools such as OWASP ZAP to detect any potential runtime vulnerabilities.
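
A passive OWASP ZAP baseline scan, for instance, can be run as a container against the deployed service URL (the URL below is a placeholder):

# Sketch only: run an OWASP ZAP baseline scan against the deployed API.
docker run --rm -t owasp/zap2docker-stable zap-baseline.py \
    -t "https://<cloud-run-service-url>/api/weather-forecasts"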

The API allows any user to retrieve and create weather forecasts, which is far from ideal and is also why I manually turn off the database when I am not using it. The API should authenticate users and then ensure they are authorised to either retrieve or create weather forecasts.

The database should be configured to automatically back up on a regular cadence to enable recovery if data is lost. There should also be a separate user for deploying schema migrations and another for accessing the database via the API, so that a compromised application cannot make unintended schema changes.
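
Both of these could be applied directly to the Cloud SQL instance, either through the Pulumi DatabaseInstance settings or with the gcloud CLI; a hedged sketch using gcloud, with placeholder names:

# Sketch only: enable daily automated backups on the Cloud SQL instance.
gcloud sql instances patch "$DB_INSTANCE_NAME" --backup-start-time=02:00

# Sketch only: create a dedicated user for schema migrations,
# separate from the user the API connects with.
gcloud sql users create weather_migrator \
    --instance="$DB_INSTANCE_NAME" --password="$MIGRATOR_PASSWORD"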

The deployment pipeline takes roughly 10 minutes to build, test and deploy the application, which can definitely be reduced by a number of optimisations such as caching third-party dependencies between steps and pre-building slim, purpose-built Docker images for each step.

Although the API code and the Dockerfile currently have security scanning, the bash scripts that automate the various steps don't, so a tool such as ShellCheck could be used to lint and scan them. The Pulumi TypeScript code also does not have any linting or security scanning.
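
ShellCheck can be run from its official Docker image, which would fit the existing Batect container/task pattern; for example (the script paths are illustrative):

# Sketch only: lint the repository's bash scripts with ShellCheck.
docker run --rm -v "$PWD":/mnt koalaman/shellcheck:stable infra/*.sh db/scripts/*.sh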

The API requests don't currently undergo any validation other than what is built into ASP.NET Core, so a library such as FluentValidation could be used to validate them and ensure requests are as expected.

I am sure there are more ways I could improve this application to make it more suitable for production workloads and I intend to continue to iterate on it, sharing my learnings as I continue down this road. If anyone reading this post has any ideas to improve this further I am keen to hear them!

Try it yourself

All of the code in this post is part of a sample application hosted as a public repository on GitHub, which you can access here. The readme contains steps to deploy it to your own GCP project if you would like to try it out. I look forward to hearing your thoughts/feedback in the comments below.
