DEV Community

nofarb

A new approach to caching in continuous integration (CI)

In this post, we’ll introduce Cache Intelligence, a feature that simplifies caching in continuous integration (CI) pipelines by automatically caching and restoring known dependencies. With Cache Intelligence, offered in the Harness Continuous Integration module, developers can shorten CI build times and deliver software more efficiently.

Why Cache Intelligence?
CI builds often require downloading and installing large dependent software components, such as libraries, which can be very time-consuming. When building locally or reusing build machines for multiple builds, a cache is usually available to speed up the process. However, this is often not the case when using modern continuous integration solutions because they execute pipelines in ephemeral environments, such as containers or virtual machines, which are terminated after execution. This approach ensures that pipelines run in isolated environments, free from interference by other build instances, but the lack of a local cache can result in longer build times. As such, configuring a cache is recommended when using ephemeral build environments, but this task can be tedious.

The responsibility for incorporating this logic into pipelines and managing the cache typically falls on the user. Even though caching can already be achieved in various ways (such as save and restore cache steps or mounting volumes), engineers are still responsible for the bulk of the configuration and maintenance, which takes time away from development. Many developers don’t even realize that caching can shorten their build times.
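
For comparison, a manually configured cache usually means a restore step before the build and a save step after it. The fragment below is a rough sketch of that pattern with an S3-backed cache. The step types and fields reflect our understanding of the Harness CI save and restore cache steps and should be checked against the current documentation; the connector, region, bucket, and key values are hypothetical placeholders.

```yaml
# Sketch only: the kind of manual cache handling described above.
# connectorRef, region, bucket, and key are hypothetical placeholders.
steps:
  - step:
      type: RestoreCacheS3
      name: Restore Cache
      identifier: Restore_Cache
      spec:
        connectorRef: my_aws_connector
        region: us-east-1
        bucket: my-ci-cache-bucket
        key: yarn-{{ checksum "yarn.lock" }}   # new key whenever yarn.lock changes
        archiveFormat: Tar
  - step:
      type: Run
      name: Build
      identifier: Build
      spec:
        shell: Sh
        command: |-
          yarn
          yarn build
  - step:
      type: SaveCacheS3
      name: Save Cache
      identifier: Save_Cache
      spec:
        connectorRef: my_aws_connector
        region: us-east-1
        bucket: my-ci-cache-bucket
        key: yarn-{{ checksum "yarn.lock" }}
        sourcePaths:
          - node_modules                       # directories to archive
        archiveFormat: Tar
```

Every pipeline that wants caching has to carry and maintain steps like these, along with the storage behind them.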

Harness Cache Intelligence simplifies the build process in CI pipelines by streamlining cache management and automatically caching and restoring known dependencies. We store the cache securely in Harness Cloud, our fully hosted build environment, so you don’t need to provide your own storage. With cache management handled for them, developers get more streamlined workflows and can focus on writing code.

Let’s demonstrate how users can get started with Cache Intelligence to see how this new feature significantly reduces build time.

Demonstrating the Impact of Cache Intelligence on Build Speed
To show how effective Cache Intelligence is in reducing build time, we ran two consecutive builds of the Harness Developer Hub (a web application built with Yarn).

In the first execution, the cache was unavailable, and the build step took 9m 5s, with the entire build process totaling 10m 56s.

In the second execution, the cache was restored and utilized, reducing the build step to 3m 10s and the overall build time to just 6m 1s.

The result is a 45% reduction in the total build duration, demonstrating the significant impact of Cache Intelligence on optimizing build performance.
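
As a quick check, the percentage follows directly from the timings above once both runs are converted to seconds (10m 56s = 656 s, 6m 1s = 361 s, 9m 5s = 545 s, 3m 10s = 190 s):

```latex
\[
\text{total build: } \frac{656 - 361}{656} = \frac{295}{656} \approx 0.45
\qquad
\text{build step: } \frac{545 - 190}{545} = \frac{355}{545} \approx 0.65
\]
```

The build step alone ran about 65% faster. The rest of the pipeline took about a minute longer in the second run (171 s versus 111 s), which is why the overall reduction is smaller than the reduction in the build step.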

[Figure: build duration comparison]

The images below show how the duration of individual steps differs when the cache is and is not available:

[Figure: step durations when the cache is not available]

[Figure: step durations when the cache is available]

This comparison showcases the significant improvements in build speed and efficiency from Cache Intelligence.

Try it yourself — Setting Up and Using Cache Intelligence
When using Harness CI, you first need to add the ‘caching’ property to your CI stage. The cache itself is stored in Harness Cloud, so there is no storage to set up.

- stage:
    name: Build Jhttp
    identifier: Build_Jhttp
    type: CI
    spec:
      caching:            # add this line
        enabled: true     # add this line
      cloneCodebase: true

Currently, automatic dependency detection and caching support the Bazel, Maven, Gradle, Node, Yarn, and Go build tools. If you’re using a different build tool or have dependencies stored in non-standard locations, you can still use Cache Intelligence by specifying the paths you’d like to cache. For instance:

- stage:
    name: Build Jhttp
    identifier: Build_Jhttp
    type: CI
    spec:
      caching:
        enabled: true
        paths:             # add 'paths' under 'caching'
          - tmp/cache      # specify one or more paths to cache
      cloneCodebase: true
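
As a further illustration, several paths can be listed under the same property. The sketch below assumes the lowercase `paths` key from the Harness YAML schema and uses hypothetical directory names. The optional `key` line with a checksum template reflects our understanding of how custom cache keys are set, so verify it against the Harness documentation before relying on it.

```yaml
# Sketch only: directory names and the cache key are hypothetical.
- stage:
    name: Build Jhttp
    identifier: Build_Jhttp
    type: CI
    spec:
      caching:
        enabled: true
        key: deps-{{ checksum "package.json" }}   # assumption: custom cache key
        paths:
          - tmp/cache                             # path from the example above
          - vendor/bundle                         # hypothetical additional path
          - .tooling/downloads                    # hypothetical additional path
      cloneCodebase: true
```

Keeping the list limited to directories that are expensive to rebuild keeps the cache small, which in turn keeps save and restore times short.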

Here’s a step-by-step example demonstrating how you can use Cache Intelligence:

  1. Fork the Harness Developer Hub GitHub repository (https://github.com/harness/developer-hub) into your account.
  2. Sign up for a free Harness account (or log in to an existing account).
  3. Follow the Get Started wizard in Harness CI.
  4. Configure the connection to your source code manager.
  5. When you are prompted to select a source code manager, select ‘GitHub’ and choose either OAuth or Access Token as the authentication method that Harness CI will use to connect to your GitHub account.
  6. When you are prompted to select a repository, select the repository that you forked in the first step, and then select Configure Pipeline.
  7. Select ‘Create empty pipeline configuration’ and then select Create Pipeline.

After following these steps, you will have a basic ‘hello world’ pipeline that clones your forked repository and prints a welcome message. Modify the pipeline as follows.

Add the ‘caching’ property under the stage’s spec property:

caching:          
  enabled: true 

Replace the ‘Echo Welcome Message’ step with the following step, which builds the application:

- step:
    type: Run
    name: Build
    identifier: Build
    spec:
      connectorRef: account.harnessImage
      image: node:18
      shell: Sh
      command: |-
        yarn
        yarn build

At the end you will have a pipeline similar to the following:

pipeline:
  name: Harness Developer Hub
  identifier: Harness_Developer_Hub
  projectIdentifier: playground
  orgIdentifier: default
  tags: {}
  properties:
    ci:
      codebase:
        connectorRef: Nofar_githubcom
        repoName: harness/developer-hub
        build: <+input>
  stages:
    - stage:
        name: Build
        identifier: Build
        type: CI
        spec:
          caching:
            enabled: true
          cloneCodebase: true
          platform:
            os: Linux
            arch: Amd64
          runtime:
            type: Cloud
            spec: {}
          execution:
            steps:
              - step:
                  type: Run
                  name: Build
                  identifier: Build
                  spec:
                    connectorRef: account.nofar_dockerhub
                    image: node:18
                    shell: Sh
                    command: |-
                      yarn
                      yarn
                      yarn build

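
To see for yourself whether the cache was restored on a second run, one option is a small diagnostic step placed before the Build step. This is a hypothetical addition, not part of the pipeline above, and it assumes that restored Yarn dependencies land in `node_modules` at the root of the cloned repository.

```yaml
# Hypothetical diagnostic step; place it before the Build step.
# Assumes restored dependencies live in ./node_modules.
- step:
    type: Run
    name: Check Cache
    identifier: Check_Cache
    spec:
      connectorRef: account.harnessImage
      image: node:18
      shell: Sh
      command: |-
        if [ -d node_modules ]; then
          echo "node_modules found: $(ls node_modules | wc -l) entries"
        else
          echo "node_modules not found"
        fi
```

On the first run the step should report that the directory is missing. On later runs it should report a populated directory if the cache was restored.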

Get Started with Harness Cache Intelligence
Harness Cache Intelligence speeds up builds in CI pipelines by taking cache management off your hands. Ready to start streamlining your cache management? Sign up for free today, or request a demo!
