Iñaki Villar

Resource observability case study: jemalloc in Android builds

As build engineers, one of our biggest concerns is running out of memory during Gradle builds, because the impact on developers is significant. When memory runs low, the system may kill the Gradle daemon and fail the CI build, or frequent garbage collection (GC) may slow the build down. Most importantly, both outcomes undermine the team's confidence in running builds on CI.

We often spend time searching for magic formulas to configure the system optimally, but the reality is more complex, with multiple dimensions of problems that vary for each project:

  • Agent resources: Hardware always matters—not just CPU/memory, but also network and disk.
  • Type of build: For instance, executing costly test tasks in parallel or performing intensive R8 operations at the end of the build.
  • Nature of the build: Is the build dominated by cache hits, or do we have builds that consistently apply memory pressure? Are we providing layers of caching, like dependencies or wrappers, in our scenarios?
  • Project structure: Do we have a large legacy module with thousands of compilation units?

As you can see, it's hard to find the perfect formula. That's why I'm opinionated and always try to design for the worst-case scenario, ensuring reliability in CI builds when running all tasks on a clean agent. However, once that state is achieved, we still need to consider multiple cases to continue making improvements.

This underscores the importance of having the appropriate tools to monitor performance and automated tools to experiment under different scenarios. With these, we can systematically observe and analyze how different configurations behave, ensuring we make data-driven decisions.

Fortunately, Develocity 2024.2 introduces build resource usage observability, a feature that was previously missing in build scans:

[Image: build resource usage in a build scan]

Now, we have complete information on key metrics like memory usage, CPU, network, disk, and more. What's even better is the availability of new API endpoints that we can integrate with our monitoring systems to evaluate performance across these different metrics.

As a demonstration, I want to measure something that caught my attention months ago, a topic brought up by Jason Pearson: using jemalloc as the native memory allocator for Android builds. The initial claim is that jemalloc reduces memory usage by optimizing how memory is allocated and deallocated. It is designed to minimize memory fragmentation and improve performance, particularly in multithreaded applications, which makes it a good candidate for resource-intensive builds.

Experiment

To explore the impact of different native memory allocators on Android build performance, we designed an experiment with two variants:

  • Default native memory allocator
  • jemalloc

We conducted the experiment using GitHub Actions, with two distinct Docker images based on amazoncorretto:17-al2023-jdk. For the jemalloc image variant, we configured the allocator with the following Dockerfile instructions:

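# Download and unpack the jemalloc 5.3.0 release tarball
RUN curl -L "https://github.com/jemalloc/jemalloc/releases/download/5.3.0/jemalloc-5.3.0.tar.bz2" -o jemalloc.tar.bz2
RUN tar -xf jemalloc.tar.bz2
# Build and install; the library is installed under /usr/local/lib by default
RUN cd jemalloc-5.3.0/ && ./configure && make && make install
# Preload jemalloc so processes started in the container, including the JVM, use it
ENV LD_PRELOAD=/usr/local/lib/libjemalloc.so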
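Setting LD_PRELOAD only helps if the library is actually mapped into the process. One way to check, sketched here under the assumption of a Linux container with /proc available, is to scan the process memory map for libjemalloc. The function names are illustrative and not part of the original setup.

```python
from pathlib import Path


def jemalloc_loaded(maps_text: str) -> bool:
    """Return True if any mapped file in a /proc/<pid>/maps dump is libjemalloc."""
    for line in maps_text.splitlines():
        fields = line.split()
        # The sixth column, when present, is the path of the mapped file.
        if len(fields) >= 6 and "libjemalloc" in fields[5]:
            return True
    return False


def process_uses_jemalloc(pid: str = "self") -> bool:
    """Inspect a live process (Linux only); pid "self" means the current one."""
    return jemalloc_loaded(Path(f"/proc/{pid}/maps").read_text())
```

Passing the PID of the Gradle daemon instead of "self" would check the build process itself.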

Next, in the build.yaml file, we set up the configuration to test both variants:

strategy:
    matrix:
        variant: ["cdsap/android-builder:0.5", "cdsap/android-builder-jemalloc:0.5"]
        runs: ${{ fromJson(needs.iterations.outputs.iterations) }}
runs-on: ubuntu-latest
container:
    image: ${{ matrix.variant }}

The experiment was executed with 100 iterations (each on a fresh agent without cache) for each variant, using the nowinandroid project and focusing on the assembleRelease task. By comparing the results, we aimed to assess the effectiveness of jemalloc in reducing memory usage and improving build performance in a CI environment.
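To make the job count concrete, here is a small sketch of how the two matrix axes combine. The `iterations` job is not shown in this article, so the helper that produces its JSON output, and the 1..N numbering, are assumptions for illustration only.

```python
import itertools
import json

variants = ["cdsap/android-builder:0.5", "cdsap/android-builder-jemalloc:0.5"]


def iterations_output(count: int) -> str:
    """JSON array string that a setup job could expose as its `iterations` output (assumed shape)."""
    return json.dumps(list(range(1, count + 1)))


def expand_matrix(variants: list[str], iterations_json: str) -> list[dict]:
    """Cross product of both matrix axes: one job per (variant, run) pair."""
    runs = json.loads(iterations_json)
    return [{"variant": v, "runs": r} for v, r in itertools.product(variants, runs)]


jobs = expand_matrix(variants, iterations_output(100))
print(len(jobs))  # 200
```

Two variants times 100 iterations gives 200 jobs, which stays under the 256-job limit GitHub Actions applies to a matrix in a single workflow run.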

Metrics

To get a clear picture of resource usage during the experiment, we used one of the new endpoints provided by the Develocity API: api/builds/$buildScanId/gradle-resource-usage. This endpoint returns detailed resource usage metrics for the whole build. A response looks like this (most metric objects are left empty here for brevity):

{
    "totalMemory": 68719476736,
    "total": {
        "allProcessesCpu": {
            "max": 98,
            "average": 79,
            "median": 86,
            "p5": 40,
            "p25": 85,
            "p75": 89,
            "p95": 95
        },
        "buildProcessCpu": {},
        "buildChildProcessesCpu": {},
        "allProcessesMemory": {},
        "buildProcessMemory": {},
        "buildChildProcessesMemory": {},
        "diskReadThroughput": {},
        "diskWriteThroughput": {},
        "networkUploadThroughput": {},
        "networkDownloadThroughput": {}
    }
}
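As a rough sketch, the endpoint can be called with nothing but the Python standard library. The server URL and the way the access key is obtained are placeholders, and the Bearer-token header reflects my understanding of how the Develocity API authenticates, so check it against your own installation.

```python
import json
import urllib.request


def resource_usage_request(server: str, build_scan_id: str, access_key: str) -> urllib.request.Request:
    """Build the GET request for one build's resource usage."""
    url = f"{server.rstrip('/')}/api/builds/{build_scan_id}/gradle-resource-usage"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_key}", "Accept": "application/json"},
    )


def fetch_resource_usage(server: str, build_scan_id: str, access_key: str) -> dict:
    """Send the request and decode the JSON body."""
    request = resource_usage_request(server, build_scan_id, access_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes it easy to check the URL and headers without contacting a server.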

One key insight from the metrics exposed by the new endpoint is that it provides not only memory metrics specific to the build process but also the total memory usage on the agent. This is perfect for the purpose of our experiment.
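A minimal sketch of pulling those two views out of a response follows. It assumes the memory objects carry the same statistic names as allProcessesCpu in the sample (max, average, median, and so on); the sample shows them empty, so that shape and the units of the values are not confirmed here.

```python
GIB = 1024 ** 3


def memory_summary(usage: dict) -> dict:
    """Pick the agent total plus peak memory for all processes and the build process."""
    total = usage["total"]
    return {
        "agent_total_gib": usage["totalMemory"] / GIB,
        # .get() guards against metric objects that come back empty.
        "all_processes_max": total.get("allProcessesMemory", {}).get("max"),
        "build_process_max": total.get("buildProcessMemory", {}).get("max"),
    }
```

For the sample response, totalMemory of 68719476736 bytes works out to exactly 64 GiB.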

Using the tags added to each variant execution, we then pulled the data and aggregated the results from all 100 iterations for each version of the experiment.
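The aggregation step could look roughly like the sketch below. The record shape, one dict per build with a `tag` and an `all_processes_max` value, is an assumption for illustration and not the exact structure used in the experiment.

```python
from collections import defaultdict
from statistics import mean, median


def aggregate_by_variant(builds: list[dict]) -> dict:
    """Group per-build peak memory by variant tag and summarize each group."""
    grouped = defaultdict(list)
    for build in builds:
        grouped[build["tag"]].append(build["all_processes_max"])
    return {
        tag: {
            "runs": len(values),
            "mean": mean(values),
            "median": median(values),
            "max": max(values),
        }
        for tag, values in grouped.items()
    }
```

Reporting the median alongside the mean helps when a few outlier builds would otherwise dominate the comparison.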

Results

All processes:

[Image: memory usage over time for all processes, jemalloc vs. malloc]

  • The time-series data shows that jemalloc has generally lower and more stable memory usage over time compared to malloc.
  • There are fewer spikes in memory usage with jemalloc, indicating that it may provide a more consistent memory footprint.

Main build process:

[Image: memory usage over time for the main build process, jemalloc vs. malloc]

  • For the main build process, jemalloc again shows a more stable and lower memory usage pattern compared to malloc.

  • malloc has higher peaks and more variability, which can be less desirable in a memory-constrained environment.
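"Peaks" and "variability" can also be put into numbers rather than read off a chart. The following sketch, which is one possible choice of measures and not the analysis behind the charts here, computes the sample standard deviation, the coefficient of variation, and the peak-to-median ratio for a series of memory readings.

```python
from statistics import mean, median, stdev


def variability(samples: list[float]) -> dict:
    """Summarize how much a series of memory readings fluctuates."""
    average = mean(samples)
    spread = stdev(samples)  # sample standard deviation; needs at least two readings
    return {
        "stdev": spread,
        # Coefficient of variation: spread relative to the average level.
        "cv": spread / average,
        # How far the highest reading sits above the typical one.
        "peak_to_median": max(samples) / median(samples),
    }
```

A lower coefficient of variation and a peak-to-median ratio closer to 1 would correspond to the steadier footprint described above.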

Summary results:

[Image: summary of results for both variants]

Final words

The results of this experiment are inherently influenced by the specific project and the environment in which the scenarios were executed. Nevertheless, in this experiment jemalloc showed slightly lower and more stable memory usage than the default allocator.

The primary focus of this article is to highlight the new opportunities introduced with the latest release of Develocity 2024.2, particularly the enhanced build resource usage information now available in build scans and through the Develocity API. These new features provide deeper insights into memory usage, enabling more informed decision-making and optimization in your development workflows.

Happy building!
