DEV Community

Chris White


GitHub Actions Runner Deep Dive: Registration and Setup

When running code in GitHub Actions, one detail that is mostly abstracted away is how the code is actually run, from authentication all the way to job execution. The source code of the runner, which orchestrates all of this, is available in a GitHub repository. It is written primarily in C#, which is portable across several platforms through the openly available .NET runtime. This article looks at what happens before any of the log output you see in the GitHub Actions UI.

Execution Entry

Looking around the repository, there is a run.sh and a run.cmd, each of which calls a helper script (run-helper.sh or run-helper.cmd) in a loop:

run() {
    # run the helper process which keep the listener alive
    while :;
    do
        cp -f "$DIR"/run-helper.sh.template "$DIR"/run-helper.sh
        "$DIR"/run-helper.sh $*
        returnCode=$?
        if [[ $returnCode -eq 2 ]]; then
            echo "Restarting runner..."
        else
            echo "Exiting runner..."
            exit 0
        fi
    done
}

:launch_helper
copy "%~dp0run-helper.cmd.template" "%~dp0run-helper.cmd" /Y
call "%~dp0run-helper.cmd" %*

if %ERRORLEVEL% EQU 1 (
  echo "Restarting runner..."
  goto :launch_helper
) else (  
  echo "Exiting runner..."
  exit /b 0
)
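
The restart loop in both scripts boils down to a small supervisor pattern: start the helper, inspect its exit code, and start it again only when the code asks for a restart. The following is a minimal C# sketch of that pattern, not code from the runner. RestartLoopSketch, RunUntilExit, and the helper delegate are hypothetical names, and the default restart code of 2 simply mirrors the value checked in run.sh above (run.cmd checks for 1).

```csharp
using System;

public static class RestartLoopSketch
{
    // Assumption: mirrors the "2" that run.sh compares the return code against.
    public const int RestartExitCode = 2;

    // Calls `helper` repeatedly for as long as it returns the restart code.
    // Returns how many times the helper was launched.
    public static int RunUntilExit(Func<int> helper, int restartCode = RestartExitCode)
    {
        int launches = 0;
        while (true)
        {
            launches++;
            int returnCode = helper();
            if (returnCode != restartCode)
            {
                // Any other code ends the loop, as in the scripts above.
                return launches;
            }
        }
    }
}
```

A helper that returns 2, 2, and then 0 would be launched three times. Note that the scripts above exit with 0 once the loop ends, regardless of the helper's final return code.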

The helpers are themselves generated from template files: on every iteration of the loop, run-helper.sh.template and run-helper.cmd.template are copied over run-helper.sh and run-helper.cmd. Both helpers then run the Runner.Listener executable:

"$DIR"/bin/Runner.Listener run $*
"%~dp0\bin\Runner.Listener.exe" run %*

The Listener

The entry point of Runner.Listener is Program.cs in the source code. The first thing it does is load an environment via this method:

private static void LoadAndSetEnv()
{
    var binDir = Path.GetDirectoryName(Assembly.GetEntryAssembly().Location);
    var rootDir = new DirectoryInfo(binDir).Parent.FullName;
    string envFile = Path.Combine(rootDir, ".env");
    if (File.Exists(envFile))
    {
        var envContents = File.ReadAllLines(envFile);
        foreach (var env in envContents)
        {
            if (!string.IsNullOrEmpty(env))
            {
                var separatorIndex = env.IndexOf('=');
                if (separatorIndex > 0)
                {
                    string envKey = env.Substring(0, separatorIndex);
                    string envValue = null;
                    if (env.Length > separatorIndex + 1)
                    {
                        envValue = env.Substring(separatorIndex + 1);
                    }

                    Environment.SetEnvironmentVariable(envKey, envValue);
                }
            }
        }
    }
}
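
To make the parsing rules easier to see, here is a small side-effect-free sketch that applies the same key/value splitting to a list of lines and returns a dictionary instead of touching the process environment. EnvFileSketch and Parse are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EnvFileSketch
{
    // Applies the same rules as the method above: skip empty lines,
    // split on the first '=', and require a non-empty key.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, string>();
        foreach (var line in lines)
        {
            if (string.IsNullOrEmpty(line))
            {
                continue;
            }

            int separatorIndex = line.IndexOf('=');
            if (separatorIndex <= 0)
            {
                // No '=' at all, or a line that starts with '=' (empty key).
                continue;
            }

            string key = line.Substring(0, separatorIndex);
            // Nothing after '=' yields null, matching the original logic.
            string value = line.Length > separatorIndex + 1
                ? line.Substring(separatorIndex + 1)
                : null;
            result[key] = value;
        }
        return result;
    }
}
```

Two details stand out: only the first '=' separates key from value, so a line such as PATH=/a=b keeps "/a=b" as its value, and a line such as EMPTY= produces a null value. In .NET, passing null to Environment.SetEnvironmentVariable removes the variable rather than setting it to an empty string.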

LoadAndSetEnv loads a .env file from the runner's root directory, if one exists, and sets each KEY=VALUE line in the program's environment. Such a file can be generated by env.sh, which pulls in several system and language environment variables. Once that is done, various system-level checks and command validation take place. Control is then handed to a runner via an asynchronous task (the majority of the codebase is asynchronous):

IRunner runner = context.GetService<IRunner>();
try
{
    var returnCode = await runner.ExecuteCommand(command);
    trace.Info($"Runner execution has finished with return code {returnCode}");
    return returnCode;
}

Runner Configuration

The runner itself lives in Runner.cs. The actual runner command has several subcommands:

  • Help: Print usage information
  • Version: Print runner version information
  • Commit: Print commit hash the runner was compiled from
  • Check: Ensures GitHub connectivity
  • Configure: Configure the GitHub runner for first time use
  • Remove: Essentially a runner uninstall
  • Run: Execute the runner

Looking at the code, the runner won't even start if it hasn't been configured:

if (command.Run) // this line is current break machine provisioner.
{
    // Error if runner not configured.
    if (!configManager.IsConfigured())
    {
        _term.WriteError("Runner is not configured.");
        PrintUsage(command);
        return Constants.Runner.ReturnCode.TerminatedError;
    }

Configuration itself is handled through configManager:

try
{
    await configManager.ConfigureAsync(command);
    return Constants.Runner.ReturnCode.Success;
}

configManager is defined in ConfigurationManager.cs. This file contains the various configuration options to set up, along with some fancy ASCII art:

            _term.WriteLine();
            _term.WriteLine("--------------------------------------------------------------------------------");
            _term.WriteLine("|        ____ _ _   _   _       _          _        _   _                      |");
            _term.WriteLine("|       / ___(_) |_| | | |_   _| |__      / \\   ___| |_(_) ___  _ __  ___      |");
            _term.WriteLine("|      | |  _| | __| |_| | | | | '_ \\    / _ \\ / __| __| |/ _ \\| '_ \\/ __|     |");
            _term.WriteLine("|      | |_| | | |_|  _  | |_| | |_) |  / ___ \\ (__| |_| | (_) | | | \\__ \\     |");
            _term.WriteLine("|       \\____|_|\\__|_| |_|\\__,_|_.__/  /_/   \\_\\___|\\__|_|\\___/|_| |_|___/     |");
            _term.WriteLine("|                                                                              |");
            _term.Write("|                       ");
            _term.Write("Self-hosted runner registration", ConsoleColor.Cyan);
            _term.WriteLine("                        |");
            _term.WriteLine("|                                                                              |");
            _term.WriteLine("--------------------------------------------------------------------------------");

The core configuration occurs with the runner registration:

else
{
    runnerSettings.GitHubUrl = inputUrl;
    registerToken = await GetRunnerTokenAsync(command, inputUrl, "registration");
    GitHubAuthResult authResult = await GetTenantCredential(inputUrl, registerToken, Constants.RunnerEvent.Register);
    runnerSettings.ServerUrl = authResult.TenantUrl;
    runnerSettings.UseV2Flow = authResult.UseV2Flow;
    Trace.Info($"Using V2 flow: {runnerSettings.UseV2Flow}");
    creds = authResult.ToVssCredentials();
    Trace.Info("cred retrieved via GitHub auth");
}

GetRunnerTokenAsync produces the token used during runner registration. If a GitHub Personal Access Token is supplied, it is exchanged for a runner token that carries an expiry time; otherwise the value of the --token argument is used as-is:

private async Task<string> GetRunnerTokenAsync(CommandSettings command, string githubUrl, string tokenType)
{
    var githubPAT = command.GetGitHubPersonalAccessToken();
    var runnerToken = string.Empty;
    if (!string.IsNullOrEmpty(githubPAT))
    {
        Trace.Info($"Retriving runner {tokenType} token using GitHub PAT.");
        var jitToken = await GetJITRunnerTokenAsync(githubUrl, githubPAT, tokenType);
        Trace.Info($"Retrived runner {tokenType} token is good to {jitToken.ExpiresAt}.");
        HostContext.SecretMasker.AddValue(jitToken.Token);
        runnerToken = jitToken.Token;
    }

    if (string.IsNullOrEmpty(runnerToken))
    {
        if (string.Equals("registration", tokenType, StringComparison.OrdinalIgnoreCase))
        {
            runnerToken = command.GetRunnerRegisterToken();
        }
        else
        {
            runnerToken = command.GetRunnerDeletionToken();
        }
    }

    return runnerToken;
}
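
Stripped of the network call, the precedence implemented above is short enough to express as a pure function. This is only a sketch of the selection order under the assumption that the PAT exchange has already happened (or was skipped). RunnerTokenSketch, Select, and the parameter names are hypothetical.

```csharp
using System;

public static class RunnerTokenSketch
{
    // patDerivedToken: token obtained through the PAT exchange, or empty/null
    // when no PAT was supplied. The other two stand in for command-line values.
    public static string Select(
        string patDerivedToken,
        string registerToken,
        string deletionToken,
        string tokenType)
    {
        // A token obtained via the PAT always wins.
        if (!string.IsNullOrEmpty(patDerivedToken))
        {
            return patDerivedToken;
        }

        // Otherwise fall back to the token that matches the requested type.
        return string.Equals("registration", tokenType, StringComparison.OrdinalIgnoreCase)
            ? registerToken
            : deletionToken;
    }
}
```

The comparison on tokenType is case-insensitive, and any value other than "registration" selects the deletion token, just as in the method above.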

After that, GetTenantCredential calls the runner-registration endpoint, whose URL depends on whether the target is hosted GitHub (for example github.com) or a self-managed GitHub Enterprise Server instance:

if (UrlUtil.IsHostedServer(gitHubUrlBuilder))
{
    githubApiUrl = $"{gitHubUrlBuilder.Scheme}://api.{gitHubUrlBuilder.Host}/actions/runner-registration";
}
else
{
    githubApiUrl = $"{gitHubUrlBuilder.Scheme}://{gitHubUrlBuilder.Host}/api/v3/actions/runner-registration";
}
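
The branch above can be tried in isolation with a small helper. This sketch takes the hosted/self-managed decision as a plain boolean parameter because the logic inside UrlUtil.IsHostedServer is not shown here. RegistrationUrlSketch and Build are hypothetical names, and ghe.example.com is a made-up host.

```csharp
using System;

public static class RegistrationUrlSketch
{
    // isHostedServer stands in for the result of UrlUtil.IsHostedServer(...).
    public static string Build(string githubUrl, bool isHostedServer)
    {
        var builder = new UriBuilder(githubUrl);
        return isHostedServer
            // Hosted: the API lives on an "api." subdomain.
            ? $"{builder.Scheme}://api.{builder.Host}/actions/runner-registration"
            // Self-managed: the API lives under /api/v3 on the same host.
            : $"{builder.Scheme}://{builder.Host}/api/v3/actions/runner-registration";
    }
}
```

For https://github.com/org/repo with isHostedServer set to true this yields https://api.github.com/actions/runner-registration, and for https://ghe.example.com/org/repo with false it yields https://ghe.example.com/api/v3/actions/runner-registration. The org/repo path is not part of the endpoint URL in either case.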

Then the credentials are finally converted to an OAuth2 format via CredentialManager.cs:

public VssCredentials ToVssCredentials()
{
    ArgUtil.NotNullOrEmpty(TokenSchema, nameof(TokenSchema));
    ArgUtil.NotNullOrEmpty(Token, nameof(Token));

    if (string.Equals(TokenSchema, "OAuthAccessToken", StringComparison.OrdinalIgnoreCase))
    {
        return new VssCredentials(new VssOAuthAccessTokenCredential(Token), CredentialPromptType.DoNotPrompt);
    }
    else
    {
        throw new NotSupportedException($"Not supported token schema: {TokenSchema}");
    }
}

Because GitHub utilizes Azure services, the credentials are converted to the Vss format for later authorization. Once this is done, the connection is validated with the new credentials:

// Validate can connect.
await _runnerServer.ConnectAsync(new Uri(runnerSettings.ServerUrl), creds);

During the registration process an RSA key will be created as well:

RSAParameters publicKey;
var keyManager = HostContext.GetService<IRSAKeyManager>();
string publicKeyXML;
using (var rsa = keyManager.CreateKey())
{
    publicKey = rsa.ExportParameters(false);
    publicKeyXML = rsa.ToXmlString(includePrivateParameters: false);
}
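
The keyManager.CreateKey() implementation is not shown here, so as a stand-in the sketch below uses the standard RSA.Create API to show what exporting only the public parameters means. RsaKeySketch, CreatePublicParameters, and the 2048-bit default are assumptions made for illustration.

```csharp
using System;
using System.Security.Cryptography;

public static class RsaKeySketch
{
    // Generates a throwaway RSA key and returns only its public half.
    public static RSAParameters CreatePublicParameters(int keySizeInBits = 2048)
    {
        using (RSA rsa = RSA.Create(keySizeInBits))
        {
            // "false" means: do not include the private parameters.
            return rsa.ExportParameters(false);
        }
    }
}
```

The returned structure carries the Modulus and Exponent, while private fields such as D stay null. That is why the exported value is safe to send to a server during registration.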

This RSA key is used for both encryption and decryption of certain GitHub Actions messages (on both the client and server side). Finally, the agent can actually be added via RunnerDotcomServer.cs:

var gitHubUrlBuilder = new UriBuilder(githubUrl);
var path = gitHubUrlBuilder.Path.Split('/', '\\', StringSplitOptions.RemoveEmptyEntries);
string githubApiUrl;
if (UrlUtil.IsHostedServer(gitHubUrlBuilder))
{
    githubApiUrl = $"{gitHubUrlBuilder.Scheme}://api.{gitHubUrlBuilder.Host}/actions/runners/register";
}
else
{
    githubApiUrl = $"{gitHubUrlBuilder.Scheme}://{gitHubUrlBuilder.Host}/api/v3/actions/runners/register";
}

var bodyObject = new Dictionary<string, Object>()
        {
            {"url", githubUrl},
            {"group_id", runnerGroupId},
            {"name", agent.Name},
            {"version", agent.Version},
            {"updates_disabled", agent.DisableUpdate},
            {"ephemeral", agent.Ephemeral},
            {"labels", agent.Labels},
            {"public_key", publicKey},
        };

The reason there are two registration calls is that the first ensures proper org/repository access, while the second actually registers the runner at the appropriate location.

Message Listener

Once configuration has been completed, the runner is able to execute. This is handled via a RunAsync call. The first notable thing it does is create a listener for messages from GitHub:

Trace.Info(nameof(RunAsync));
_listener = GetMesageListener(settings);
if (!await _listener.CreateSessionAsync(HostContext.RunnerShutdownToken))
{
    return Constants.Runner.ReturnCode.TerminatedError;
}

GetMesageListener (no, that's not a typo on my part; it is spelled that way in the source) obtains the message listener to use for this purpose:

private IMessageListener GetMesageListener(RunnerSettings settings)
{
    if (settings.UseV2Flow)
    {
        Trace.Info($"Using BrokerMessageListener");
        var brokerListener = new BrokerMessageListener();
        brokerListener.Initialize(HostContext);
        return brokerListener;
    }

    return HostContext.GetService<IMessageListener>();
}

In testing, I found that the UseV2Flow branch wasn't taken for my setup, and the code from MessageListener.cs was used instead. Once initialized, the MessageListener creates a session with GitHub using the settings obtained during registration. These settings look something like this:

{
  "AgentId": 2,
  "AgentName": "MyAgent",
  "PoolId": 1,
  "PoolName": "Default",
  "ServerUrl": "https://pipelinesghubeus3.actions.githubusercontent.com/[randomstring]/",
  "GitHubUrl": "https://github.com/org/repo",
  "WorkFolder": "_work"
}
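
For experimenting with a settings document shaped like the one above, a plain class and a JSON deserializer are enough. This sketch uses System.Text.Json as a convenient stand-in and does not claim to match how the runner itself reads its settings. RunnerSettingsSketch and Load are hypothetical names, and only the fields shown above are modeled.

```csharp
using System.Text.Json;

public sealed class RunnerSettingsSketch
{
    // Property names match the JSON keys exactly, because System.Text.Json
    // matches property names case-sensitively by default.
    public int AgentId { get; set; }
    public string AgentName { get; set; }
    public int PoolId { get; set; }
    public string PoolName { get; set; }
    public string ServerUrl { get; set; }
    public string GitHubUrl { get; set; }
    public string WorkFolder { get; set; }

    public static RunnerSettingsSketch Load(string json)
    {
        return JsonSerializer.Deserialize<RunnerSettingsSketch>(json);
    }
}
```

Loading the example document gives an object whose PoolId is 1 and whose ServerUrl is the Actions-specific endpoint used for the session calls that follow.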

The URL itself is specific to GitHub Actions and not part of the standard GitHub API URL structure. After the credentials used to authenticate are loaded, the session information is created:

var agent = new TaskAgentReference
{
    Id = _settings.AgentId,
    Name = _settings.AgentName,
    Version = BuildConstants.RunnerPackage.Version,
    OSDescription = RuntimeInformation.OSDescription,
};
string sessionName = $"{Environment.MachineName ?? "RUNNER"}";
var taskAgentSession = new TaskAgentSession(sessionName, agent);

This gives the GitHub API some basic version and OS information with which to identify the agent. Next, an initial connection needs to be established with the server indicated by ServerUrl in the settings:

await _runnerServer.ConnectAsync(new Uri(serverUrl), creds);

Once the connection has been established a session can be created:

public virtual Task<TaskAgentSession> CreateAgentSessionAsync(
    int poolId,
    TaskAgentSession session,
    object userState = null,
    CancellationToken cancellationToken = default)
{
    HttpMethod httpMethod = new HttpMethod("POST");
    Guid locationId = new Guid("134e239e-2df3-4794-a6f6-24f1f19ec8dc");
    object routeValues = new { poolId = poolId };
    HttpContent content = new ObjectContent<TaskAgentSession>(session, new VssJsonMediaTypeFormatter(true));

    return SendAsync<TaskAgentSession>(
        httpMethod,
        locationId,
        routeValues: routeValues,
        version: new ApiResourceVersion(5.1, 1),
        userState: userState,
        cancellationToken: cancellationToken,
        content: content);
}

This simply takes the session information and bundles it with a few more attributes, which are then sent as an API call. The call itself runs through SendAsync, which in turn calls the SendAsync method of a wrapped System.Net.Http HttpClient. This call pattern occurs quite frequently for various calls to GitHub's APIs. Session creation also returns an encryptionKey as one of its values, which is used to decrypt later messages sent by the GitHub API. Once the session is established, the runner retrieves the first message from the message listener.

Before looking at that, though, I'd like to take a quick detour to look at an interesting piece of functionality: cancellation tokens. You can see an example used with the message queue:

CancellationTokenSource messageQueueLoopTokenSource = CancellationTokenSource.CreateLinkedTokenSource(HostContext.RunnerShutdownToken);

A CancellationTokenSource is a .NET concept which essentially creates a stop flag for loop and thread type constructs:

while (!HostContext.RunnerShutdownToken.IsCancellationRequested)

This means that every part of the message listener which checks this token will stop executing at its next check once the Cancel() method of the token's source has been called. Using this can help prevent an abnormal number of exceptions bubbling up from underlying sources. CreateLinkedTokenSource() links cancellation to other cancellation tokens: if a cancellation is initiated at the system shutdown level, it also cancels the message listener work that uses the linked token. This is especially useful when a user cancels a GitHub Actions workflow or job, which means everything on the runner side related to it needs to shut down.
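
The direction of a linked cancellation is easy to get backwards, so here is a minimal sketch that shows it. Cancelling the parent source cancels the linked one, but cancelling the linked source leaves the parent untouched. LinkedTokenSketch, Run, shutdownSource, and loopSource are hypothetical names standing in for the runner shutdown token and the message queue loop token.

```csharp
using System;
using System.Threading;

public static class LinkedTokenSketch
{
    // Cancels either the parent or the linked source and reports
    // the resulting state of both.
    public static (bool ParentCancelled, bool LinkedCancelled) Run(bool cancelParent)
    {
        using (var shutdownSource = new CancellationTokenSource())
        using (var loopSource = CancellationTokenSource.CreateLinkedTokenSource(shutdownSource.Token))
        {
            if (cancelParent)
            {
                // Propagates downward to loopSource.
                shutdownSource.Cancel();
            }
            else
            {
                // Does not propagate upward to shutdownSource.
                loopSource.Cancel();
            }

            return (shutdownSource.IsCancellationRequested, loopSource.IsCancellationRequested);
        }
    }
}
```

Run(true) reports both sources as cancelled, while Run(false) reports only the linked one. This one-way propagation is what lets a single job's work be cancelled without shutting down the whole runner.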

Once the queue is entered, the runner receives a message from the message listener via a call to GetNextMessageAsync. The message itself is encrypted via a combination of AES and RSA with session-scoped values. The RSA key in this case is the one generated and registered with GitHub during the runner configuration process. If you wish to dive into it further, I put together a GitHub Actions Message Decrypt Tool which can be used to decrypt message contents from network traffic analysis. The basic format of this message is:

  • fileTable: list of relevant files, generally pointing to workflow YAML files
  • mask: values to mask from log outputs based on regex
  • steps: action steps for the job
  • variables: system information, along with features such as GITHUB_TOKEN value and permissions
  • messageType: the type of message, with PipelineAgentJobRequest being the one that indicates a job needs to be run
  • plan: related to task orchestration such as console/job status updates
  • timeline: timeline of task status transitions
  • jobId: ID of the job
  • jobDisplayName: the name of the job as it's defined under the jobs top-level key
  • jobName: name of the job as declared as name: under each top-level job key, set to __default if there isn't one defined
  • requestId: job request ID, used in a later call
  • lockedUntil: doesn't really have any meaningful value
  • resources: various service endpoints and related information
  • contextData: various context objects including github, inputs, vars, needs, strategy, matrix, and some feature flags if available
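
The combination of AES and RSA mentioned before the field list is a common hybrid scheme: a symmetric key encrypts the message body, and the RSA key protects that symmetric key. The sketch below shows the general idea only. The padding (OAEP with SHA-256), the default AES settings, and the way the key and IV travel with the message are assumptions and not the runner's actual wire format. HybridMessageSketch and its members are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HybridMessageSketch
{
    // Sender side: encrypt the body with a fresh AES key,
    // then wrap that key with the recipient's RSA public key.
    public static (byte[] WrappedKey, byte[] Iv, byte[] CipherText) Encrypt(RSA publicKey, string body)
    {
        using (Aes aes = Aes.Create())
        {
            byte[] wrappedKey = publicKey.Encrypt(aes.Key, RSAEncryptionPadding.OaepSHA256);
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] plain = Encoding.UTF8.GetBytes(body);
                byte[] cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
                return (wrappedKey, aes.IV, cipher);
            }
        }
    }

    // Receiver side: unwrap the AES key with the RSA private key,
    // then decrypt the body with it.
    public static string Decrypt(RSA privateKey, byte[] wrappedKey, byte[] iv, byte[] cipherText)
    {
        byte[] key = privateKey.Decrypt(wrappedKey, RSAEncryptionPadding.OaepSHA256);
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                byte[] plain = decryptor.TransformFinalBlock(cipherText, 0, cipherText.Length);
                return Encoding.UTF8.GetString(plain);
            }
        }
    }
}
```

The appeal of this design is that the slow asymmetric operation only ever touches a small key, while arbitrarily large job messages are handled by the much faster symmetric cipher.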

The full data structure of a message can also be found in AgentJobRequestMessage.cs. Now that the message has been received, it's time to handle the job. Note that each entry under jobs: in the workflow YAML receives its own unique message and is handled separately. The message enters the job dispatch workflow as a Run() method invocation:

Trace.Info($"Received job message of length {message.Body.Length} from service, with hash '{IOUtil.GetSha256Hash(message.Body)}'");
var jobMessage = StringUtil.ConvertFromJson<Pipelines.AgentJobRequestMessage>(message.Body);
jobDispatcher.Run(jobMessage, runOnce);

Job Worker Dispatch

Job dispatch is handled as part of JobDispatcher.cs. The Run method referenced above trickles down the call chain and ends up at RunAsync. ProcessChannel.cs creates a bi-directional channel between the runner and the child worker process. Then the worker process is executed:

string workerFileName = Path.Combine(assemblyDirectory, _workerProcessName);
workerProcessTask = processInvoker.ExecuteAsync(
    workingDirectory: assemblyDirectory,
    fileName: workerFileName,
    arguments: "spawnclient " + pipeHandleOut + " " + pipeHandleIn,
    environment: null,
    requireExitCodeZero: false,
    outputEncoding: null,
    killProcessOnCancel: true,
    redirectStandardIn: null,
    inheritConsoleHandler: false,
    keepStandardInOpen: false,
    highPriorityProcess: true,
    cancellationToken: workerProcessCancelTokenSource.Token);

The declaration for _workerProcessName can be found at the start of the class definition as:

private static readonly string _workerProcessName = $"Runner.Worker{IOUtil.ExeExtension}";

This resolves to Runner.Worker on *NIX systems and Runner.Worker.exe on Windows-based systems. The dispatcher then sends the job details to the worker for processing:

Trace.Info($"Send job request message to worker for job {message.JobId}.");
HostContext.WritePerfCounter($"RunnerSendingJobToWorker_{message.JobId}");
using (var csSendJobRequest = new CancellationTokenSource(_channelTimeout))
{
    await processChannel.SendAsync(
        messageType: MessageType.NewJobRequest,
        body: JsonUtility.ToString(message),
        cancellationToken: csSendJobRequest.Token);
}
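
A channel call like the one above has to put two things on the wire: a message type and a body. The sketch below shows one simple way to frame such a pair over any Stream using BinaryWriter and BinaryReader. It is an illustration of the idea and not the format ProcessChannel.cs actually uses. ChannelFramingSketch and its methods are hypothetical names, and the integer used for the message type in the usage note is made up.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChannelFramingSketch
{
    // Writes the type as a 4-byte integer followed by a length-prefixed string.
    public static void Write(Stream stream, int messageType, string body)
    {
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(messageType);
            writer.Write(body);
        }
    }

    // Reads the pair back in the same order it was written.
    public static (int MessageType, string Body) Read(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            int messageType = reader.ReadInt32();
            string body = reader.ReadString();
            return (messageType, body);
        }
    }
}
```

Writing (1, "{\"jobId\":\"abc\"}") to a MemoryStream, rewinding it, and reading it back returns the same pair. With real pipes, the write would happen in the parent process and the read in the child.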

Now it's time for the public-facing part of a GitHub Actions runner: the Runner Worker. As a program of its own, its main entry point sits in Program.cs. After some argument and environment validation, the worker process starts:

// Run the worker.
return await worker.RunAsync(
    pipeIn: args[1],
    pipeOut: args[2]);

The Worker will wait for the message from the job dispatcher before proceeding:

channel.StartClient(pipeIn, pipeOut);

// Wait for up to 30 seconds for a message from the channel.
HostContext.WritePerfCounter("WorkerWaitingForJobMessage");
Trace.Info("Waiting to receive the job message from the channel.");
WorkerMessage channelMessage;
using (var csChannelMessage = new CancellationTokenSource(_workerStartTimeout))
{
    channelMessage = await channel.ReceiveAsync(csChannelMessage.Token);
}
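
The timeout here comes from constructing a CancellationTokenSource with a delay, which cancels its token automatically once that time has passed. The sketch below isolates the pattern. ReceiveTimeoutSketch and ReceiveOrNullAsync are hypothetical names, and returning null on timeout is an illustrative choice rather than what the worker does.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ReceiveTimeoutSketch
{
    // Awaits `receive`, but gives up once `timeout` has elapsed.
    public static async Task<string> ReceiveOrNullAsync(
        Func<CancellationToken, Task<string>> receive,
        TimeSpan timeout)
    {
        // The source cancels its own token after `timeout`.
        using (var timeoutSource = new CancellationTokenSource(timeout))
        {
            try
            {
                return await receive(timeoutSource.Token);
            }
            catch (OperationCanceledException)
            {
                // The receive operation observed the cancelled token.
                return null;
            }
        }
    }
}
```

This only works if the receive operation actually observes the token it is given. An operation that ignores the token will keep waiting no matter how short the timeout is.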

This message is the decrypted form of the message listener message, with slight modifications, and contains all the details needed for the job. The worker then finally hands it off to the job runner:

Task<TaskResult> jobRunnerTask = jobRunner.RunAsync(jobMessage, jobRequestCancellationToken.Token);

As with all tasks, it is given a cancellation token for handling exceptions and job cancellation requests. From here on is where all the public-facing GitHub Actions output happens, which is the subject of the next installment in the series.

Conclusion

I must say this was a very enlightening experience. Seeing that the runner was developed in C# was a bit surprising. Between delegates and interfaces, it did make for quite a substantial amount of back and forth between code files... I also found that GitHub Actions is essentially Azure Pipelines, and the Azure Pipelines agent looks to be basically the GitHub Actions runner. That's not surprising, though, since consolidating codebases to save time seems like a fairly reasonable approach. I was actually planning on making this a very long article, but figured there were those who wanted to know everything about GitHub Actions execution and those who simply wanted to focus on how job execution is handled in terms of what you see in the UI. I hope you enjoyed this article, and please look forward to the next part of this series, where I look over how the actual jobs are executed.
