parasmm

Posted on Nov 3, 2021 • Edited on Oct 13, 2023

Executing Prepared N1QL Queries in Couchbase using .NET Core

#couchbase #dotnet #csharp #tutorial

While interacting with Couchbase team for some of our query tuning activities, we were suggested by them to implement Prepared N1QL Queries in our environment. While searching online, I could not find much information on how to implement and use them with .NET Core. Thought this would be a good topic to share.

What is Prepared N1QL queries in Couchbase?

Every time we issue a N1QL query in Couchbase, the query nodes need to parse the query and create a plan before the query can be executed and results returned. In an application which executes a query 100s or 1000s of times with different parameters, this plan creation for every request is an overhead. To reduce this overhead in running such frequently used queries Couchbase gives an option to use prepared queries.

One option out of the box Couchbase provides is to set a query option named adhoc to false. In this case Couchbase internally creates an identifier and saves the plan. This is in no way bad, unless we come into a scenario where we have multiple microservices or multiple pods / containers running the same query. In that scenario the same query is stored with different identifiers.

The issue here is not that the identifier is different, but there is a limit on number of prepared statements which Couchbase server can handle. Quoting from Couchbase documentation - “For Couchbase Server 6.0 and earlier, the plan is cached by the SDK (up to a limit of 5000), as well as the Query Service. On Couchbase Server 6.5 and newer, the plan is stored by the Query Service — up to an adjustable limit of 16384 plans per Query node.”

Rather than relying on the internally generated identifier, we wanted to have an option to provide the name ourselves. This way we know which queries will be run frequently in our application and we can have it parsed and ready for Couchbase to execute faster.

Comparing these to MSSQL world, a view comes to my mind.

Example using Couchabse UI:

Using the travel-sample from Couchbase installation

PREPARE AirportInfoByCity FROM 
SELECT airportname, city, country, faa  
  FROM `travel-sample` t 
WHERE type = 'airport' 
  and t.city = $City

In the query above we create a prepared query which Couchbase parses and keeps ready for subsequent execution.

To execute the same, we run EXECUTE AirportInfoByCity

The parameter will need to be set in the Runtime preferences of Couchbase UI under Named Parameters.

Looking at the execution times for these queries may not look like a lot of difference but when it comes to complex and frequent queries every millisecond counts.

Basic Stats:
I am no expert in these statistics, but on my machine running the query without prepared averaged at about 25ms and with prepared at about 12 - 13ms. Giving an average saving of about 12ms. Now, taking this a step further, if I get about 20 requests per second this is a saving of 240ms per second and 14400ms (14.4 seconds) per min. These numbers add up fast considering number of requests coming in.

Expanding this Example to .NET Core:

To start with, we’ve created an AirportController with a GetByCityPrepared function as below -

[HttpGet]
[Route("byCityPrepared/{city}")]
public async Task<IList<Airport>> GetByCityPrepared(string city)
{
    try
    {
        KeyValuePair<string, object>[] parameters = new KeyValuePair<string, object>[1];
        parameters[0] = new KeyValuePair<string, object>("$City", city);
        var queryResult = await dbHelper.ExecutePreparedQueryAsync<Airport>("AirportInfoByCity", @"select airportname, city, country, faa  
                                                                         from `travel-sample` t 
                                                                         where type = 'airport' 
                                                                         and t.city = $City",
                                                               parameters);
        return queryResult;
    }
    catch (Exception ex)
    {
        throw ex;
    }
}

Note: Airport class is scaled down version of document type of airport in travel-sample bucket. Also, dbHelper object is injected using dependency injection. Not shown for keeping things concise.

And a common DBHelper class with function ExecutePreparedQueryAsync is defined as below -

public async Task<IList<EntityType>> ExecutePreparedQueryAsync<EntityType>(string QueryName, 
                                                                string Query,
                                                                KeyValuePair<string, object>[] parameters) 
{
    var options = new QueryOptions(); 
    try 
    { 
        if(parameters !=null && parameters.Any()) 
        { 
            options.Parameter(parameters); 
        } 
        var queryResult = await _cluster.QueryAsync<EntityType>($"EXECUTE {QueryName}",  
                                        options); 

        return await queryResult.Rows.ToListAsync<EntityType>(); 
    } 
    catch (Exception ex) 
    { 
        throw ex; 
    } 
}

This example assumes that the query has already been prepared. For any reason if the prepared query gets deleted or does not exist in Couchbase the application will start failing. It is possible that the prepared got created in only one of the query nodes; and when we try to run the prepared it does not find it in that (other) query node. The error that we got was -

To overcome this scenario, we would need to have a logic in place which in case the prepared query does not exists then it would create it and then run the prepared. This would help in subsequent execution to be faster.

Let us refactor our ExecutePreparedQueryAsync by moving the execution of prepared query to a new function called TryExecutePreparedQueryAsync. Also, lets create a function BuildPreparedQuery to create prepared query in case Prepared query does not exist.

Add TryExecutePreparedQueryAsync function:

private async Task<IList<EntityType>> TryExecutePreparedQueryAsync<EntityType>(string QueryName, 
                                            QueryOptions options) 
{ 
    var queryResult = await _cluster.QueryAsync<EntityType>($"EXECUTE {QueryName}",  
                                        options); 

    if(queryResult.MetaData.Status != QueryStatus.Success) 
    { 
        HandleQueryException(queryResult.Errors); 
    } 
    return await queryResult.Rows.ToListAsync(); 
}

Add BuildPreparedQuery function:

private async Task<bool> BuildPreparedQuery(string QueryName, string Query) 
{ 
    var prepareQuery = await _cluster.QueryAsync<dynamic>($"PREPARE {QueryName} FROM {Query}"); 

    if(prepareQuery.MetaData.Status == Couchbase.Query.QueryStatus.Success) 
    { 
        return true; 
    } 
    else 
    { 
        HandleQueryException(prepareQuery.Errors); 
        return false; 
    } 
}

Note: HandleQueryException function has not been shown here for keeping it concise. Implementation for the same can be based on your requirements.

Now, in case TryExecutePreparedQueryAsync fails, call BuildPreparedQuery first, to create the prepared query and then retry execution by calling TryExecutePreparedQueryAsync. Thus we modify ExecutePreparedQueryAsync function to use above 2 functions:

public async Task<IList<EntityType>> ExecutePreparedQueryAsync<EntityType>(string QueryName, 
                                                    string Query, 
                                                    KeyValuePair<string, object>[] parameters) 
{ 
    IList<EntityType> queryResult = null; 
    var options = new QueryOptions(); 
    try 
    { 
        if(parameters !=null && parameters.Any()) 
        { 
            options.Parameter(parameters); 
        } 
        queryResult = await TryExecutePreparedQueryAsync<EntityType>(QueryName, options); 
    } 
    catch (Exception ex) 
    { 
        // Prepare the query 
        var blnPrepareQuery = await BuildPreparedQuery(QueryName, Query);  
        if(blnPrepareQuery) 
        { 
            // Execute the prepared query 
            queryResult = await TryExecutePreparedQueryAsync<EntityType>(QueryName, options); 
        } 
    }
    return queryResult;  
}

At this point, the code works all fine. Except, there is a possibility that the prepared query exists in one of the query nodes but not in other. So, if our request goes to that other query node which does not have the prepared existing, then application would throw an error. For this reason, it is suggested to first delete the existing prepared, if any, and then create a new one and execute the prepared query.

To give a pseudo code for this -

1.Try to run the Prepared query 
    a. If successful, return results 
    b. If failed, goto step 2 
2. Delete any existing prepared query 
3. Create the query as a prepared statement 
4. Execute the prepared statement and return the result

Let us add a function to delete prepared query as below -

private async Task<bool> DeletePreparedQuery(string QueryName) 
{ 
    var deletePreparedQuery = await _cluster.QueryAsync<dynamic>($"DELETE FROM system:prepareds where name = {QueryName}"); 

    if(deletePreparedQuery.MetaData.Status == Couchbase.Query.QueryStatus.Success) 
    { 
        return true; 
    } 
    else 
    { 
        HandleQueryException(deletePreparedQuery.Errors); 
        return false; 
    } 
}

Then modify the ExecutePreparedQueryAsync as below -


// if prepared does not exists then delete any existing prepared on any other query nodes 
var blnDeletePrepare = await DeletePreparedQuery(QueryName); 
if(blnDeletePrepare) 
{ 
    // Prepare the query 
    var blnPrepareQuery = await BuildPreparedQuery(QueryName, Query);  
    if(blnPrepareQuery) 
    { 
        // Execute the prepared query 
        queryResult = await TryExecutePreparedQueryAsync<EntityType>(QueryName, options); 
    } 
}

In all above functions, _cluster object has been passed through constructor using couchbase dependency injection.

Output with City as Newark:

Bonus: The error that got thrown when prepared does not exist was a bit unclear one. Digging deeper, we came to know that Couchbase 3.2 SDK by default uses a retry strategy. This retry strategy tries to run the query 10 times and in case it is still not able to process then it would throw an error. If you look at the error message, it says Too many retries: 10.

In this scenario, in retry strategy code, we can get the info that the failure is due to Query prepared statement failure. But for that we need to provide a custom implementation of retry strategy code to QueryAsync function’s options.

if(parameters !=null && parameters.Any()) 
{ 
    options.Parameter(parameters); 
} 
options.RetryStrategy(new MyRetryStrategy()); 

queryResult = await TryExecutePreparedQueryAsync<EntityType>(QueryName, options);

And add the MyRetryStrategy based on -

https://github.com/couchbase/couchbase-net-client/blob/master/src/Couchbase/Core/Retry/BestEffortRetryStrategy.cs

The link is from Couchabse .Net SDK 3.2’s master branch. The link may not work if Couchbase development team moves the file. However, the snippet I added to manage the throwing of error instead of retrying it 10 times -

public RetryAction RetryAfter(IRequest request, RetryReason reason)
{
    if (reason == RetryReason.NoRetry)
    {
        return RetryAction.Duration(null);
    }
    // added below code to modify BestEffortRetryStrategy of 
    // couchbase to stop retrying in case of QueryPreparedStatementFailure
    else if(reason == RetryReason.QueryPreparedStatementFailure)
    {
        throw new System.Exception(reason.ToString());
    }
    if (request.Idempotent || reason.AllowsNonIdempotentRetries())
    {
        var backoffDuration = _backoffCalculator.CalculateBackoff(request);
        return RetryAction.Duration(backoffDuration);
    }

    return RetryAction.Duration(null);
}

Now, in our first catch block we can check if the error message is “QueryPreparedStatementFailure”. If so, perform recreation of prepared, else handle / throw the error based on your need.

Please refer github link for complete code sample.
<13-Oct-2023> Updated github code to refer to latest nuget versions

(The code sample includes code for CRUD operation in Couchbase using .NET Core web API)

In summary, having prepared queries can save precious milli seconds while executing frequently used queries in Couchbase. And Using .Net Core 3.1 and Couchbase SDK 3.2 we can have a near foolproof prepared query execution.

Applicable to: The code has been tested with below versions of the products.

.Net Core 3.1

Couchabse SDK 3.2

Reference:

https://docs.couchbase.com/java-sdk/current/concept-docs/n1ql-query.html#prepared-statements-for-query-optimization

https://github.com/couchbase/couchbase-net-client/blob/master/src/Couchbase/Core/Retry/BestEffortRetryStrategy.cs