DEV Community

loading...

IT Toolbox: CLR Stored Procs versus TSQL Stored Procs – Round 2

The Sharp Ninja
・11 min read

Updated

I've taken the advice from the comments on 8/3/2006 and updated the TSQL proc and made some performance modifications to the CLR proc. I've also tested with two seperate data sets to show how the procs scale in comparison to each other. You can find all the details in my comment from the night of 8/3/2006.


Original Content

Previously, I compared the performance of string concatenation over a small set using a cursor, a test that was nearly identical in performance between TSQL and CLR Stored Procedures (subsequent testing showed a 20% improvement when removing the cursor).  Today, I've compared the performance of building a temp table and returning a subset of that temp table.  The results are shocking.  You should never use TSQL to create a stored proc that builds a temp table.  Ever.

Before I get into the results, I want to go over the scenerio being tested and the code involved.  We're still using the Northwind database, but this time we've created a stored procedure that performs a SELECT against the Invoices view for a specific customer and date range, and returns back the invoices for that criteria and the running sales to that customer at the point in time for each row in Invoices.  To calculate the running sales, we must first start with the first sale to the customer, and loop through their invoices in order up through the last record to match the date range.  For this job, the TSQL stored proc requires a temp table to stage the data into and a cursor to loop over the temp table and calculate the running sales, then write that value into the temp table.  Finally, we perform a select on the temp table for the date range and drop the temp table.

  CREATE PROCEDURE dbo.RunningSales
      (
          @CustomerID NCHAR(5),
          @BeginDate DATETIME,
          @EndDate DATETIME
      )
  AS
      SET NOCOUNT ON

      DECLARE @RunningSales MONEY;
      DECLARE @Total MONEY;
      DECLARE @OrderID INT;
      DECLARE @ProductID INT;

      SET @RunningSales = 0.0

      SELECT  ShipName, ShipAddress, ShipCity, ShipRegion, ShipPostalCode,
          ShipCountry, CustomerID, CustomerName, Address, City, Region,
          PostalCode, Country, Salesperson, OrderID,
          OrderDate, RequiredDate, ShippedDate, ShipperName, ProductID,
          ProductName, UnitPrice, Quantity, Discount, ExtendedPrice, Freight,
          ExtendedPrice + Freight AS Total
          INTO #InvoicesTemp
      FROM     Invoices
      WHERE CustomerID = @CustomerID
      AND OrderDate <= @EndDate;

      ALTER TABLE #InvoicesTemp ADD RunningBalance MONEY NULL;

      DECLARE cur CURSOR  FOR
      SELECT OrderID, ProductID, Total FROM #InvoicesTemp
      ORDER BY OrderID, ProductID ASC;

      OPEN cur;

      FETCH cur INTO @OrderID, @ProductId, @Total;

      WHILE @@FETCH_STATUS = 0
      BEGIN
          SELECT @RunningSales = @RunningSales + @Total;

          UPDATE #InvoicesTemp
          SET RunningBalance = @RunningSales
          WHERE OrderID = @OrderID AND ProductID = @ProductID;

          FETCH cur INTO @OrderID, @ProductId, @Total;
      END

      CLOSE cur;

      DEALLOCATE cur;

      SELECT ShipName, ShipAddress, ShipCity, ShipRegion, ShipPostalCode,
          ShipCountry, CustomerID, CustomerName, Address, City, Region,
          PostalCode, Country, Salesperson, OrderID,
          OrderDate, RequiredDate, ShippedDate, ShipperName, ProductID,
          ProductName, UnitPrice, Quantity, Discount, ExtendedPrice, Freight,
          Total, RunningBalance
      FROM #InvoicesTemp
      WHERE OrderDate >= @BeginDate;

      DROP TABLE #InvoicesTemp;

      RETURN
Enter fullscreen mode Exit fullscreen mode

This may not be the most efficient way to do this, but I don't specialize in TSQL stored procs, so I'm always open for suggestions.

The corresponding C# code is very easy to follow.  We perform a select into a DataTable using a DataAdapter.  We then loop the results and calculate the Running Sales on the fly and update the current row in the DataTable at the same time.  Then we perform a Select() on the DataTable to apply the data range to get our final results.

  using System;
  using System.Collections.Generic;
  using System.Data;
  using System.Data.SqlClient;
  using System.Data.SqlTypes;
  using System.Text;
  using Microsoft.SqlServer;
  using Microsoft.SqlServer.Server;

  public partial class StoredProcedures
  {
    const string SELECT =
  @"SELECT
    ShipName, ShipAddress, ShipCity,
    ShipRegion, ShipPostalCode, ShipCountry,
    CustomerID, CustomerName, Address, City,
    Region, PostalCode, Country, Salesperson,
    OrderID, OrderDate, RequiredDate, ShippedDate,
    ShipperName, ProductID, ProductName, UnitPrice,
    Quantity, Discount, ExtendedPrice, Freight,
    ExtendedPrice + Freight AS Total
  FROM   Invoices
  WHERE CustomerID = @CustomerID
  AND OrderDate <= @EndDate";

    [Microsoft.SqlServer.Server.SqlProcedure]

    public static void RunningSalesCS(
      string customerID,
      DateTime beginDate,
      DateTime endDate)
    {
      decimal runningSales = 0.0M;
      DataTable table = new DataTable();
      SqlDataAdapter adapter = null;
      SqlCommand command = null;

      using (SqlConnection conn =
        new SqlConnection("context connection=true"))
      {
        command = new SqlCommand(SELECT, conn);
        command.Parameters.AddWithValue("@CustomerID", customerID);
        command.Parameters.AddWithValue("@EndDate", endDate);

        adapter = new SqlDataAdapter(command);

        conn.Open();

        int rows = 0;

        try
        {
          rows = adapter.Fill(table);
        }
        finally
        {
          conn.Close();
        }

        if (rows == 0) throw new ApplicationException("No rows returned.");

        table.Columns.Add(
            new DataColumn("RunningBalance",
            typeof(decimal)));

        foreach(DataRow row in table.Rows)
        {
          row["RunningBalance"] =
            (runningSales += (decimal)row\["Total"\]);
        }

        List<SqlMetaData> metaData =
          new List<SqlMetaData>(rows);

        foreach(DataColumn column in table.Columns)
        {
          SqlMetaData newMetaData = null;

          if(column.DataType.FullName == "System.String")
          {
              newMetaData = new SqlMetaData(
                column.ColumnName, SqlDbType.NVarChar, 4000);
          }
          else if(column.DataType.FullName == "System.Int32")
          {
              newMetaData = new SqlMetaData(
                column.ColumnName, SqlDbType.Int);
          }
          else if (column.DataType.FullName == "System.Int16")
          {
            newMetaData = new SqlMetaData(
              column.ColumnName, SqlDbType.SmallInt);
          }
          else if (column.DataType.FullName == "System.Int64")
          {
            newMetaData = new SqlMetaData(
              column.ColumnName, SqlDbType.BigInt);
          }
          else if (column.DataType.FullName == "System.DateTime")
          {
            newMetaData = new SqlMetaData(
              column.ColumnName, SqlDbType.DateTime);
          }
          else if (column.DataType.FullName == "System.Boolean")
          {
            newMetaData = new SqlMetaData(
              column.ColumnName, SqlDbType.Bit);
          }
          else if (column.DataType.FullName == "System.Decimal")
          {
            newMetaData = new SqlMetaData(
              column.ColumnName, SqlDbType.Money);
          }
          else if (column.DataType.FullName == "System.Single")
          {
            newMetaData = new SqlMetaData(
              column.ColumnName, SqlDbType.Real);
          }
          else
          {
            throw new ApplicationException(
              "Could not map " + column.DataType.FullName);
          }

          metaData.Add(newMetaData);
        }

        if (metaData.Count != table.Columns.Count)
        {
          throw new ApplicationException(
            "Not all columns were mapped to SqlMetaData.");
        }

        SqlMetaData[] array = (SqlMetaData[])(metaData.ToArray());

        if (array == null) throw new ApplicationException(
          "SqlMetaData array is null.");

        if (array.Length == 0) throw new ApplicationException(
          "No columns in SqlMetaData array.");

        bool isFirst = true;

        DataRow[] results = table.Select(

          "[OrderDate]>=#" + beginDate + "#");

        SqlContext.Pipe.Send(string.Format(
          "Returning {0} rows.", results.Length));

        foreach (DataRow row in results)
        {
          SqlDataRecord record = new SqlDataRecord(array);

          for(int i=0; i<record.FieldCount; i++)
          {
            record.SetValue(i, row[i]);
          }

          if(!isFirst)
          {
            SqlContext.Pipe.SendResultsRow(record);
          }
          else
          {
            SqlContext.Pipe.SendResultsStart(record);
            SqlContext.Pipe.SendResultsRow(record);
            isFirst = false;
          }
        }

        if(!isFirst)
        {
          SqlContext.Pipe.SendResultsEnd();
        }
      }
    }
  };
Enter fullscreen mode Exit fullscreen mode

The performance results are absolutely stunning, the CLR stored procedure is 14 times faster than the TSQL stored procedure:

Running Balance
1,000 Iterations – Cursor & Temp Table, Partial Result Select
Run TSQL CLR
1 00:54.9 00:03.9
2 00:55.0 00:03.9
3 00:55.0 00:03.9
4 00:55.0 00:04.0
5 00:55.3 00:04.0
6 00:55.0 00:03.9
7 00:55.1 00:04.0
8 00:54.7 00:03.9
9 00:54.8 00:03.9
10 00:55.0 00:03.9
Total 09:09.9 00:39.2
Std Dev 1.83E-06 4.38E-07
Ratio 1404% 7%

All testing was performed on the same instance of SQL Server as the previous entry, on the same database, with the same connection string.  As last time, the unit tests and data access component share all of the same code for executing and timing the stored procedures.

I'm willing to do more testing, but I think this pretty well proves the point that you can get much better performance from the CLR than from TSQL when you step beyond the simplest of stored procedures.  Even in the prior example, had the result set been larger we would have seen CLR code out-perform TSQL.  In this case, it's so overwhelmingly better than TSQL that I cannot recommend to anyone coding for SQL 2005 to use TSQL for stored procedures, even CRUD procs.  I've posted an entry on a design pattern for CLR based CRUD procs that makes the code much easier to templatize and maintain, gives far superior error handing, and can be seamlessly integrated and extended to further improve the family of stored procs relating to a table or view.

CLR Stored Procs versus TSQL Stored Procs - Round 1 <== Previously | Next Up ==> How not to create crappy systems.

Copyright

These articles are owned and copyrighted by IT Toolbox as I was a paid writer in their employ. Please do not copy them. If you cite them, please cite the originals.

Discussion (0)