Erik

Posted on Sep 14, 2019 • Originally published at erikscode.space

How To Work With Memory Optimized Objects in SQL Server

#sql #sqlserver #database

Originally posted at http://erikscode.space/index.php/2019/07/20/how-to-work-with-memory-optimized-objects-in-sql-server/

Good morning everyone, today I’m going to show you how to make memory optimized tables and natively compiled stored procedures. First, I’ll talk about why you might want to use these things, then we’ll just dive straight in to how to do it.

Relevant Video

If you don’t feel like reading all of this, I made a video with the same scripts and same general information:

Why Memory Optimized Tables?

So why would you not want to use disk based tables and opt for the fancy in-memory ones? There is really only one answer: speed.

As I’ll demonstrate later, the in-memory tables are written to and selected from so much faster than disk based tables. This is to be expected, of course, as memory is always faster than retrieving data from a file. But why then shouldn’t every table be memory optimized?

Well, roughly speaking, disk based tables will be “safer.” First of all, you have no control over memory allocation, so who knows how much memory your server will set aside for its operations?

Also, your database has a lot less memory than it does disk space. You will have an overflow long before you run out of hard drive on your database server.

In my humblest of opinions, in-memory tables and natively compiled stored procedures are best used when a customer or user of your product has to sit through a long write or select process. I think it’s better to have these operations handled by the optimized tables, then offloaded into disk based storage in a place that doesn’t affect user experience.

I’m definitely no software or database architect though, so lets skip the theory and jump into how we make these things!

What we’re about to do:

We’re going to make a FlowerStore database with six tables. Actually, it’s really three tables, but we’re going to make disk based and memory based ones to compare. Here are the steps we’ll take —

Create the database and configure it to be able to use memory optimized objects
Create identical disk based and in-memory tables
Create identical views that will be used later to show the difference in querying speed
Creating the stored procedures – one normal, one natively compiled
Demonstrating the difference speed of the stored procedures by using SSMS
Demonstrating the difference in speed of querying by using jMeter

If you want my sample code:

For this demonstration, I’ve uploaded the scripts to GitHub:

https://github.com/erik-whiting/SQLServerTutorial-MemoryOptimization

You can either do a git pull or just download and unzip. We’re going to run the scripts in the order they appear in the repo, although a couple of them can be interchanged.

Creating the Database

First thing is first, we have to actually create our FlowerStore database and tell it to expect there to be memory optimized data. Take a look at 01-CreateDB.sql in the repository. Lines 1 – 21 are normal disk-based stuff, so lets look at lines 23 through 32:

-- Make Database compatible with memory optimized objects
ALTER DATABASE FlowerStore SET COMPATIBILITY_LEVEL = 130;
ALTER DATABASE FlowerStore SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;

-- Add filegroup for memory optimized data
ALTER DATABASE FlowerStore ADD FILEGROUP fs_mem CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE FlowerStore ADD FILE (
    NAME = fs_mem_file, 
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\fs_mem'
) TO FILEGROUP fs_mem

First thing we have to do is set the compatibility level of the database to 130. What does this mean? I have no idea, but this is what Microsoft says needs to happen.

Next, we set the isolation level of snapshot with lien 25. What this does is basically use an isolation level that will copy the data memory objects are using into tempdb. In any transaction, all the memory based objects will have access to the same copy of data, meaning any disk based rows are not locked.

Next, we add filegroup with the CONTAINS MEMORY_OPTIMIZED_DATA attribute. We have to add a filegroup with that attribute in order to have a data file that can contain that data, which we make on line 29.

Creating the Tables

In script 02, there is really nothing to learn because it is basic table building stuff. Please run it though, as we’ll be using the disk based tables to see how much faster the memory tables are.

Check out script 03, where we create the identical tables but in memory, there are a few things to note:

On line 6, we have to specify non-clustered index. This has to do with the fact that only index definitions are stored for in-memory tables, not the actual index pages (memory tables have no pages at all).
Each table has MEMORY_OPTIMIZED = ON and DURABILITY = SCHEMA_AND DATA. The memory optimized clause is kind of a no-brainer, this is basically how you say “don’t be a disk based table.” The durability part has to do with how the data is persisted.
- Schema refers to the structure of the table, like the column names.
- Data refers to the actual data that is inserted into these tables.
- The durability option tells SQL Server which one of these things to persist, but if you try to persist schema only in this script, the script will fail.

Creating the Views

The views in script 04 and 05 are going to be used to show how much faster the memory tables will query, and there really not much to learn from these scripts (unless you want to see how to make views).

Script 04 creates a view that totals line items in an order, and script 05 creates views that get the grand total for each order.

Creating the Stored Procedures

This is probably the most interesting section. If you can’t tell, both these stored procedures are going to take a number from the user and make up some fake data about sales and line items. There is really no actual point to these except for me to show you how much faster the writes will be.

Notice in the first stored procedure, it’s business as usual. We declare our variables and write our inserts. The natively compiled one is a little bit different.

The NATIVE_COMPILATION is what tells the database to save this as a stored procedure, easy enough. So what’s up with the SCHEMABINDING option?

This basically says that the tables this procedure uses cannot be dropped until the procedure is dropped, and you must set this to true for native compilation.

You also must use BEGIN ATOMIC after the AS statement. It must be used here, which is odd, because it cannot be used in non-compiled procedures. I’m not entirely sure what this flag does, but I do know it implies the BEGIN and COMMIT or ROLLBACK lines, so you don’t need to write them.

Again, you have to use ISOLATION LEVEL = SNAPSHOT because this is really the only isolation level you can use with in-memory objects.

Finally, you have to specify the language. See line 34 LANGUAGE = N’us-english’. This is another thing that is required, but I couldn’t really tell you why. For now, just know that you have to specify the language used when creating compiled stored procedures.

Lets Test it Out!

Ok! Let’s actually run these things and see what happens. First, let’s populate the database with some random data using the stored procedures. First up, disk based tables with the T-SQL interpreted procedure:

EXECUTE makeLotsOfRecords @Iterations = 100000

So we’re making 100,000 line items. Your architecture may vary, but when I ran this, it took 40 seconds to run.

Next up, the compiled procedure that writes to memory tabels:

EXECUTE makeLotsOfRecordsMEM @Iterations = 100000

Again, depending on how much juice your personal computer has, your results may vary, but when I ran this, it was done in less than a second. Wow!

Testing Query Speed

Because I was tired of staring at SSMS, I wrote the query test in an Apache JMeter test plan. I won’t get into how to set this up, as that’s beyond the scope of this article, but I’ll describe the results.

What I did was make 2 thread groups, one that queried the summing view for the disk based tables, and the other for the in-memory table querying view. Each group had 200 threads and looped 5 times, for 1000 queries each. Here are the summary results:

One thing I think is interesting is the standard deviation. If you don’t know, this is basically the variability of response times. For the memory tables, the standard deviation is a little more than one-third the average, whereas with the disk tables, the deviation is nearly equal to the average. Crazy.

Conclusion

Today we went over how to configure a database to use memory-optimized objects, how to create in-memory tables, and how to write natively compiled stored procedures. We then tested each thing out to see what kind of performance differences we got.

DEV Community