DEV Community

Using C# LINQ API to perform SQL IN logic at WHERE clause

Alex Ruzenhack on August 30, 2018

I was working on a project with User and Costumer identificação. The first is generated by the web app, and the second by the business department. ...

Read full post

Sung M. Kim • Aug 30 '18

Thank Alex,
the translation looked easy to understand as the LINQ version matches with SQL statement 👍.

And I have a question regarding the conversion.

Would the LINQ version fetch all data from database first and filter (using Where clause) in memory?

would the query create an optimal syntax to send to the database and fetch only filtered records?

Bob van Hoove • Aug 30 '18 • Edited

Calling ToList() results in execution of the query. So the first statement will make an in memory list of email addresses.

To get a better feel for what's happening you can intercept the actual queries. Assuming Entity framework is used: Log Queries executed by Entity Framework DbContext.

Dian Fay • Aug 30 '18 • Edited

I stopped writing C# right as LINQ got popular so I don't know it too well but I believe this does store the list of emails in memory.

In SQL you could do this:

SELECT c.*
FROM customers AS c
JOIN users AS u ON u.email = c.email
WHERE u.customer_id IS NULL

and achieve the same result without a subquery. I don't know how you'd go about translating that to LINQ though.

Rafal Pienkowski • Aug 30 '18

In my opinion, this is a better approach, which doesn't require an additional call to the database and doesn't require to store entities in the memory (which could have terrible results in case of a huge collection).
And of course, it is available in LINQ.

Alex Ruzenhack • Sep 3 '18 • Edited

It's true! But the tables are in different contexts, which means they are in different databases. So, it's not possible to make a JOIN between them.

Dian Fay • Sep 3 '18

My condolences :)

Alex Ruzenhack • Sep 3 '18

Here is the EF SQL of "Users":

Microsoft.EntityFrameworkCore.Database.Command:Information: Executed DbCommand (110ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
SELECT [user].[Id], [user].[Email], [user].[CustomerId], [user].[UserName]
FROM [AspNetUsers] AS [user]
WHERE [user].[CustomerId] IS NULL

And here is the EF SQL of "Customers":

Microsoft.EntityFrameworkCore.Database.Command:Information: Executed DbCommand (16ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
SELECT [t].[Id], [t].[Email], [t].[FullName], [t].[RegisteredAt],  [t].[CreatedAt], [t].[UpdatedAt]
FROM [Customers] AS [t]

As we can see, the IN logic really occurs in memory.

Sung M. Kim • Sep 3 '18

Thanks for the clarification Alex 😎

Frank Carr • Aug 30 '18

I prefer to use stored procedures and table variables for this kind of query when it gets complex or is working against a large amount of data. I prefer to let the database do the job its been optimized for, parsing and sorting data, and use C# and LINQ for stuff SQL isn't optimized to do well, like string manipulation, or to work with smaller amounts of data, like lookup tables.

Of course, using techniques like in this article are good when there's a "turf war" between DBA's and application developers or when the application team has limited SQL skills and a DBA isn't available to write optimal procs.