Dear LINQ, I love you, but I'm going back to my roots

Matthew Watkins on December 19, 2017

I faced an interesting situation today. I was getting a collection of objects from a database using a stored proc, but I only needed some of the ...
Rafal Pienkowski • Edited

You should always be careful when you're using LINQ.

There is a huge difference between using LINQ on collections that implement IEnumerable<T> versus IQueryable<T>. Long story short: when we're using the IEnumerable interface, as you've written, we're working on objects that are already stored in memory. On the other hand, IQueryable uses expression trees, which can keep being modified right up until we execute them (by calling ToList() or ToArray()).
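As a small self-contained illustration of that deferred execution (this uses LINQ to Objects via AsQueryable rather than a real database provider, so it only sketches the idea):

using System;
using System.Linq;

class ExpressionTreeDemo
{
    static void Main()
    {
        // The chain below only builds an expression tree; nothing runs yet.
        IQueryable<int> query = Enumerable.Range(1, 1_000)
            .AsQueryable()
            .Where(n => n % 2 == 0)
            .Take(5);

        // We can inspect (or keep composing) the tree before execution.
        Console.WriteLine(query.Expression);

        // Execution happens only when the query is materialized.
        var results = query.ToList();
        Console.WriteLine(string.Join(", ", results)); // 2, 4, 6, 8, 10
    }
}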

A common mistake is working against the IEnumerable interface. Let's take an example. We have a repository with a GetAll() method that returns IEnumerable. When we work with that implementation as in the example below, we first load the whole collection into memory, then filter the results, then skip 10 elements, and only at the end take the 5 results we actually wanted.

var elements = _repository.GetAll().Where(a => a.Salary < 100).Skip(10).Take(5);

But if our repository implemented GetAll() so that it returned IQueryable, the expression wouldn't be executed until we explicitly call ToList(). For instance, when we work with Entity Framework against a database, our expression is translated into a proper SQL query and only 5 elements (or fewer, if not enough rows match the condition) are returned to our program.
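For instance, a repository that exposes IQueryable might look roughly like this (a sketch assuming an Entity Framework Core DbContext; the Employee and PayrollContext names are only illustrative):

using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Employee
{
    public int Id { get; set; }
    public decimal Salary { get; set; }
}

public class PayrollContext : DbContext
{
    public DbSet<Employee> Employees => Set<Employee>();
}

public class EmployeeRepository
{
    private readonly PayrollContext _context;
    public EmployeeRepository(PayrollContext context) => _context = context;

    // Exposing IQueryable lets callers compose Where/Skip/Take before anything
    // is sent to the database.
    public IQueryable<Employee> GetAll() => _context.Employees;
}

// Usage: the whole chain is translated into a single SQL query; at most 5 rows come back.
// var page = repository.GetAll()
//     .Where(a => a.Salary < 100)
//     .OrderBy(a => a.Id)   // deterministic ordering before Skip/Take
//     .Skip(10)
//     .Take(5)
//     .ToList();            // execution happens here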

To sum up: be careful when working with LINQ. Check whether you're working with the IEnumerable or the IQueryable interface. It matters. Cheers.

BTW, the difference between IEnumerable and IQueryable is one of my favorite interview questions :)

Matt Weingaertner • Edited

This comment sums up the thinking behind using LINQ. 👍 Thank you!

Alex Fomin • Edited

LINQ allows you to focus on what you want to get rather than how you get it. Compare these two pieces of code:

return items.Where(x => x % Divider != 0).ToList();

and

for (var i = 0; i < items.Count; i++)
{
  if (items[i] % Divider == 0)
  {
    items.RemoveAt(i);
    i--; // step back: RemoveAt just shifted the next element into slot i
  }
}

return items;

It takes almost nothing to understand the first piece of code, while you need to work through the whole loop to understand the second one. And as you've noted, the second one is error-prone: don't forget to decrease i, and so on.

As for performance and memory pressure, LINQ is far better than your approach. Here is the gist with the benchmark:
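A comparable BenchmarkDotNet setup (not necessarily the gist's exact code; the collection size and method bodies here are assumptions) might look like:

using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]
public class FilterBenchmarks
{
    private const int Divider = 3;
    private List<int> _items;

    [GlobalSetup]
    public void Setup() => _items = Enumerable.Range(0, 100_000).ToList();

    [Benchmark]
    public List<int> FilterEnumerable() =>
        _items.Where(x => x % Divider != 0).ToList();

    [Benchmark]
    public List<int> FilterList()
    {
        // Work on a copy so every invocation starts from the same data.
        var items = new List<int>(_items);
        for (var i = 0; i < items.Count; i++)
        {
            if (items[i] % Divider == 0)
            {
                items.RemoveAt(i);
                i--;
            }
        }
        return items;
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<FilterBenchmarks>();
}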

And the results are:

           Method |        Mean |      Error |     StdDev |    Gen 0 |    Gen 1 |   Gen 2 | Allocated |
----------------- |------------:|-----------:|-----------:|---------:|---------:|--------:|----------:|
 FilterEnumerable |    834.1 us |   8.983 us |   7.501 us | 427.7344 | 427.7344 | 71.2891 |  455.1 KB |
       FilterList | 42,794.5 us | 878.850 us | 822.077 us | 375.0000 | 375.0000 | 62.5000 | 390.73 KB |

LINQ is about 50 times faster, while memory allocation is almost the same. Well, you'll create a new List, right. But the complexity of FilterEnumerable is O(N), while FilterList is O(N²).

Like any other tool, LINQ is great in the right hands. You need to know its pitfalls to use it efficiently, but LINQ not only lets you write more efficient code, it also gives you better tools for decomposing your code.

Frank Carr

You said that you were getting a collection of objects from a database using a stored proc but you only needed some of the objects that were returned. To me, this indicates that you needed to refine that proc so that it would handle this for you via extra parameters or create a new proc that did it.
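If refining the proc were an option, the C# side of that could be as simple as passing the filter and paging values as parameters and letting the database do the work (the proc name, parameters, and columns below are invented purely for illustration):

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class EmployeeQueries
{
    public static List<(int Id, decimal Salary)> GetEmployeesPage(
        string connectionString, decimal maxSalary, int skip, int take)
    {
        var results = new List<(int, decimal)>();

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("dbo.GetEmployeesPage", connection))
        {
            command.CommandType = CommandType.StoredProcedure;

            // The database does the filtering and paging, not the caller.
            command.Parameters.Add("@MaxSalary", SqlDbType.Decimal).Value = maxSalary;
            command.Parameters.Add("@Skip", SqlDbType.Int).Value = skip;
            command.Parameters.Add("@Take", SqlDbType.Int).Value = take;

            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    results.Add((reader.GetInt32(0), reader.GetDecimal(1)));
                }
            }
        }

        return results;
    }
}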

To me, one of the biggest problems with LINQ and, by extension, Entity Framework is that it encourages ignoring the power of the database engine, be it SQL Server, Oracle, or something else. DB engines are specifically designed and tuned to extract and sort large amounts of data. Sure, you can do this in your middleware or on the client side using LINQ, but why not use the DB's strength to its fullest?

I've found LINQ most useful when working with non-DB data (CSV, XML, etc.) where there was no database engine in play, or with smaller sets of data that needed a quick, minor refinement, like sorting the line items in a specific invoice.
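A small sketch of that kind of non-DB use, sorting the line items of an invoice held as XML with LINQ to XML (the XML shape here is invented for illustration):

using System;
using System.Linq;
using System.Xml.Linq;

class InvoiceDemo
{
    static void Main()
    {
        var invoice = XElement.Parse(@"
            <Invoice>
              <Line sku='B-2' qty='3' price='9.50' />
              <Line sku='A-1' qty='1' price='19.99' />
              <Line sku='C-7' qty='2' price='4.25' />
            </Invoice>");

        // Sort the line items by SKU and project the line totals.
        var lines = invoice.Elements("Line")
            .OrderBy(l => (string)l.Attribute("sku"))
            .Select(l => new
            {
                Sku = (string)l.Attribute("sku"),
                Total = (int)l.Attribute("qty") * (decimal)l.Attribute("price")
            });

        foreach (var line in lines)
            Console.WriteLine($"{line.Sku}: {line.Total}");
    }
}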

Matthew Watkins

Correct. Unfortunately, modifying the proc (or writing a new proc) is not feasible here. Heck, it's not even like EF, where my filter criteria could at least to some degree be translated into SQL. I'm just stuck with this potentially massive list of objects and no say over which ones get returned, for the foreseeable future.

Frank Carr

Turf wars on data sources can be a huge problem and create frustration and ugly hacks. I've run into this myself from time to time. Sometimes it's somewhat justified and other times it's just someone being a control freak.

Of course, the best solution is to get everyone functioning and cooperating on the same team/page, but when you have dysfunctional management, this can be quite difficult.

Zev Spitz

If I'm not mistaken, the additional memory used by a larger list, apart from the objects it references, is fairly small: essentially a backing array holding one reference per element, plus a stored count. A list with a count of one million whose elements all point to the same object still only stores a million references to that one object, not a million copies of it.

The main problem is the materialization of all those objects, for which you have a couple of solutions:

  1. refine the stored procedure, or
  2. have your repository return IQueryable<T> backed by a database LINQ provider, so that the Queryable extension methods you call on it are converted to SQL by the provider.

But obfuscating the intent of your code by avoiding LINQ and foreach seems rather pointless to me.

Update: I've just looked at the definition (.NET Framework) of List<T>; aside from the backing T[] array, which grows with the count, none of its fields consume more memory as the list grows - and for a reference-type T that array holds only references, not the objects themselves (only when T is a value type do the elements live in the array directly).
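A rough way to see that overhead for yourself on a 64-bit runtime is to measure the heap before and after building a list of a million references to one shared object (the figures are approximate):

using System;
using System.Collections.Generic;

class ListOverheadDemo
{
    static void Main()
    {
        var shared = new object();

        long before = GC.GetTotalMemory(forceFullCollection: true);

        var list = new List<object>(1_000_000);
        for (var i = 0; i < 1_000_000; i++)
        {
            list.Add(shared);
        }

        long after = GC.GetTotalMemory(forceFullCollection: true);

        // Roughly 8 MB of references on a 64-bit runtime; the shared object itself
        // is counted only once, however many slots point to it.
        Console.WriteLine($"List overhead: ~{(after - before) / (1024.0 * 1024.0):F1} MB");

        GC.KeepAlive(list);
    }
}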

Jens Genberg

Good article - it inspires one to look more closely at what LINQ calls do under the hood.

For removing items in a loop, I like reversing the loop:


for (var i = list.Count - 1; i >= 0; i--)
...
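Filled out with the same divisibility filter as the earlier examples (an assumption, since the snippet above leaves the body out), that looks like:

// Iterating backwards means RemoveAt never shifts an element we haven't visited yet,
// so there's no index to fix up after a removal.
for (var i = list.Count - 1; i >= 0; i--)
{
    if (list[i] % Divider == 0)
    {
        list.RemoveAt(i);
    }
}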