I faced an interesting situation today. I was getting a collection of objects from a database using a stored proc, but I only needed some of the ...
You should always be careful when you're using LINQ.
There is a huge difference between using LINQ on collections that implement IEnumerable<T> and on those that implement IQueryable<T>. Long story short: when we're using the IEnumerable<T> interface, as you've written, we're working on objects that are stored in memory. IQueryable<T>, on the other hand, uses expression trees, which can keep being modified as long as we don't execute them (by calling ToList() or ToArray(), for example). More about expression trees.

A common mistake is to work on the IEnumerable<T> interface. Let's take an example. We have a repository with a GetAll() method that returns IEnumerable<T>. If we chain Where(), Skip(10) and Take(5) on that result, we first load the whole collection into memory, then filter the results, then skip 10 elements, and only at the end take 5 results.
But if our repository's GetAll() method returned IQueryable<T>, the expression wouldn't be executed until we explicitly call ToList(). For instance, when we work with Entity Framework against a database, the expression is translated into a proper SQL query, and only 5 elements (or fewer, if not enough rows match the condition) are returned to our program.
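A minimal sketch of the difference, assuming two hypothetical repository methods (GetAllEnumerable and GetAllQueryable are names made up for illustration) and an in-memory source standing in for the database. With AsQueryable() nothing is translated to SQL, but it does show that the IQueryable<T> chain only builds an expression and runs when ToList() is called:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Hypothetical repository methods; in a real application these would sit on a repository class.
    static IEnumerable<int> GetAllEnumerable() => Enumerable.Range(1, 100).ToList();
    static IQueryable<int> GetAllQueryable() => Enumerable.Range(1, 100).AsQueryable();

    public static void Main()
    {
        // IEnumerable<T>: the whole collection is already in memory;
        // Where/Skip/Take run as LINQ to Objects over it.
        List<int> fromEnumerable = GetAllEnumerable()
            .Where(x => x % 2 == 0)
            .Skip(10)
            .Take(5)
            .ToList();

        // IQueryable<T>: the same calls only compose an expression tree.
        IQueryable<int> query = GetAllQueryable()
            .Where(x => x % 2 == 0)
            .Skip(10)
            .Take(5);

        // Inspect the composed, not-yet-executed expression.
        Console.WriteLine(query.Expression);

        // The query is executed here. A database provider such as
        // Entity Framework would translate it to SQL at this point.
        List<int> fromQueryable = query.ToList();

        Console.WriteLine(string.Join(", ", fromEnumerable)); // prints "22, 24, 26, 28, 30"
        Console.WriteLine(string.Join(", ", fromQueryable));  // prints "22, 24, 26, 28, 30"
    }
}
```

Both chains return the same five values; what differs is where the filtering happens once a real database provider sits behind the IQueryable<T>.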
To sum up: be careful when working with LINQ. Check whether you're working with the IEnumerable<T> or the IQueryable<T> interface. It matters. Cheers.
BTW, the difference between IEnumerable and IQueryable is one of my favorite interview questions :)
This comment sums the thoughts behind using LINQ up. 👍 Thank you!
LINQ allows you to focus on what you want to get rather than how you can get this. Compare two pieces of code:
and
It takes almost nothing to understand the first piece of code, while you need to get through five lines to understand the second one. And as you have noted, the second one is error prone: do not forget to decrement i, and so on.

As for performance and memory pressure, LINQ is far better than your approach. Here is the gist with benchmark:
And results are:
LINQ is 50 times faster while memory allocation is almost the same. Well, you'll create a new List, right. But the complexity of FilterEnumerable is O(N), while FilterList is O(N²).
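The benchmarked code is not reproduced here, so the following is only a guess at what the two methods might look like. The names FilterList and FilterEnumerable come from the comment above, but the bodies and signatures are assumptions. The point is where the quadratic cost comes from: every RemoveAt shifts the rest of the list down by one slot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Assumed shape of the in-place approach: each RemoveAt(i) shifts
    // all later elements, so removing many items costs O(N^2) overall.
    static void FilterList(List<int> items, Func<int, bool> keep)
    {
        for (int i = 0; i < items.Count; i++)
        {
            if (!keep(items[i]))
            {
                items.RemoveAt(i);
                i--; // step back so the element shifted into slot i is not skipped
            }
        }
    }

    // Assumed shape of the LINQ approach: one pass over the source, O(N),
    // at the cost of allocating a new list for the result.
    static List<int> FilterEnumerable(IEnumerable<int> items, Func<int, bool> keep)
        => items.Where(keep).ToList();

    public static void Main()
    {
        var inPlace = Enumerable.Range(1, 10).ToList();
        FilterList(inPlace, x => x % 2 == 1);

        var filtered = FilterEnumerable(Enumerable.Range(1, 10), x => x % 2 == 1);

        Console.WriteLine(string.Join(", ", inPlace));  // prints "1, 3, 5, 7, 9"
        Console.WriteLine(string.Join(", ", filtered)); // prints "1, 3, 5, 7, 9"
    }
}
```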
Like any other tool, LINQ is great in the right hands. You need to know its pitfalls to use it efficiently, but LINQ not only lets you write more efficient code, it also gives you better tools for decomposing your code.
You said that you were getting a collection of objects from a database using a stored proc but you only needed some of the objects that were returned. To me, this indicates that you needed to refine that proc so that it would handle this for you via extra parameters or create a new proc that did it.
To me, one of the biggest problems with LINQ and, by extension, Entity Framework is that it encourages ignoring the power of the database engine, be it SQL Server, Oracle or something else. DB engines are specifically designed and tuned to extract and sort large amounts of data. Sure, you can do this in your middleware or on the client side using LINQ, but why not use the DB's strength to its fullest?
I've found LINQ most useful when working with non-DB data (CSV, XML, etc) where there was no database engine in play or smaller sets of data that needed a quick and minor refinement, like sorting line items in a specific invoice.
Correct. Unfortunately, modifying the proc (or writing a new proc) is not feasible for this. Heck, it's not even like EF where my filter criteria could be to some degree translated into SQL. I'm just stuck with this potentially massive list of objects and no say over which ones get returned for the foreseeable future.
Turf wars on data sources can be a huge problem and create frustration and ugly hacks. I've run into this myself from time to time. Sometimes it's somewhat justified and other times it's just someone being a control freak.
Of course, the best solution is to get everyone functioning and cooperating on the same team/page but when you have dysfunctional management, this can be quite difficult.
If I'm not mistaken, the additional memory used by a larger list, minus the objects referenced by said list, is fairly small: one reference per element in the backing array (8 bytes on a 64-bit runtime) plus a stored count. A list with a count of one million, with all of the elements pointing to the same object, costs roughly 8 MB for the references, which is usually small next to one million distinct objects.
The main problem is the materialization of all those objects, to which you have a number of solutions:
IQueryable<T> backed by a database LINQ provider, on which you could use the Queryable extension methods, which the provider would convert to SQL.

But obfuscating the intent of your code by avoiding LINQ and foreach seems rather pointless to me.
Update: I've just looked at the definition (.NET Framework) of List<T>, and the only field that grows with the size of the list is the backing array, which stores one reference per element (or the element itself if T is a value type).

Good article which inspires to look more closely at what LINQ calls do under the hood.
For removing items in a loop I like reversing the loop:
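The original snippet is not shown, so here is a minimal sketch of the reversed loop, assuming a List<int> and an arbitrary "remove the even numbers" condition chosen only for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

        // Walk from the last index down to 0. RemoveAt(i) only shifts the
        // elements above i, which have already been visited, so no index
        // adjustment is needed and nothing is skipped.
        for (int i = numbers.Count - 1; i >= 0; i--)
        {
            if (numbers[i] % 2 == 0)
            {
                numbers.RemoveAt(i);
            }
        }

        Console.WriteLine(string.Join(", ", numbers)); // prints "1, 3, 5"
    }
}
```

For a plain List<T>, numbers.RemoveAll(n => n % 2 == 0) gives the same result in a single call.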