Introduction
Learn how to use DistinctBy, which was introduced with .NET 6.
DistinctBy returns unique items from a list, determined by a key (which can be one or more properties) specified through a selector function. This method is particularly useful when dealing with large sets of data, because it reduces the data set to one item per key value, which leaves less data for later processing.
DistinctBy receives a delegate that selects the property or properties to use as the comparison key, and returns one object for each distinct key value (in practice, the first one encountered in the source).
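To make the selector idea concrete, here is a minimal sketch. The tuple data and the names `people` and `onePerDepartment` are made up for illustration and are not part of the article's source code.

```csharp
using System;
using System.Linq;

// Hypothetical sample data, not taken from the article's project.
var people = new[]
{
    (Name: "Ann", Department: "Sales"),
    (Name: "Bob", Department: "IT"),
    (Name: "Cid", Department: "Sales"),
    (Name: "Dee", Department: "IT"),
};

// The selector returns the comparison key (Department).
// One element is kept for each distinct key value.
var onePerDepartment = people.DistinctBy(p => p.Department).ToList();

foreach (var person in onePerDepartment)
{
    Console.WriteLine($"{person.Name} ({person.Department})");
}
```

With the current .NET implementation this keeps Ann and Bob, the first element seen for each department.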
There is a question on Stack Overflow that may very well have sparked the idea for DistinctBy. Those using older .NET Framework versions should check it out.
Microsoft Documentation remarks
This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated, either by calling its GetEnumerator method directly or by using foreach in C# or For Each in Visual Basic.
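The effect of deferred execution can be seen with a small sketch. The list `numbers` and the parity key are assumptions chosen only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 2, 3 };

// Building the query does not run it.
var distinctByParity = numbers.DistinctBy(n => n % 2);

// A change made to the source before enumeration is visible to the query.
numbers.Insert(0, 10);

// The query executes here, when ToList enumerates it.
var result = distinctByParity.ToList();

Console.WriteLine(string.Join(", ", result)); // 10, 1
```

Calling ToList (or ToArray) right away, as the article's samples do, takes a snapshot so that later changes to the source do not affect the result.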
Side notes
As secondary learning, check out GenericsExtensions.cs in the provided source code, which has two useful language extension methods.
The Index extension method provides deconstruction for a foreach statement along with the item index.
foreach (var (index, member) in distinctList.Index())
{
    Console.WriteLine($"{index,-7}{member.Id,-10}{member.FirstName,-10}{member.SurName}");
}
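GenericsExtensions.cs is not reproduced in this article, so the following is only a guess at how such an extension could be written. The class name `GenericsExtensionsSketch` and the method name `WithIndex` are hypothetical; a different name is used here on purpose to avoid clashing with the built-in Enumerable.Index that .NET 9 added.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var names = new List<string> { "Mary", "Sue", "Jake" };

// Each element is a tuple, so it can be deconstructed in the foreach header.
foreach (var (index, name) in names.WithIndex())
{
    Console.WriteLine($"{index,-7}{name}");
}

// Hypothetical stand-in for the article's Index extension method.
public static class GenericsExtensionsSketch
{
    // Pairs every item with its zero-based position in the sequence.
    public static IEnumerable<(int index, T item)> WithIndex<T>(this IEnumerable<T> source)
        => source.Select((item, index) => (index, item));
}
```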
All code was written with .NET 8, which supports collection expressions (C# 12).
Code samples
Movies release year
In this example, we are asked to get one movie per release year from a list of movies.
The model for movies.
/// <summary>
/// Represents a movie with properties for identification, name, release year, and rating.
/// </summary>
public class Movie
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Released { get; set; }
    public int Rating { get; set; }
    public override string ToString() => Name;
}
The list of movies, which in a real application would come from a database, JSON file or other data source.
public static IEnumerable<Movie> MovieList()
{
    return new List<Movie>
    {
        new() { Id = 1, Name = "Inception", Released = 2010, Rating = 5 },
        new() { Id = 2, Name = "The Matrix", Released = 1999, Rating = 5 },
        new() { Id = 3, Name = "Interstellar", Released = 2014, Rating = 5 },
        new() { Id = 4, Name = "The Dark Knight", Released = 2008, Rating = 5 },
        new() { Id = 5, Name = "Fight Club", Released = 1999, Rating = 4 },
        new() { Id = 6, Name = "Pulp Fiction", Released = 1994, Rating = 4 },
        new() { Id = 7, Name = "Forrest Gump", Released = 1994, Rating = 4 },
        new() { Id = 8, Name = "The Shawshank Redemption", Released = 1994, Rating = 5 },
        new() { Id = 9, Name = "The Godfather", Released = 1972, Rating = 5 },
        new() { Id = 10, Name = "The Godfather: Part II", Released = 1974, Rating = 5 }
    };
}
The conventional way to get one movie per distinct release year is to use GroupBy followed by Select(g => g.First()).
var distinctMoviesByReleaseYear =
    MockedData.MovieList()
        .GroupBy(m => m.Released)
        .Select(g => g.First())
        .ToList();
DistinctBy is easier, as we only need to specify the property, in this case Released.
var distinctList = MockedData.MovieList()
    .DistinctBy(movie => movie.Released)
    .ToList();
This does not mean GroupBy can no longer be used; it may be a personal choice, or DistinctBy may not be right for a specific task.
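One case where GroupBy may fit better is when the item kept for each key should be chosen by a rule rather than by position. The sketch below uses a cut-down, hypothetical `MovieSketch` record and picks the highest rated movie per year as an assumed rule; neither is part of the article's source code.

```csharp
using System;
using System.Linq;

var movies = new[]
{
    new MovieSketch("The Matrix", 1999, 5),
    new MovieSketch("Fight Club", 1999, 4),
    new MovieSketch("Pulp Fiction", 1994, 4),
    new MovieSketch("The Shawshank Redemption", 1994, 5),
};

// DistinctBy: one movie per year, whichever comes first in the source.
var firstPerYear = movies.DistinctBy(m => m.Released).ToList();

// GroupBy: one movie per year, chosen by a rule (highest rating here).
var bestPerYear = movies
    .GroupBy(m => m.Released)
    .Select(g => g.MaxBy(m => m.Rating)!)
    .ToList();

Console.WriteLine(string.Join(", ", firstPerYear.Select(m => m.Name)));
Console.WriteLine(string.Join(", ", bestPerYear.Select(m => m.Name)));

// Hypothetical, simplified stand-in for the article's Movie class.
public record MovieSketch(string Name, int Released, int Rating);
```

For 1994 the two queries differ: DistinctBy keeps Pulp Fiction because it appears first, while the GroupBy query keeps The Shawshank Redemption because of its rating.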
An example where GroupBy may be a better direction, still using the Movie model: group movies by whether Name starts with "The" and by Rating, then list the groups whose names start with "The".
public static void GroupMoviesNameStartsWithAndRating()
{
    PrintCyan();
    // The key combines "name starts with The" and the rating.
    var moviesGroupedByNameAndRating = MockedData.MovieList()
        .GroupBy(m => new MovieGroupItem(
            m.Name.StartsWith("The", StringComparison.OrdinalIgnoreCase),
            m.Rating));
    AnsiConsole.MarkupLine($"[{Color.Chartreuse1}][u]Name Released Rating[/][/]");
    foreach (var group in moviesGroupedByNameAndRating)
    {
        // Skip groups whose movie names do not start with "The".
        if (!group.Key.StartsWithThe) continue;
        foreach (var movie in group)
        {
            Console.WriteLine($"{movie.Name, -25} {movie.Released, -12}{movie.Rating}");
        }
    }
}
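MovieGroupItem is not shown in the article. A plausible shape, offered here only as an assumption, is a positional record: records compare by value, so two keys with the same StartsWithThe and Rating land in the same group. The tuple data below is made up for the sketch.

```csharp
using System;
using System.Linq;

var movies = new[]
{
    (Name: "The Matrix", Rating: 5),
    (Name: "Fight Club", Rating: 4),
    (Name: "The Godfather", Rating: 5),
    (Name: "Forrest Gump", Rating: 4),
};

var groups = movies
    .GroupBy(m => new MovieGroupItem(
        m.Name.StartsWith("The", StringComparison.OrdinalIgnoreCase),
        m.Rating))
    .ToList();

foreach (var group in groups)
{
    Console.WriteLine(
        $"StartsWithThe={group.Key.StartsWithThe}, Rating={group.Key.Rating}: " +
        string.Join(", ", group.Select(m => m.Name)));
}

// Assumed definition; the real type in the article's project may differ.
public record MovieGroupItem(bool StartsWithThe, int Rating);
```

If MovieGroupItem were a plain class without overridden Equals and GetHashCode, every movie would end up in its own group, because class instances compare by reference.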
DistinctBy with one property
The following model will be used in several examples.
/// <summary>
/// Represents a member with personal details and address information.
/// </summary>
public class Member
{
    public int Id { get; set; }
    public bool Active { get; set; }
    public string FirstName { get; set; }
    public string SurName { get; set; }
    public Gender Gender { get; set; }
    public Address Address { get; set; }
    public override string ToString() => Id.ToString();
}
In this example, somehow data has been sent with duplicate primary keys.
public static IEnumerable<Member> MembersList3() =>
[
    new() { Id = 1, Active = true, FirstName = "Mary", SurName = "Adams", Gender = Gender.Female },
    new() { Id = 2, Active = false, FirstName = "Sue", SurName = "Williams", Gender = Gender.Female },
    new() { Id = 1, Active = false, FirstName = "Jake", SurName = "Burns", Gender = Gender.Male },
    new() { Id = 4, Active = true, FirstName = "Jake", SurName = "Burns", Gender = Gender.Male },
    new() { Id = 5, Active = true, FirstName = "Clair", SurName = "Smith", Gender = Gender.Other },
    new() { Id = 1, Active = true, FirstName = "Mary", SurName = "Adams", Gender = Gender.Female },
    new() { Id = 7, Active = true, FirstName = "Sue", SurName = "Miller", Gender = Gender.Female }
];
The following works against the primary key.
public static void DistinctByPrimaryKey()
{
    PrintCyan();
    var distinctList = MockedData.MembersList3()
        .DistinctBy(member => new
        {
            member.Id
        })
        .ToList();
    MemberHeader();
    foreach (var (index, item) in distinctList.Index())
    {
        Console.WriteLine($"{index,-7}{item.Id,-10}{item.FirstName,-10}{item.SurName}");
    }
}
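For a single property, wrapping the key in an anonymous type is optional; the selector can return the property itself. The sketch below compares both forms on made-up tuple data that mirrors the duplicate Ids above (the names `members`, `wrappedKey` and `directKey` are illustrative only).

```csharp
using System;
using System.Linq;

var members = new[]
{
    (Id: 1, FirstName: "Mary"),
    (Id: 2, FirstName: "Sue"),
    (Id: 1, FirstName: "Jake"),
    (Id: 4, FirstName: "Jake"),
};

// Key wrapped in an anonymous type, as in the sample above.
var wrappedKey = members.DistinctBy(m => new { m.Id }).ToList();

// Key returned directly; the same elements are kept.
var directKey = members.DistinctBy(m => m.Id).ToList();

Console.WriteLine(string.Join(", ", wrappedKey.Select(m => m.FirstName)));
Console.WriteLine(string.Join(", ", directKey.Select(m => m.FirstName)));
```

Both lines list Mary, Sue and the Jake with Id 4; the Jake with Id 1 is dropped because Mary already used that key.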
DistinctBy with multiple properties
Get distinct by first name, surname and active.
Data
public static IEnumerable<Member> MembersList1() =>
[
    new() { Id = 1, Active = true, FirstName = "Mary", SurName = "Adams", Gender = Gender.Female },
    new() { Id = 2, Active = false, FirstName = "Sue", SurName = "Williams", Gender = Gender.Female },
    new() { Id = 3, Active = true, FirstName = "Jake", SurName = "Burns", Gender = Gender.Male },
    new() { Id = 4, Active = true, FirstName = "Jake", SurName = "Burns", Gender = Gender.Male },
    new() { Id = 5, Active = true, FirstName = "Clair", SurName = "Smith", Gender = Gender.Other },
    new() { Id = 6, Active = true, FirstName = "Mary", SurName = "Adams", Gender = Gender.Female },
    new() { Id = 7, Active = true, FirstName = "Sue", SurName = "Miller", Gender = Gender.Female }
];
Code example
public static void DistinctByFirstLastNameAndActive()
{
    PrintCyan();
    var distinctList = MockedData.MembersList1()
        .DistinctBy(member => new
        {
            member.FirstName,
            member.SurName,
            member.Active
        })
        .ToList();
    MemberHeader();
    foreach (var (index, item) in distinctList.Index())
    {
        Console.WriteLine($"{index,-7}{item.Id,-10}{item.FirstName,-10}{item.SurName}");
    }
}
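A composite key can also be written as a value tuple, which compares by value just like an anonymous type. The sketch below is an alternative form, not the article's code; `MemberSketch` is a hypothetical, trimmed-down stand-in for Member, filled with the same values as MembersList1.

```csharp
using System;
using System.Linq;

var members = new[]
{
    new MemberSketch(1, true, "Mary", "Adams"),
    new MemberSketch(2, false, "Sue", "Williams"),
    new MemberSketch(3, true, "Jake", "Burns"),
    new MemberSketch(4, true, "Jake", "Burns"),
    new MemberSketch(5, true, "Clair", "Smith"),
    new MemberSketch(6, true, "Mary", "Adams"),
    new MemberSketch(7, true, "Sue", "Miller"),
};

// Value tuple as the composite key: all three parts must match to count as a duplicate.
var distinctIds = members
    .DistinctBy(m => (m.FirstName, m.SurName, m.Active))
    .Select(m => m.Id)
    .ToList();

Console.WriteLine(string.Join(", ", distinctIds)); // 1, 2, 3, 5, 7

// Hypothetical, simplified stand-in for the article's Member class.
public record MemberSketch(int Id, bool Active, string FirstName, string SurName);
```

Ids 4 and 6 drop out because their first name, surname and active flag match Ids 3 and 1 respectively.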
DistinctBy on a sub-property
The following example uses DistinctBy on the Address property of the Member model.
public class Member
{
    public int Id { get; set; }
    public bool Active { get; set; }
    public string FirstName { get; set; }
    public string SurName { get; set; }
    public Gender Gender { get; set; }
    public Address Address { get; set; }
    public override string ToString() => Id.ToString();
}
public class Address
{
    public int Id { get; set; }
    public string Street { get; set; }
    public string City { get; set; }
    public string State { get; set; }
}
Data
public static IEnumerable<Member> MembersList4() =>
[
    new()
    {
        Id = 1,
        Active = true,
        FirstName = "Mary",
        SurName = "Adams",
        Gender = Gender.Female,
        Address = new() { Id = 1, Street = "123 Main St", City = "Portland", State = "NY" }
    },
    new()
    {
        Id = 2,
        Active = false,
        FirstName = "Sue",
        SurName = "Williams",
        Gender = Gender.Female,
        Address = new() { Id = 2, Street = "124 Main St", City = "Anytown", State = "NY" }
    },
    new()
    {
        Id = 3,
        Active = false,
        FirstName = "Jake",
        SurName = "Burns",
        Gender = Gender.Male,
        Address = new() { Id = 3, Street = "123 Main St", City = "Anytown", State = "CA" }
    },
    new()
    {
        Id = 4,
        Active = true,
        FirstName = "Jake",
        SurName = "Burns",
        Gender = Gender.Male,
        Address = new() { Id = 4, Street = "123 Main St", City = "Anytown", State = "PA" }
    },
    new()
    {
        Id = 5,
        Active = true,
        FirstName = "Clair",
        SurName = "Smith",
        Gender = Gender.Other,
        Address = new() { Id = 5, Street = "123 Main St", City = "Anytown", State = "NJ" }
    },
    new()
    {
        Id = 6,
        Active = true,
        FirstName = "Mary",
        SurName = "Adams",
        Gender = Gender.Female,
        Address = new() { Id = 1, Street = "123 Main St", City = "Portland", State = "NY" }
    }
];
Code sample
public static void DistinctByAddress()
{
    PrintCyan();
    var distinctList = MockedData.MembersList4()
        .DistinctBy(member => new
        {
            member.Address.Id,
            member.Address.Street,
            member.Address.City,
            member.Address.State
        })
        .ToList();
    MemberHeader();
    foreach (var (index, item) in distinctList.Index())
    {
        Console.WriteLine($"{index,-7}{item.Id,-10}{item.FirstName,-10}{item.SurName}");
    }
}
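The results in this case are not apparent from the output, because the address is not printed. The member with Id 6 has the same address as the member with Id 1, so it is left out and five members remain. For learning purposes, open the MockedData file, drill down to MembersList4 and set a breakpoint on MemberHeader.
Run the project; when the breakpoint is hit, use the debugger's Locals window to examine the results.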
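The reason the sample projects the Address properties into an anonymous type, rather than using the Address object itself as the key, can be sketched as follows. `AddressSketch` and `MemberSketch` are hypothetical, trimmed-down versions of the article's models; the point is that a plain class compares by reference unless it overrides Equals and GetHashCode.

```csharp
using System;
using System.Linq;

var members = new[]
{
    new MemberSketch(1, new AddressSketch { Id = 1, Street = "123 Main St", City = "Portland", State = "NY" }),
    new MemberSketch(2, new AddressSketch { Id = 2, Street = "124 Main St", City = "Anytown", State = "NY" }),
    new MemberSketch(6, new AddressSketch { Id = 1, Street = "123 Main St", City = "Portland", State = "NY" }),
};

// Key is the Address instance: each instance is a different reference,
// so nothing is treated as a duplicate.
var byReference = members.DistinctBy(m => m.Address).Count();

// Key is an anonymous type built from the values: equal values are duplicates.
var byValue = members
    .DistinctBy(m => new { m.Address.Id, m.Address.Street, m.Address.City, m.Address.State })
    .Count();

Console.WriteLine($"By reference: {byReference}, by value: {byValue}"); // By reference: 3, by value: 2

// Hypothetical, simplified stand-ins for the article's Address and Member classes.
public class AddressSketch
{
    public int Id { get; set; }
    public string Street { get; set; } = "";
    public string City { get; set; } = "";
    public string State { get; set; } = "";
}

public record MemberSketch(int Id, AddressSketch Address);
```

Note that the selector dereferences Address, so a member whose Address is null would cause a NullReferenceException when the query is enumerated.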
Summary
In this article, DistinctBy, available in .NET 6 and higher, was presented as a new way to get distinct items from a list, along with one case comparing GroupBy versus DistinctBy.