The repository pattern is another abstraction, like most things in computer science. It is applicable in many different languages. In fact, a lot of developers use the repository pattern without even realizing it.
In this post I am going to refactor a piece of code. We start with code that loads a single record from a database and returns it to the caller. Let's take a look at some code.
Needs Improvement
The record we are loading out of our database is PersonModel.
public class PersonModel
{
    public string Name { get; set; }
    public int Age { get; set; }
}
The service that is loading a person out of the database is ICompanyLogic. It consists of the following method definition.
public interface ICompanyLogic
{
    PersonModel GetPersonByName(string name);
}
The implementation of ICompanyLogic is handled by CompanyLogic.
using System.Linq; // provides First()

public class CompanyLogic : ICompanyLogic
{
    private readonly IPersonDataContext _personDataContext;

    public CompanyLogic(IPersonDataContext personDataContext)
    {
        _personDataContext = personDataContext;
    }

    public PersonModel GetPersonByName(string name)
    {
        // Open a context for the person database and dispose it when done.
        using (var ctx = _personDataContext.NewContext())
        {
            // First throws if no person has this name.
            var person = ctx.People.First(p => p.Name.Equals(name));
            return person;
        }
    }
}
So far, this isn't so bad. We have a business service CompanyLogic that can retrieve a single person from the database.
But then we have a new requirement that says we also need a way to load a company from another database. So we need to add a new method and extend CompanyLogic.
CompanyModel represents the model stored in the company database.
public class CompanyModel
{
    public string Name { get; set; }
    public int Size { get; set; }
    public bool Public { get; set; }
}
We extend CompanyLogic to have a method that returns a company by name.
using System.Linq; // provides First()

public class CompanyLogic : ICompanyLogic
{
    private readonly IPersonDataContext _personDataContext;
    private readonly ICompanyDataContext _companyDataContext;

    public CompanyLogic(IPersonDataContext personDataContext,
                        ICompanyDataContext companyDataContext)
    {
        _personDataContext = personDataContext;
        _companyDataContext = companyDataContext;
    }

    public PersonModel GetPersonByName(string name)
    {
        // Reads from the person database.
        using (var ctx = _personDataContext.NewContext())
        {
            var person = ctx.People.First(p => p.Name.Equals(name));
            return person;
        }
    }

    public CompanyModel GetCompanyByName(string companyName)
    {
        // Reads from the separate company database.
        using (var ctx = _companyDataContext.NewContext())
        {
            var company = ctx.Company.First(c => c.Name.Equals(companyName));
            return company;
        }
    }
}
Now we are starting to see the problems with this initial solution. Here is a short list of things that are not ideal.
- CompanyLogic knows how to access two different databases.
- We have duplicated code with our using statements.
- Our logic knows how people and companies are stored.
- GetPersonByName and GetCompanyByName cannot be reused without bringing in all of CompanyLogic.
In addition to all of these things, how do we test CompanyLogic in its current state? We have to mock the data contexts for people and companies so that they return literal database records. This is possible, but our hard work should be going into testing our logic, not mocking database objects.
Implementing Repository Pattern
The repository pattern adds an abstraction layer over the top of data access.
A little bit of abstraction goes a long way. With the repository pattern we can add a thin layer of abstraction for accessing the people and company databases. Then CompanyLogic or any other logic can leverage those abstractions.
Let's begin by creating our IPersonRepository interface and its accompanying implementation.
using System.Linq; // provides First()

public interface IPersonRepository
{
    PersonModel GetPersonByName(string name);
}

public class PersonRepository : IPersonRepository
{
    private readonly IPersonDataContext _personDataContext;

    public PersonRepository(IPersonDataContext personDataContext)
    {
        _personDataContext = personDataContext;
    }

    public PersonModel GetPersonByName(string name)
    {
        // All knowledge of the person database lives here.
        using (var ctx = _personDataContext.NewContext())
        {
            return ctx.People.First(p => p.Name.Equals(name));
        }
    }
}
Then we can do something very similar for companies. We can create the ICompanyRepository interface and its implementation.
using System.Linq; // provides First()

public interface ICompanyRepository
{
    CompanyModel GetCompanyByName(string name);
}

public class CompanyRepository : ICompanyRepository
{
    private readonly ICompanyDataContext _companyDataContext;

    public CompanyRepository(ICompanyDataContext companyDataContext)
    {
        _companyDataContext = companyDataContext;
    }

    public CompanyModel GetCompanyByName(string name)
    {
        // All knowledge of the company database lives here.
        using (var ctx = _companyDataContext.NewContext())
        {
            return ctx.Company.First(c => c.Name.Equals(name));
        }
    }
}
We now have two separate repositories. PersonRepository knows how to load a given person by name from the person database. CompanyRepository can load companies by name from the company database. Now let's refactor CompanyLogic to leverage these repositories instead of the data contexts.
public class CompanyLogic : ICompanyLogic
{
    private readonly IPersonRepository _personRepo;
    private readonly ICompanyRepository _companyRepo;

    public CompanyLogic(IPersonRepository personRepo,
                        ICompanyRepository companyRepo)
    {
        _personRepo = personRepo;
        _companyRepo = companyRepo;
    }

    // The logic layer delegates data access to the repositories.
    public PersonModel GetPersonByName(string name)
    {
        return _personRepo.GetPersonByName(name);
    }

    public CompanyModel GetCompanyByName(string companyName)
    {
        return _companyRepo.GetCompanyByName(companyName);
    }
}
Look at that, our logic layer no longer knows anything about databases. We have abstracted away how a person and a company are loaded. So what benefits have we gained?
- The repository interfaces are reusable. They could be used in other logic layers without changing a thing.
- Testing is a lot simpler. We mock the interface response so we can focus on testing our logic.
- Database access code for people and companies is centrally managed in one place.
- Optimizations can be made at a repository level. The interface is defined and agreed upon. The developer working on the repository can then store data how she sees fit.
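To make the testing benefit concrete, here is a minimal sketch that uses a hand-written fake instead of a mocking library. FakePersonRepository, PersonLogic, and PersonLogicCheck are hypothetical names introduced only for this illustration, and PersonLogic is a trimmed-down stand-in for a logic class that takes just the person repository. No data context or database is involved.

```csharp
using System;
using System.Collections.Generic;

public class PersonModel
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public interface IPersonRepository
{
    PersonModel GetPersonByName(string name);
}

// Hypothetical in-memory stand-in for PersonRepository (an assumption for this sketch).
public class FakePersonRepository : IPersonRepository
{
    private readonly Dictionary<string, PersonModel> _people =
        new Dictionary<string, PersonModel>();

    public void Add(PersonModel person)
    {
        _people[person.Name] = person;
    }

    // Throws KeyNotFoundException when the name is unknown.
    public PersonModel GetPersonByName(string name)
    {
        return _people[name];
    }
}

// Hypothetical logic class that depends only on the repository interface.
public class PersonLogic
{
    private readonly IPersonRepository _personRepo;

    public PersonLogic(IPersonRepository personRepo)
    {
        _personRepo = personRepo;
    }

    public PersonModel GetPersonByName(string name)
    {
        return _personRepo.GetPersonByName(name);
    }
}

// Hypothetical check written without a test framework to stay self-contained.
public static class PersonLogicCheck
{
    public static bool ReturnsPersonFromRepository()
    {
        var repo = new FakePersonRepository();
        repo.Add(new PersonModel { Name = "Ada", Age = 36 });

        var logic = new PersonLogic(repo);
        var person = logic.GetPersonByName("Ada");

        return person.Name == "Ada" && person.Age == 36;
    }
}
```

In a real project the same check would normally live in a unit test framework, and the fake could be swapped for a mocking library. The point is only that the logic class can be exercised without touching a database.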
The repository pattern provides us with a nice abstraction for our data, and it is applicable to a variety of languages. The moral of the story is that data access should sit behind a single-responsibility interface. This interface can then be injected into business layers, which add any additional logic.
Top comments (15)
The biggest concern I have with the repository pattern in the .NET/C#/MVC world is that it can lead to a practice where developers fail to take advantage of the power of the database engine (typically SQL Server or Oracle). For example, you suggested optimizing at the repository level. This can easily lead to a situation where inefficient queries are used and powerful database features, such as stored procedures, indexes and schemas, are ignored.
Really?
I think EF's support for Stored Procedures is great.
It's abstracted to the point of just calling a method on your DbContext
If devs don't take advantage of this, you can't really blame EF.
Basically, the problem comes from developers tying EF directly to the table structure of the database and not using views, stored procs and so forth to take advantage of the power of the database. Then, they use LINQ to handle all the sorting and filtering of the data. When you walk into a legacy system where it was developed this way it can be very difficult to change.
But doesn’t LINQ to EF delay execution so stuff like order by is implemented at the database level, and not in memory, effectively leveraging the power of the database?
Stored procedures are faster because they have a predictable execution plan that takes full advantage of the database engine, primarily using a cached execution plan. While LINQ may delay execution, the execution plan in the database is reconstructed each time a query is run and can often result in performance issues.
Complex queries that involve joins, groupings and such between tables or pulling data from tables in other databases are usually very inefficient in LINQ as compared to SQL. This is something that's more common when dealing with existing legacy databases in a corporate IT environment vs a "green field" new development situation. A lot of this data isn't laid out in a way that's particularly friendly to LINQ or EF in general.
LINQ via EF is also considerably slower when it comes to bulk insert, update and deletes.
The advantage of LINQ primarily lies in the ability to shield .NET developers from having to learn how to develop and maintain good SQL. It also makes debugging easier for them since they don't have to learn how to debug queries or how to craft execution plans.
Okay, but you're bouncing all over the place now, and not replying to my statements directly.
I understand that stored procedures can be more efficient, but again, EF has support for just calling stored procedures or views, so it's not EF's fault if devs don't use it.
If I "sort" in LINQ using LINQ to EF, it just adds an ORDER BY to the query it sends to the DB when you finally call .ToList(), so it is doing that on the database.
I don't understand what you mean by LINQ vs SQL... LINQ isn't an ORM, it's just a way for devs to write SQL using connectors like LINQ to SQL. The advantage of LINQ lies in its ability to save .NET developers the time of wiring up a query and an object mapping when all they need to do is db.Books.Where(book => book.Author == "George R.R. Martin");
Using EF doesn't prevent developers from doing any of the powerful database things that you suggest doing. Maybe some developers don't know how to take advantage of these things, and that's what you've had to experience, but that's not EF's fault.
EF gives you the ability to decide where you leverage Stored Procedures and views, where you leverage ADO.NET and even where you can write your own SQL. The only thing EF does is track your entities in memory to know when they've changed. If you know when and where to leverage the database, EF will permit you to do so.
As Sam said you can switch from Lazy loading to Eager loading within EF in a few clicks with Lazy loading being the most used while Eager loading is good in some specific scenarios where you know that connected objects will be used in the future.
Also, use IQueryable vs IEnumerable to get the sorting done on the server.
However, it is evident that EF is slower than traditional SQL principles (connected and disconnected scenarios), with the connected one being the fastest in pure speed. It's all about performance vs. speed of development, and it really depends on the project you are working on.
I would go for EF with LINQ if performance is not a top priority, just because it saves so much time.
Yes, rogue repositories are a very real thing, especially in this example where I am using LINQ. Newer developers can often make very costly mistakes using LINQ with EF. How have you seen sprocs used in a repository pattern? Is there a good rule of thumb there?
The most effective way I've used it was in Oracle, where I could use schemas and packages to categorize the data repositories. The method I used was to require that the procedure or view results match the associated interface. The calls in the backend code were similar to what Sam illustrated above, with some additional checking to ensure valid objects were being returned.
As with everything, unit testing of these calls was essential, including checking the time to ensure the queries were efficient.
I would say that the CompanyLogic class should not know about two different models or sets or whatever you call it. You could split that into two different classes like PersonCompanyLogic and CompanyLogic, each in charge of their own models.
That is a great suggestion Ivan and I agree. Separate logic layers is common but it is usually based on the logic behavior.
Thank you for the article
Found in 7th code snippet:
ICompanyDataContextcompanyDataContext
_companyDataContext= personDataContext;
Should be _companyDataContext= companyDataContext
I think, in same snippet
PersonModel GetCompanyByName(string name);
should be CompanyModel GetCompanyByName(string name);
Actually the simplest article I have ever read regarding design patterns! Thank you so much.
I have one question: why did you use the PersonService method to initialize IPersonRepository and ICompanyRepository? Shouldn't they be initialized in the constructor?
There is a bug in this piece of code:
public interface ICompanyRepository
{
PersonModel GetCompanyByName(string name);
}