What's wrong with databases?
I won't try to sell you the idea that stored procedures are the best and that everybody needs to throw away all their ORM code and move to the database world. I believe that every team has to choose the tool that not only helps them solve their problem but also makes them more productive.
Whether you landed on a project with many stored procedures, you're a DB geek, or you simply consider stored procedures the right tool for your job, you will face many challenges if you want to improve your team's productivity.
What I noticed is that sometimes, even when coding a feature in a stored procedure is the right choice, a developer won't go that way just because she wouldn't feel productive with that technology. And that's usually related to one or more of these problems:
A shared database
Teams tend to have only one SQL Server with a bunch of different versions of the same database, e.g. Dev, Test or CI.
So, if one developer makes a change in the DEV database it will affect other developers immediately. Some teams will try to mitigate this problem by altering the database as little as possible, and by @channel'ing on Slack, sending an email or even yelling in the office when they do so.
Lack of source control
Maybe you've never thought about this, but source control plays an important part in your team's productivity. The easier you can integrate your code, track changes or switch between tasks, the more productive you are going to be.
Very few teams put their database under source control. How would you do that? Don't worry, we'll get there.
No debugging
Are you telling me that people debug procedures? Yeah... I think only two developers do that, but it's possible.
Being productive
These are some tips and tools I've found with my team that helped us to stay productive over time in this scenario.
The database is part of your code
Even if you don't use stored procedures, your table definitions are part of your code. OK, you might say "I'm a code-first developer, I don't care about the database" and that's fine, but this post won't help you at all.
But if you are a "database first" developer, the table definition should be part of your code. If you add a new field in a table you will need to create a branch, add that new field to the table definition and commit everything into the same branch as the code using that new field.
The benefit is quite obvious: when your PR is accepted, not only will your code be merged but also that new field in the database.
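In the project, a table is just a .sql file containing its CREATE TABLE statement, so the new field is one more line in that file. Here's a sketch using Northwind's Customers table (the LoyaltyLevel column is invented for the example):
-- Tables\dbo.Customers.sql (file inside the SQL project)
CREATE TABLE [dbo].[Customers]
(
    [CustomerID]   NCHAR (5)     NOT NULL PRIMARY KEY,
    [CompanyName]  NVARCHAR (40) NOT NULL,
    [ContactName]  NVARCHAR (30) NULL,
    [LoyaltyLevel] INT           NULL  -- the new field, committed in the same branch as the code that uses it
);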
How can this be accomplished?
I found the solution to this problem in Visual Studio. Let's take a look at it.
Creating a DB Project
Let's use our well-known Northwind database as an example. Just go to Visual Studio, click "New project" and look for "SQL Server Database project".
Import your current database
I bet you already have a database; if so, you will be able to import the whole database definition using the import tool.
Then you will need to set up the database connection and import settings; let's keep this simple.
Voilà! You now have your database ready to be committed into your repository.
One developer, one database
Perfect, we have our database under source control, now what?
The next thing you must have is one database per developer, where she has the freedom to update or recreate it whenever needed. You can accomplish this in three ways. OK, there might be more ways, but these are the three I recommend:
Shared server
This could be the easiest solution if you already have one server used by all developers. You could use a naming convention to identify each database, something super smart such as <MyProject>Dev<DevName>.
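For example, with Northwind on a shared server you would end up with one database per developer; the names below are just an illustration of the convention:
-- hypothetical per-developer databases following the <MyProject>Dev<DevName> convention
CREATE DATABASE NorthwindDevAlice;
CREATE DATABASE NorthwindDevBob;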
Pros
- There isn't much more to set up; just create one database per dev and that's all.
- Collaboration. Devs can access each other's database making it easier to collaborate.
Cons
- If you are not using a shared server right now you might need to set up one server and think about security, VPNs and things like that.
- It might be a slower experience for the developer compared with a local environment.
- One developer might slow down your server if she is executing some heavy task.
LocalDB
LocalDB is a lightweight version of the SQL Server Express Database Engine that is targeted for program development.
You might not know it but if you have Visual Studio installed there is a high chance that you already have LocalDB installed. You can check this out by opening the SQL Server Object Explorer.
If you don't find a LocalDB there you can see how to install it here. There are other versions you can install, for instance the full Microsoft SQL Server Developer Edition.
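You can also check from the command line with the SqlLocalDB utility that ships with it; MSSQLLocalDB is the usual default instance name, though it may differ depending on your Visual Studio version:
REM list the LocalDB instances on this machine
sqllocaldb info
REM start the default instance and verify you can connect to it
sqllocaldb start MSSQLLocalDB
sqlcmd -S (localdb)\MSSQLLocalDB -Q "SELECT @@VERSION"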
Pros
- Your LocalDB is already there; you just need to use it.
- It's a local database, you won't need a VPN or request access to a server.
- A local database will speed up your environment, especially for remote developers.
Cons
- Running a SQL Server process could be quite expensive for your computer, slowing down your development environment.
Local Server on Docker
This solution is similar to the localDB one, but better suited if you use Mac OS.
If you run Visual Studio in a virtual machine on Mac OS, running your SQL Server in a Docker container (directly on Mac OS) will speed up your virtual machine. I would strongly recommend this solution if you use Mac OS.
You can read about running SQL Server on Docker here.
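A minimal way to get a container running, assuming Docker is already installed on the Mac, looks like this (the image tag, container name and SA password are just examples):
# run SQL Server in a Linux container, exposing the default port 1433
docker run -d --name northwind-sql \
  -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=YourStrong!Passw0rd" \
  -p 1433:1433 mcr.microsoft.com/mssql/server:2017-latest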
Developers need data
As each developer will have her own database, having data to create a useful and ready-to-use database becomes another very important piece of the project. It's very easy to create all this data if you already have an existing database.
First let's create a Data folder where we'll save all our insert scripts.
SQL Server Management Studio has a very cool tool to generate scripts, it is called... drum roll... Generate Scripts!
Once there, we need to pick the tables we want to generate.
Then we select the location; we need to enter the Data folder we created in the SQL project.
We need to go to the Advanced option and indicate that we only want the data.
And we are done!
Now if we go back to the SQL Project we'll see all the scripts we’ve just generated, let's include those files in the project.
By default, when we include files in a project the Build Action is set to Build, but as these files are just insert scripts we need to set them to None.
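Under the hood this just changes the item type in the .sqlproj file, which is a regular MSBuild file; the file names below are the hypothetical ones generated in the previous step:
<ItemGroup>
  <!-- data scripts are not compiled into the dacpac, so they use the None build action -->
  <None Include="Data\dbo.Customers.data.sql" />
  <None Include="Data\dbo.Orders.data.sql" />
</ItemGroup>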
So, how do we make use of these scripts? We need to create a Post Deployment Script and include all these inserts there.
So now we can complete this file with all the insert scripts we created before.
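The post-deployment script supports SQLCMD syntax, so each data script can be pulled in with the :r include directive. A sketch, using the hypothetical file names from the Data folder:
/*
 Script.PostDeployment.sql
 Runs after every publish; the :r includes rely on SQLCMD mode, which the SQL project uses for this file.
*/
:r .\Data\dbo.Customers.data.sql
:r .\Data\dbo.Orders.data.sql
:r .\Data\dbo.OrderDetails.data.sql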
DB Syncing
Perfect! We have a SQL Project with our tables, stored procedures and even our test data! Now we have to use it.
Publish tool
Publish is the tool you would use to create the database from scratch. So let's go to the publish option.
Once there, we just need to pick our connection string and the destination database.
Under the advanced section there are some interesting options, like "Always re-create database" or "Block incremental deployment if data loss might occur".
Then, just hit publish and voilà! You have a new database with all your tables populated and ready to be used!
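If you'd rather script this step, the same publish can be done from the command line with SqlPackage against the dacpac the project builds; the paths and target names below are placeholders:
REM publish the compiled dacpac to a developer database from the command line
SqlPackage.exe /Action:Publish ^
  /SourceFile:"bin\Release\NorthwindDB.dacpac" ^
  /TargetServerName:"(localdb)\MSSQLLocalDB" ^
  /TargetDatabaseName:"NorthwindDev"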
Schema Compare tool
Schema compare is the tool that you would use every day, as you might guess, to compare your SQL code with your database and apply the changes.
Once there you will be able to see not only what objects are going to be changed but also what the changes are.
Before hitting "Update" you might want to go to the Schema compare options and uncheck "Block on possible data loss", because if you want to apply the changes made in another branch you might have to accept that some data will be lost.
ProcUpdater
Schema compare is perfect at its job: it can create a database from scratch, create users and schemas, handle table and column renames, everything. But it can also slow down your development process. The compare step might take up to 3 minutes and the update itself another 3 minutes; if you do that ten times a day you end up wasting one hour waiting for the DB to sync up.
So I created a small app called ProcUpdater (yes, I was inspired that day). ProcUpdater will watch your Stored Procedures folder and apply your stored procedure changes immediately. It can even apply all the stored procedure changes when you switch branches. It's not perfect, it won't apply table changes, but it could save you one hour a day!
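If you don't want a dedicated tool, a rough manual equivalent is to apply only the file you just changed with sqlcmd instead of running a full compare. This assumes the script can be re-run against an existing database (for instance because it uses CREATE OR ALTER PROCEDURE); the database and file names are examples:
REM apply a single stored procedure change without waiting for a full schema compare
sqlcmd -S (localdb)\MSSQLLocalDB -d NorthwindDev -i "StoredProcedures\dbo.CustOrderHist.sql"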
CI
This workflow fits perfectly if you need to run integration tests in a CI environment. If you use a Windows-based CI environment like AppVeyor you will be able to deploy your database in that environment every time you build a branch. You just need to build the project and run msdeploy.exe. It could be something like this:
msbuild "NorthwindDB.sqlproj" /p:Configuration=Release /p:Platform="Any CPU"
"C:\Program Files\IIS\Microsoft Web Deploy V3\msdeploy.exe" -verb:sync -source:dbDacFx="NorthwindDB.dacpac" -dest:dbDacFx="...",dacpacAction=Deploy,CreateNewDatabase=True'
Final words
I usually say that most of the time you can hide complexity but not remove it completely. If your project uses lots of stored procedures we cannot remove that complexity (and we haven't even talked about dealing with staging environments), but we can speed up the process and make your team much more productive.
Comments
What bothers me most with using Visual Studio for my (very procedure-heavy and db-centric) database project is that Visual Studio encourages you to use a state-based publishing strategy. It works fine as long as you're working with stored procedures only or are able to create all the tables from scratch if you have to change them.
The moment you try to do migrations like alter table or even data migrations (you might need them at some point if you for example also store metadata in the database or if you want to refactor your data tables) there are massive pitfalls with that approach.
I started using Flyway some months ago for the database migrations and was able to overcome serious problems we previously had with the publish approach of Visual Studio (also blogged about it: cleandatabase.wordpress.com/2017/1...)
Regarding the migration process, Visual Studio keeps track of refactors in a refactorlog file. So if you rename the table Foo to Var it won't try to drop Foo and create Var; it'll perform a table rename.
But when we start talking about data... yeah, that's a headache.
I'll take a look at your post!
One thing that has always bugged me about Entity Framework and other .NET ORMs is how they encourage not leveraging the power of the database. Instead, you get monstrously complex layers built on the Repository and Unit of Work patterns to handle transactions and such. This is what SQL Server, Oracle and most other relational DBs have been specifically engineered over decades of development to do very well. It's like buying a bloodhound and then trying to sniff out your quarry yourself while the dog sits on the porch.
Agreed.
Another thing that's hard for me to buy is the Code First approach. I think you need to know and take control of the database your app will use. There is a higher chance of changing your ORM or even the platform you are working on than your database. You have to know your database.
Hi Dario. Very nice article.
I have some additional questions about DB migrations. You've described integration of your tool with the CI process in the case of integration tests. I assume that you create a new database before each test run. My question is: what is your approach to database updates on specific environments like test, stage or prod? Do you do it via CI or manually? Did you ever need to perform some data comparisons during such an update?
Hey Rafal, thanks!
Our workflow is something like this. When the CI grabs a branch to be tested it creates a brand new database using the command I shared in the CI section.
When a Pull Request is approved and a branch merged, our CI will test that commit again (to test how it works with the code integrated in the main branch) and then it will run the same command to update our test database, but this time with the arguments “CreateNewDatabase=false,DropObjectsNotInSource=false”.
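For reference, that is the same msdeploy call from the CI section with those two extra properties; the destination connection string stays elided here:
"C:\Program Files\IIS\Microsoft Web Deploy V3\msdeploy.exe" -verb:sync -source:dbDacFx="NorthwindDB.dacpac" -dest:dbDacFx="...",dacpacAction=Deploy,CreateNewDatabase=False,DropObjectsNotInSource=False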
After that another build will test our master (production) branch against that database to test that we’re not going to break production when that code gets to staging.
We didn't get to the point of automating the staging and production deploys (I think we are close), but what we are doing now is using Visual Studio to compare our code with the production server: we take a look at the diffs, check that everything looks as expected and then let Visual Studio do its job. We also have a deploy folder with data scripts we use to populate new tables or fields or fix data issues. This part is a manual step, but it could be easily automated.
Thanks again for the detailed answer and for sharing your experience.
Your post is great! Good to know!
But IMHO, I think everything will be simpler if you use a library like FluentMigrator. I also created a custom one for myself here: github.com/minhhungit/DatabaseMigr...
Yes, you can track the history of stored procedures, functions... with source control very easily
And more...
Hi Darío.
Nice write-up :)
I wonder why you add your INSERT data SQL as post-publish scripts? I guess it's fine for creating development databases but what happens when you deploy to production?
Thank you Jakob! I like how many of you are finding out the dots I left out :)
So yes, Script.PostDeployment.sql is a script used to populate a brand new database. In order to solve what you mentioned we use different configurations: besides "Debug" and "Release" we created a "deployDB" configuration.
We manually edit the sqlproj file to indicate we want to include Script.PostDeployment.sql only when the configuration is "deployDB".
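In MSBuild terms, that manual edit could look roughly like this (the exact layout of your .sqlproj may differ):
<ItemGroup Condition="'$(Configuration)' == 'deployDB'">
  <!-- the post-deployment script is only part of the build for the deployDB configuration -->
  <PostDeploy Include="Script.PostDeployment.sql" />
</ItemGroup>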
In the CI, instead of executing msbuild in "Release" we call it in "deployDB" mode with "/p:Configuration=deployDB".
You could also do that to publish to production. You can create another configuration called, let's say, "publishToProd", create your own PROD.PostDeployment.sql and there set up all the scripts you want to be executed when you publish to prod.
I think this deserves another post :)