When working with object-relational mapping (ORM) frameworks such as Entity Framework, we often have to choose between one of the three available approaches to model the data structure: Model-First, Database-First, and Code-First. Each one of them comes with its fair amount of advantages and disadvantages, as the experienced readers and seasoned DB developers will most certainly know: nonetheless, it could be useful to spend some words on each one of them to help the less experienced coders in taking the best choice for their specific scenario.
IMPORTANT: We'll mostly talk about Entity Framework / EF Core in the following samples, but the concepts can be easily adapted to any other ORM framework as well.
If we’re not familiar with most IDE design tools such as the Microsoft Visual Studio's XML-based DataSet Schema (XSD) and the Entity Designer Model XML visual interface (EDMX), the Model-First approach can be rather confusing. The key to understand it is to acknowledge the fact that the word Model here is meant to define a visual diagram built with the design tools; that diagram will then be used by the Framework to autogenerate the Database SQL script and the Data Model source code files.
To summarize it, we can say that going Model-First basically means "working on a visual diagram and letting the ORM framework - in our example, the Entity Framework - create/update the rest accordingly":
Such approach has the following benefits:
- We'll be able to create the Database schema and the class diagram as a whole using a visual design tool, which can be great when the data structure is quite big
- Whenever the Database changes, the model can be updated accordingly, without data loss
Yet also the downsides below:
- The diagram-driven, autogenerated SQL scripts can lead to data loss in case of updates. An easy workaround for that will be generating the scripts on disk and manually modify them, which will require decent SQL knowledge.
- Dealing with the diagram can be tricky, especially if we want to have precise control over our Model classes; we won’t always be able to get what we want, as the actual source code will be autogenerated by a tool
Given the disadvantages of Model-First, we can think that Database-First might be the way to go. This can be true if we either have a Database already or don’t mind building it beforehand. That being the case, the Database-First approach is similar to the Model-First one, except that it goes the other way around; instead of designing the EDMX manually and generating the SQL script to create the Database, we build the latter and then generate the former using the Entity Framework Designer tool.
We can summarize it by saying that going Database-First will mean "building the Database and letting Entity Framework create/update the rest accordingly":
Here are the pros of this alternative approach:
- If we have an already-existing Database in place, this will most likely be the way to go as it will spare us the need to recreate it
- Risk of data loss will be kept to a minimum, because any change or update will be always performed on the Database
And here are the cons:
- Manually updating the Database can be tricky if we’re dealing with clusters, multiple instances, or a number of development/testing/production environment, as we will have to manually keep them in sync instead than relying upon code-driven updates/migrations or autogenerated SQL scripts
- We will have even less control over the autogenerated Model classes (and their source code) than using Model-First approach; it will require an extensive knowledge over EF conventions and standards, otherwise we’ll often struggle to get what we want
Last but not least comes the Entity Framework flagship approach since EF4, which enables an elegant, highly-efficient Data Model development workflow. The appeal of this approach can be easily found in its premise; the Code-First approach allows the developer to define model objects using only standard classes, without the need of any design tool, XML mapping files, or cumbersome piles of autogenerated code.
To summarize it, we can say that going Code-First means writing the Data Model entity classes we’ll be using within our project and let Entity Framework generate the Database accordingly:
The Code-First approach comes with the following benefits:
- No need for diagrams and visual tools whatsoever, which can be great for small-to-medium size projects as it will save us a lot of time
- A fluent code API that allows the developer to follow a Convention over Configuration approach, to handle the most common scenarios, while also giving him the chance to switch to custom, attribute-based implementation overrides whenever he needs to customize the Database mapping
Yet it also has these downsides:
- A good knowledge of the ORM programming language and conventions - C# for Entity Framework - is required
- Maintaining the Database can be tricky sometimes as well as handling updates without suffering data loss; the Entity Framework's migrations support, added in EF 4.3 to overcome the issue and continuously updated since then, greatly mitigates the problem, although it also affected the learning curve in a negative way
Taking a choice
As we can easily see by reading the advantages and disadvantages of these three options, there is no such thing as an overall better or best approach; conversely, we can say that each project scenario will likely have a most suitable approach.
However, as long as we're dealing with a rather small project - for example, a microservice - and/or we’re aiming for a flexible, mutable small-scale data structure, adopting the Code-First approach will almost always be a good choice.