Entity Framework Vote of No Confidence - relevant in .NET 4?

.Net Problem Overview

I'm deciding on an ORM for a big project and was determined to go for ADO.NET Entity Framework, specifically its new version that ships with .NET 4. During my search for information on EF I stumbled upon ADO .NET Entity Framework Vote of No Confidence which I'm not sure how to take.

The Vote of No Confidence was written sometime in 2008 to convince Microsoft to listen to specific criticism for EF v1.

It's not clear whether the claims made in the Vote of No Confidence are still valid (in .NET 4) and whether they're serious enough to justify using other solutions. NHibernate is a mature alternative, but I don't know what problems it brings. I'm generally more inclined towards a Microsoft solution, mainly because I can count on integration with VS and on their developer support.

I would appreciate examples of how the problems mentioned in the Vote of No Confidence affect real-world projects. More importantly, are the claims made there still relevant in EF for .NET 4?

.Net Solutions

Solution 1 - .Net

I've always felt that much of what underlay the "vote of no confidence" was an attempt to use the EF as if it were an NHibernate clone. It isn't, and even in EF 4 attempting to use the EF as though it were an NHibernate knockoff is probably going to end in failure, although you may get a little further before failing. As a trivial example, most people use LINQ in NHibernate on a minimal basis, if at all, whereas I don't think you can be productive in EF at all unless you use LINQ quite heavily.

On the other hand, I've been quite successful at using the EF 1 on its own terms, and have managed to not allow claims people make in blog posts to get in the way of making it work for me. I look forward to using many of the new features in EF 4, but I'd be happy to work on a well-structured EF 1 project any time. (For that matter, I'm happy to work with NHibernate, too, and wouldn't criticize it for not acting like the EF.)

So I'm trying to suggest, in a somewhat delicate way, that before you can decide if "the claims made in the Vote of No Confidence are still valid (in .NET 4)..." you must first decide if those claims were ever valid for you and the way you work. If your personal understanding of O/R is hard-wired to NHibernate, then EF 4 is probably still going to seem second-rate to you. If, on the other hand, you're willing to learn the EF way of working, then probably even EF 1 will seem better than you've heard.

To address the "no confidence" claims directly, and examine both their substance and what's changed in EF 4:

> INORDINATE FOCUS ON THE DATA ASPECT OF ENTITIES LEADS TO DEGRADED ENTITY ARCHITECTURES:

This is a misunderstanding of the Entity Framework's entity data model. (Or, a difference of opinion, if you prefer.) But either way, it's a feature, not a bug. The Entity Framework is designed around the more general case of data services, not just O/R modeling in particular. Putting behaviors on entities returned from a data service leads to a CORBA-style disaster. Unlike ORMs where you are, to some degree, stuck with whatever type comes out of the ORM black box, with the Entity Framework model you are expected to project onto business types. When you do, the mapped entity types will never even be materialized.

This is a substantive difference between the Entity Framework model and many other ORMs. Personally, I find separating business behaviors from O/R mapping to be quite a bit cleaner than lumping them together. You don't have to agree with this idea, but it is clearly a design decision, not an oversight.
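As a rough illustration of what "projecting onto business types" can look like, here is a minimal sketch. The Order, Customer, and OrderSummary types are hypothetical, and an in-memory list stands in for the EF context's order set so the snippet runs on its own. Against a real EF query, the same Select would be translated to SQL that fetches only the projected columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mapped entity types; in a real project these come from the EF model.
public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public string Status { get; set; }
    public Customer Customer { get; set; }
}

// Hypothetical business type that the query projects onto.
public class OrderSummary
{
    public int OrderId { get; set; }
    public string CustomerName { get; set; }
}

public static class OrderQueries
{
    // Accepts any IQueryable<Order>: an EF query in production (assumption),
    // or an in-memory list wrapped with AsQueryable() as in Main below.
    public static IQueryable<OrderSummary> NewOrderSummaries(IQueryable<Order> orders)
    {
        return orders
            .Where(o => o.Status == "NEW")
            .Select(o => new OrderSummary
            {
                OrderId = o.Id,
                CustomerName = o.Customer.FirstName + " " + o.Customer.LastName
            });
    }
}

public static class Program
{
    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { Id = 1, Status = "NEW",
                        Customer = new Customer { FirstName = "Ada", LastName = "Lovelace" } },
            new Order { Id = 2, Status = "SHIPPED",
                        Customer = new Customer { FirstName = "Grace", LastName = "Hopper" } }
        }.AsQueryable();

        foreach (var summary in OrderQueries.NewOrderSummaries(orders))
        {
            Console.WriteLine(summary.OrderId + ": " + summary.CustomerName); // prints "1: Ada Lovelace"
        }
    }
}
```

The business logic then works with OrderSummary and never sees the mapped Order type.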

> EXCESS CODE NEEDED TO DEAL WITH LACK OF LAZY LOADING:

EF 4, for better or worse, has lazy loading.

I say "for better or worse" because lazy loading makes it very easy to generate excess database queries. It works fine so long as you keep a close eye on what's going on under the hood, but most people don't do that. I find projection to be a better alternative to lazy loading, eager loading, or explicit loading most of the time.

Still, there are times when lazy loading is convenient. So I'm glad to see it added in EF 4.
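To make the "excess database queries" concern concrete, here is a small simulation. It does not use the Entity Framework at all: FakeDb and LazyOrder are hypothetical stand-ins that only count round trips, and the simulation ignores any caching a real context would do. It contrasts lazily loading a related value per row with fetching everything in one projected query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database; it only counts round trips.
public class FakeDb
{
    public int QueryCount { get; private set; }

    private readonly Dictionary<int, string> customers = new Dictionary<int, string>
    {
        { 1, "Ada" }, { 2, "Grace" }
    };

    // Each row is { orderId, customerId }.
    private readonly List<int[]> orders = new List<int[]>
    {
        new[] { 10, 1 }, new[] { 11, 2 }, new[] { 12, 1 }
    };

    // One round trip: loads orders only; customer names are fetched on first access.
    public List<LazyOrder> LoadOrders()
    {
        QueryCount++;
        return orders.Select(o => new LazyOrder(o[0], o[1], this)).ToList();
    }

    // One round trip per call, as a lazily loaded navigation property would issue.
    public string LoadCustomerName(int customerId)
    {
        QueryCount++;
        return customers[customerId];
    }

    // One round trip: a projection that returns order id and customer name together.
    public List<string> LoadOrderLines()
    {
        QueryCount++;
        return orders.Select(o => o[0] + ": " + customers[o[1]]).ToList();
    }
}

public class LazyOrder
{
    private readonly Lazy<string> customerName;

    public int Id { get; private set; }

    public LazyOrder(int id, int customerId, FakeDb db)
    {
        Id = id;
        customerName = new Lazy<string>(() => db.LoadCustomerName(customerId));
    }

    // The first access triggers a simulated query.
    public string CustomerName { get { return customerName.Value; } }
}

public static class Program
{
    public static void Main()
    {
        var lazyDb = new FakeDb();
        foreach (var order in lazyDb.LoadOrders())
        {
            Console.WriteLine(order.Id + ": " + order.CustomerName);
        }
        Console.WriteLine("Lazy loading: " + lazyDb.QueryCount + " queries"); // 1 + 3 = 4

        var projectionDb = new FakeDb();
        foreach (var line in projectionDb.LoadOrderLines())
        {
            Console.WriteLine(line);
        }
        Console.WriteLine("Projection: " + projectionDb.QueryCount + " queries"); // 1
    }
}
```

In the simulation the lazy path costs one query for the list plus one per row, which is the pattern that is easy to miss unless you watch what goes to the database.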

> SHARED, CANONICAL MODEL CONTRADICTS SOFTWARE BEST PRACTICES:

It's hard to know what to make of this, as some of the supporting text isn't even coherent English, e.g.:

> The failure-prone canonical model approach wasn’t difficult for a lack of elaborate tooling along the lines of the Entity Framework.

This section seems to suggest that the Entity Framework imposes some kind of requirement, or at least strong bias, towards using a single, canonical data model for a complex system. I'm not sure I agree, but it's difficult to tell, given the lack of any specific example in this section. So I'll tell you my own biases on the subject, and you can agree or disagree with me:

It is often a mistake to use a single model for a large system, depending upon how large the system actually is. Nothing in the Entity Framework requires you to use a single model, however. On the other hand, the Entity Framework, especially in version 1, does not go out of its way to make it easy to combine multiple models.

Now, a single, large application for a complex system can be as big of a mistake as a single, large data model. So it would not be correct for the Entity Framework to make it easy to combine many tiny models into one overly large application; that would simply replace one problem with another.

On the other hand, I think it does make sense to make it easy to build a large system out of services partitioned in a way which suits the problem domain. I think that WCF data services, a separate technology from the Entity Framework, but one which supports the Entity Framework very well, are useful for this.

I do think that the Entity Framework could, in some future version, make it easier to combine two or three models into a single application when necessary. You can do this now, but there is some manual work involved. But as I said above, I wouldn't want to "fix" an issue of an overly large data model by facilitating/encouraging creation of an overly large application.

> LACK OF PERSISTENCE IGNORANCE CAUSES BUSINESS LOGIC TO BE HARDER TO READ, WRITE, AND MODIFY, CAUSING DEVELOPMENT AND MAINTENANCE COSTS TO INCREASE AT AN EXAGGERATED RATE:

This section makes claims which I find erroneous:

> The Entity Framework encourages the Anemic Domain Model anti-pattern by discouraging the inclusion of business logic in the entity classes.

See above. I think that the job of the entity types is to map between relational space and object space. Per the Single Responsibility Principle, these types should only need to be modified when their sole job changes. If business processes change, then this is a responsibility unrelated to O/R mapping. Perhaps limitations of other ORMs impose a technical barrier on separating these responsibilities. It's okay to bend rules when technology dictates, if the cost of design purity is excessive. But I strongly favor the approach of behavior-less entity types.

> In its current state, EF entity classes cannot be effectively unit tested independently of the database.

This is just wrong. Whoever wrote this didn't understand what they were talking about. None of our unit tests touch the DB, ever, and many involve the EF.
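One common way to keep such tests away from the database (not necessarily how the tests mentioned above are written) is to have the code under test depend on an IQueryable source instead of on the context itself. The sketch below assumes a hypothetical IOrderSource interface, OrderService class, and in-memory fake. A production implementation of IOrderSource would expose the EF context's order set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity type.
public class Order
{
    public int Id { get; set; }
    public string Status { get; set; }
}

// Hypothetical abstraction; the production implementation would return the EF query.
public interface IOrderSource
{
    IQueryable<Order> Orders { get; }
}

// Code under test: it only knows about IQueryable<Order>.
public class OrderService
{
    private readonly IOrderSource source;

    public OrderService(IOrderSource source)
    {
        this.source = source;
    }

    public List<int> IdsWithStatus(string status)
    {
        return source.Orders
            .Where(o => o.Status == status)
            .OrderBy(o => o.Id)
            .Select(o => o.Id)
            .ToList();
    }
}

// Test double backed by an in-memory list; no database involved.
public class FakeOrderSource : IOrderSource
{
    private readonly List<Order> orders;

    public FakeOrderSource(List<Order> orders)
    {
        this.orders = orders;
    }

    public IQueryable<Order> Orders
    {
        get { return orders.AsQueryable(); }
    }
}

public static class Program
{
    public static void Main()
    {
        var service = new OrderService(new FakeOrderSource(new List<Order>
        {
            new Order { Id = 3, Status = "NEW" },
            new Order { Id = 1, Status = "NEW" },
            new Order { Id = 2, Status = "SHIPPED" }
        }));

        Console.WriteLine(string.Join(",", service.IdsWithStatus("NEW"))); // prints "1,3"
    }
}
```

A caveat with this approach: the in-memory fake runs LINQ to Objects, which accepts some expressions that a database-backed provider cannot translate, so it verifies the logic and not the translation.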

As far as the substance of this section's title goes, there is a change in EF 4: it is now possible to have entirely persistence-ignorant entity types, if that helps your design. However, from the earliest version of the Entity Framework onwards, you have been able to project onto POCOs, so persistence ignorance has always been available when required. Having persistence ignorance on the entity types themselves allows change tracking with a persistence-ignorant object, which may be useful in some cases. But that is a substantially narrower set of cases than the bogus claims about unit testing suggest, which considerably weakens the point the document makes.

> EXCESSIVE MERGE CONFLICTS WITH SOURCE CONTROL IN TEAM ENVIRONMENTS:

Is merging XML actually that difficult? If so, perhaps one should look into a new merge tool. I don't find this problematic.

However, there is a real issue here, although, again, it's a lot more narrow than the document claims. Rather than repeat myself, I'll just point you towards my post on that subject.

In EF 4, one can use code-first models rather than XML models in order to split up a model into many different files.

Solution 2 - .Net

Entity Framework has improved since version 1, and this blog post from an NHibernate contributor compares NHibernate and Entity Framework 4.0.

Solution 3 - .Net

EDIT: THIS USED TO BE TRUE, BUT IT'S NOT ANYMORE.

As a person who has used both Entity Framework and NHibernate... I strongly suggest NHibernate. Normally if a FOSS and an MS tech are both available, I suggest the MS tech, but I strongly disagree with that for EF. I use EF4 on a day-to-day basis at work, and we have to create a lot of workarounds because of EF. Two years ago I used EF for about one year, and then I changed companies, and I've been working with EF for the past year. NHibernate, as it was two years ago, is still ahead of EF4.

Here are the points they brought up.

Excessive Merge Conflicts With Source Control in Team Environments:

This was partially fixed, from what I hear. They moved the position data for the designer to the bottom of the .edmx, so it's no longer a horrible problem, but still annoying. If a co-worker and I both modify the .edmx at the same time, we tend to get horrible merge conflicts because the entire bottom of the file is used to store the position data of the tables in the designer. Our workaround is to use SVN file locking so we don't double-edit it. Or I ignore locking, and if I get a merge conflict, I just accept their changes and redo my work. Most of my changes aren't so large that they take very long to redo. If this were the only problem, I'd live with it.

If you're using code-first (in EF 4.1) this is a non-issue.

Excess Code Needed to Deal With Lack of Lazy Loading:

They added lazy loading in 4.0.

But its loading still performs terribly. Eager loading, which is a common optimization when you need to speed up your code, is slow. I'm running into cases where I have to make 10k+ calls to the database when I'd rather use eager loading. We've timed it, and in many cases it's faster to make multiple database calls (doing myobject.TablenameReference.Load() in a loop) to a local database than it is to use .Include("Tablename"). Yes, I know it's very counter-intuitive, but the numbers don't lie. Also, there's no way to specify the fetch strategy, so you can't choose whether to join-fetch or not. So I'd say it's improved, but not nearly as good as NHibernate.

Inordinate focus on the data aspect of entities leads to degraded entity architectures:

Yeah, this is still just as true. Again, a good example is order.Status. We'd really like that to be an enum, but because of the way EF is designed, we have no choice but to make it a string. They may fix the enum problem in the future, but the true complaint is the lack of control over how the mapping between the table and the object is done. NHibernate lets you specify methods for doing mappings to deal with this case.

To extend a point from Craig Stuntz's response, EF is designed for the case where you take the data model and select directly from it (e.g. myModel.Orders.Where(order => order.Status == "NEW").Select(order => new { order.Customer.FirstName, order.Customer.LastName })). EF's model ends up being really hard to write automated tests around if you don't want to hit the DB. If you want a repository, where you ask for an object that meets some criteria and it returns the whole object, that's where NHibernate works better (e.g. var order = myOrderRepository.GetByStatus(OrderStatus.New)).

Another problem I have with EF is its total lack of extensibility. One problem we have is that we use enums for order status. But if we do myModel.Orders.Where(order => order.Status == OrderStatus.New.ToString()), EF fails at runtime on that query because it doesn't know how to translate the .ToString() method. It uglifies our code a lot because we can't add support for that. Also, a lot of methods are internal, so when we need some unusual behavior, we can't get it.
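For what it's worth, a workaround often used for the .ToString() case is to evaluate the enum's string form before building the query, so the expression tree only contains a plain string value. The sketch below assumes the failure comes from LINQ to Entities being unable to translate ToString(). The Order and OrderStatus types are hypothetical, and an in-memory list stands in for the EF query so the snippet runs on its own (LINQ to Objects would accept either form, so this shows the shape of the workaround, not the failure).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum OrderStatus { New, Shipped }

// Hypothetical entity type; Status is stored as a string, as described above.
public class Order
{
    public int Id { get; set; }
    public string Status { get; set; }
}

public static class OrderFilters
{
    public static IQueryable<Order> WithStatus(IQueryable<Order> orders, OrderStatus status)
    {
        // Call ToString() outside the lambda so the query only captures a string variable.
        string statusText = status.ToString();
        return orders.Where(o => o.Status == statusText);
    }
}

public static class Program
{
    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { Id = 1, Status = "New" },
            new Order { Id = 2, Status = "Shipped" }
        }.AsQueryable();

        foreach (var order in OrderFilters.WithStatus(orders, OrderStatus.New))
        {
            Console.WriteLine(order.Id); // prints "1"
        }
    }
}
```

This keeps the enum in the calling code, although the column is still a string and the conversion still has to be written by hand.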

If you're using NHibernate, LINQ adds a lot of features that make it much better. Using a conventions-based model, it requires very little code for most of your mappings. If you're using an existing database, NHibernate lets you specify non-standard conventions to use and then maps everything accordingly, and everything is easily managed. EF 4.0 doesn't have support for anything like that, and I don't think 4.1 does either.

I hope this helps you out.

Solution 4 - .Net

There is lazy loading in EF 2, http://microsoftpdc.com/Sessions/FT10?type=wmvhigh

Content Type	Original Author	Original Content on Stackoverflow
Question	Asaf R	View Question on Stackoverflow
Solution 1 - .Net	Craig Stuntz	View Answer on Stackoverflow
Solution 2 - .Net	Arve	View Answer on Stackoverflow
Solution 3 - .Net	Zackary Geers	View Answer on Stackoverflow
Solution 4 - .Net	robert_d	View Answer on Stackoverflow