Why do we need entity objects?

SqlDatabaseOrmEntities

Sql Problem Overview


I really need to see some honest, thoughtful debate on the merits of the currently accepted enterprise application design paradigm.

I am not convinced that entity objects should exist.

By entity objects I mean the typical things we tend to build for our applications, like "Person", "Account", "Order", etc.

My current design philosophy is this:

  • All database access must be accomplished via stored procedures.
  • Whenever you need data, call a stored procedure and iterate over a SqlDataReader or the rows in a DataTable

(Note: I have also built enterprise applications with Java EE, java folks please substitute the equvalent for my .NET examples)

I am not anti-OO. I write lots of classes for different purposes, just not entities. I will admit that a large portion of the classes I write are static helper classes.

I am not building toys. I'm talking about large, high volume transactional applications deployed across multiple machines. Web applications, windows services, web services, b2b interaction, you name it.

I have used OR Mappers. I have written a few. I have used the Java EE stack, CSLA, and a few other equivalents. I have not only used them but actively developed and maintained these applications in production environments.

I have come to the battle-tested conclusion that entity objects are getting in our way, and our lives would be so much easier without them.

Consider this simple example: you get a support call about a certain page in your application that is not working correctly, maybe one of the fields is not being persisted like it should be. With my model, the developer assigned to find the problem opens exactly 3 files. An ASPX, an ASPX.CS and a SQL file with the stored procedure. The problem, which might be a missing parameter to the stored procedure call, takes minutes to solve. But with any entity model, you will invariably fire up the debugger, start stepping through code, and you may end up with 15-20 files open in Visual Studio. By the time you step down to the bottom of the stack, you forgot where you started. We can only keep so many things in our heads at one time. Software is incredibly complex without adding any unnecessary layers.

Development complexity and troubleshooting are just one side of my gripe.

Now let's talk about scalability.

Do developers realize that each and every time they write or modify any code that interacts with the database, they need to do a throrough analysis of the exact impact on the database? And not just the development copy, I mean a mimic of production, so you can see that the additional column you now require for your object just invalidated the current query plan and a report that was running in 1 second will now take 2 minutes, just because you added a single column to the select list? And it turns out that the index you now require is so big that the DBA is going to have to modify the physical layout of your files?

If you let people get too far away from the physical data store with an abstraction, they will create havoc with an application that needs to scale.

I am not a zealot. I can be convinced if I am wrong, and maybe I am, since there is such a strong push towards Linq to Sql, ADO.NET EF, Hibernate, Java EE, etc. Please think through your responses, if I am missing something I really want to know what it is, and why I should change my thinking.

[Edit]

It looks like this question is suddenly active again, so now that we have the new comment feature I have commented directly on several answers. Thanks for the replies, I think this is a healthy discussion.

I probably should have been more clear that I am talking about enterprise applications. I really can't comment on, say, a game that's running on someone's desktop, or a mobile app.

One thing I have to put up here at the top in response to several similar answers: orthogonality and separation of concerns often get cited as reasons to go entity/ORM. Stored procedures, to me, are the best example of separation of concerns that I can think of. If you disallow all other access to the database, other than via stored procedures, you could in theory redesign your entire data model and not break any code, so long as you maintained the inputs and outputs of the stored procedures. They are a perfect example of programming by contract (just so long as you avoid "select *" and document the result sets).

Ask someone who's been in the industry for a long time and has worked with long-lived applications: how many application and UI layers have come and gone while a database has lived on? How hard is it to tune and refactor a database when there are 4 or 5 different persistence layers generating SQL to get at the data? You can't change anything! ORMs or any code that generates SQL lock your database in stone.

Sql Solutions


Solution 1 - Sql

I think it comes down to how complicated the "logic" of the application is, and where you have implemented it. If all your logic is in stored procedures, and all your application does is call those procedures and display the results, then developing entity objects is indeed a waste of time. But for an application where the objects have rich interactions with one another, and the database is just a persistence mechanism, there can be value to having those objects.

So, I'd say there is no one-size-fits-all answer. Developers do need to be aware that, sometimes, trying to be too OO can cause more problems than it solves.

Solution 2 - Sql

Theory says that highly cohesive, loosely coupled implementations are the way forward.

So I suppose you are questioning that approach, namely separating concerns.

Should my aspx.cs file be interacting with the database, calling a sproc, and understanding IDataReader?

In a team environment, especially where you have less technical people dealing with the aspx portion of the application, I don't need these people being able to "touch" this stuff.

Separating my domain from my database protects me from structural changes in the database, surely a good thing? Sure database efficacy is absolutely important, so let someone who is most excellent at that stuff deal with that stuff, in one place, with as little impact on the rest of the system as possible.

Unless I am misunderstanding your approach, one structural change in the database could have a large impact area with the surface of your application. I see that this separation of concerns enables me and my team to minimise this. Also any new member of the team should understand this approach better.

Also, your approach seems to advocate the business logic of your application to reside in your database? This feels wrong to me, SQL is really good at querying data, and not, imho, expressing business logic.

Interesting thought though, although it feels one step away from SQL in the aspx, which from my bad old unstructured asp days, fills me with dread.

Solution 3 - Sql

One reason - separating your domain model from your database model.

What I do is use Test Driven Development so I write my UI and Model layers first and the Data layer is mocked, so the UI and model is build around domain specific objects, then later I map these objects to what ever technology I'm using the the Data Layer. Its a bad idea to let the database structure determine the design of your application. Where possible write the app first and let that influence the structure of your database, not the other way around.

Solution 4 - Sql

For me it boils down to I don't want my application to be concerned with how the data is stored. I'll probably get slapped for saying this...but your application is not your data, data is an artifact of the application. I want my application to be thinking in terms of Customers, Orders and Items, not a technology like DataSets, DataTables and DataRows...cuz who knows how long those will be around.

I agree that there is always a certain amount of coupling, but I prefer that coupling to reach upwards rather than downwards. I can tweak the limbs and leaves of a tree easier than I can alter it's trunk.

I tend to reserve sprocs for reporting as the queries do tend to get a little nastier than the applications general data access.

I also tend to think with proper unit testing early on that scenario's like that one column not being persisted is likely not to be a problem.

Solution 5 - Sql

Eric, You are dead on. For any really scalable / easily maintained / robust application the only real answer is to dispense with all the garbage and stick to the basics.

I've followed a similiar trajectory with my career and have come to the same conclusions. Of course, we're considered heretics and looked at funny. But my stuff works and works well.

Every line of code should be looked at with suspicion.

Solution 6 - Sql

I would like to answer with an example similar to the one you proposed.

On my company I had to build a simple CRUD section for products, I build all my entities and a separate DAL. Later another developer had to change a related table and he even renamed several fields. The only file I had to change to update my form was the DAL for that table.

What (in my opinion) entities brings to a project is:

Ortogonality: Changes in one layer might not affect other layers (off course if you make a huge change on the database it would ripple through all the layers but most small changes won't).

Testability: You can test your logic with out touching your database. This increases performance on your tests (allowing you to run them more frequently).

Separation of concerns: In a big product you can assign the database to a DBA and he can optimize the hell out of it. Assign the Model to a business expert that has the knowledge necessary to design it. Assign individual forms to developers more experienced on webforms etc..

Finally I would like to add that most ORM mappers support stored procedures since that's what you are using.

Cheers.

Solution 7 - Sql

I think you may be "biting off more than you can chew" on this topic. Ted Neward was not being flippant when he called it the "Vietnam of Computer Science".

One thing I can absolutely guarantee you is that it will change nobody's point of view on the matter, as has been proven so often on innumerable other blogs, forums, podcasts etc.

It's certainly ok to have open disucssion and debate about a controversial topic, it's just this one has been done so many times that both "sides" have agreed to disagree and just got on with writing software.

If you want to do some further reading on both sides, see articles on Ted's blog, Ayende Rahein, Jimmy Nilson, Scott Bellware, Alt.Net, Stephen Forte, Eric Evans etc.

Solution 8 - Sql

@Dan, sorry, that's not the kind of thing I'm looking for. I know the theory. Your statement "is a very bad idea" is not backed up by a real example. We are trying to develop software in less time, with less people, with less mistakes, and we want the ability to easily make changes. Your multi-layer model, in my experience, is a negative in all of the above categories. Especially with regards to making the data model the last thing you do. The physical data model must be an important consideration from day 1.

Solution 9 - Sql

I found your question really interesting.
Usually I need entities objects to encapsulate the business logic of an application. It would be really complicated and inadequate to push this logic into the data layer.
What would you do to avoid these entities objects? What solution do you have in mind?

Solution 10 - Sql

Entity Objects can facilitate cacheing on the application layer. Good luck caching a datareader.

Solution 11 - Sql

There are other good reasons for entity objects besides abstraction and loose coupling. One of the things I like most is the strong typing that you can't get with a DataReader or a DataTable. Another reason is that when done well, proper entity classes can make the code more maintanable by using first-class constructs for domain-specific terms that anyone looking at the code is likely to understand rather than a bunch of strings with field names in them used for indexing a DataRow. Stored procedures are really orthogonal to the use of an ORM since a lot of mapping frameworks give you the ability to map to sprocs.

I wouldn't consider sprocs + datareaders a substitute for a good ORM. With stored procedures, you're still constrained by, and tightly-coupled to, the procedure's type signature, which uses a different type system than the calling code. Stored procedures can be subject to modification to acommodate additional options and schema changes. An alternative to stored procedures in the case where the schema is subject to change is to use views--you can map objects to views and then re-map views to the underlying tables when you change them.

I can understand your aversion to ORMs if your experience mainly consists of Java EE and CSLA. You might want to have a look at LINQ to SQL, which is a very lightweight framework and is primarily a one-to-one mapping with the database tables but usually only needs minor extension for them to be full-blown business objects. LINQ to SQL can also map input and output objects to stored procedures' paramaters and results.

The ADO.NET Entity framework has the added advantage that your database tables can be viewed as entity classes inheriting from each other, or as columns from multiple tables aggregated into a single entity. If you need to change the schema, you can change the mapping from the conceptual model to the storage schema without changing the actual application code. And again, stored procedures can be used here.

I think that more IT projects in enterprises fail because of unmaintainability of the code or poor developer productivity (which can happen from, e.g., context switching between sproc-writing and app-writing) than scalability problems of an application.

Solution 12 - Sql

We should also talk about the notion what entities really are. When I read through this discussion, I get the impression that most people here are looking at entities in the sense of an Anemic Domain Model. A lot of people are considering the Anemic Domain Model as an antipattern!

There is value in rich domain models. That is what Domain Driven Design is all about. I personally believe that OO is a way to conquer complexity. This means not only technical complexity (like data-access, ui-binding, security ...) but also complexity in the business domain!

If we can apply OO techniques to analyze, model, design and implement our business problems, this is a tremendous advantage for maintainability and extensibility of non-trivial applications!

There are differences between your entities and your tables. Entities should represent your model, tables just represent the data-aspect of your model!

It is true that data lives longer than apps, but consider this quote from David Laribee: Models are forever ... data is a happy side effect.

Some more links on this topic:

Solution 13 - Sql

Really interesting question. Honestly I can not prove why entities are good. But I can share my opinion why I like them. Code like

void exportOrder(Order order, String fileName){...};

is not concerned where order came from - from DB, from web request, from unit test, etc. It makes this method more explicitly declare what exactly it requires, instead of taking DataRow and documenting which columns it expects to have and which types they should be. Same applies if you implement it somehow as stored procedure - you still need to push record id to it, while it not necessary should be present in DB.

Implementation of this method would be done based on Order abstraction, not based on how exactly it is presented in DB. Most of such operations which I implemented really do not depend on how this data is stored. I do understand that some operations require coupling with DB structure for perfomance and scalability purposes, just in my experience there are not too much of them. In my experience very often it is enough to know that Person has .getFirstName() returning String, and .getAddress() returning Address, and address has .getZipCode(), etc - and do not care which tables are involed to store that data.

If you have to deal with such problems as you described, like when additional column breaks report perfomance, then for your tasks DB is a critical part, and you indeed should be as close as possible to it. While entities can provide some convenient abstractions they can hide some important details as well.

Scalability is interesting point here - most of websites which require enormous scalability (like facebook, livejournal, flickr) tend to use DB-ascetic approach, when DB is used as rare as possible and scalability issues are solved by caching, especially by RAM usage. http://highscalability.com/ has some interesting articles on it.

Solution 14 - Sql

I would also like to add to Dan's answer that separating both models could enable your application to be run on different database servers or even database models.

Solution 15 - Sql

What if you need to scale your app by load balancing more than one web server? You could install the full app on all web servers, but a better solution is to have the web servers talk to an application server.

But if there aren't any entity objects, they won't have very much to talk about.

I'm not saying that you shouldn't write monoliths if its a simple, internal, short life application. But as soon as it gets moderately complex, or it should last a significant amount of time, you really need to think about a good design.

This saves time when it comes to maintaining it.

By splitting application logic from presentation logic and data access, and by passing DTOs between them, you decouple them. Allowing them to change independently.

Solution 16 - Sql

You might find this post on comp.object interesting.

I'm not claiming to agree or disagree but it's interesting and (I think) relevant to this topic.

Solution 17 - Sql

A question: How do you handle disconnected applications if all your business logic is trapped in the database?

In the type of Enterprise application I'm interested in, we have to deal with multiple sites, some of them must be able to function in a disconnected state.
If your business logic is encapsulated in a Domain layer that is simple to incorporate into various application types -say, as a dll- then I can build applications that are aware of the business rules and are able, when necessary, to apply them locally.

In keeping the Domain layer in stored procedures on the database you have to stick with a single type of application that needs a permanent line-of-sight to the database.

It's ok for a certain class of environments, but it certainly doesn't cover the whole spectrum of Enterprise applications.

Solution 18 - Sql

@jdecuyper, one maxim I repeat to myself often is "if your business logic is not in your database, it is only a recommendation". I think Paul Nielson said that in one of his books. Application layers and UI come and go, but data usually lives for a very long time.

How do I avoid entity objects? Stored procedures mostly. I also freely admit that business logic tends to reach through all layers in an application whether you intend it to or not. A certain amount of coupling is inherent and unavoidable.

Solution 19 - Sql

I have been thinking about this same thing a lot lately; I was a heavy user of CSLA for a while, and I love the purity of saying that "all of your business logic (or at least as much as is reasonably possible) is encapsulated in business entities".

I have seen the business entity model provide a lot of value in cases where the design of the database is different than the way you work with the data, which is the case in a lot of business software.

For example, the idea of a "customer" may consist of a main record in a Customer table, combined with all of the orders the customer has placed, as well as all the customer's employees and their contact information, and some of the properties of a customer and its children may be determined from lookup tables. It's really nice from a development standpoint to be able to work with the Customer as a single entity, since from a business perspective, the concept of Customer contains all of these things, and the relationships may or may not be enforced in the database.

While I appreciate the quote that "if your business rule is not in your database, it's only a suggestion", I also believe that you shouldn't design the database to enforce business rules, you should design it to be efficient, fast and normalized.

That said, as others have noted above, there is no "perfect design", the tool has to fit the job. But using business entities can really help with maintenance and productivity, since you know where to go to modify business logic, and objects can model real-world concepts in an intuitive way.

Solution 20 - Sql

Eric,

No one is stopping you from choosing the framework/approach that you would wish. If you are going to go the "data driven/stored procedure-powered" path, then by all means, go for it! Especially if it really, really helps you deliver your applications on-spec and on-time.

The caveat being (a flipside to your question that is), ALL of your business rules should be on stored procedures, and your application is nothing more than a thin client.

That being said, same rules apply if you do your application in OOP : be consistent. Follow OOP's tenets, and that includes creating entity objects to represent your domain models.

The only real rule here is the word consistency. Nobody is stopping you from going DB-centric. No one is stopping you from doing old-school structured (aka, functional/procedural) programs. Hell, no one is stopping anybody from doing COBOL-style code. BUT an application has to be very, very consistent once going down this path, if it wishes to attain any degree of success.

Solution 21 - Sql

I'm really not sure what you consider "Enterprise Applications". But I'm getting the impression you are defining it as an Internal Application where the RDBMS would be set in stone and the system wouldn't have to be interoperable with any other systems whether internal or external.

But what if you had a database with 100 tables which equate to 4 Stored Procedures for each table just for basic CRUD operations that's 400 stored procedures which need to be maintained and aren't strongly-typed so are susceptible to typos nor can be Unit Tested. What happens when you get a new CTO who is an Open Source Evangelist and wants to change the RDBMS from SQL Server to MySql?

A lot of software today whether Enterprise Applications or Products are using SOA and have some requirements for exposing Web Services, at least the software I am and have been involved with do. Using your approach you would end up exposing a Serialized DataTable or DataRows. Now this may be deemed acceptable if the Client is guaranteed to be .NET and on an internal network. But when the Client is not known then you should be striving to Design an API which is intuitive and in most cases you would not want to be exposing the Full Database schema. I certainly wouldn't want to explain to a Java developer what a DataTable is and how to use it. There's also the consideration of Bandwith and payload size and serialized DataTables, DataSets are very heavy.

There is no silver bullet with software design and it really depends on where the priorities lie, for me it's in Unit Testable code and loosely coupled components that can be easily consumed be any client.

just my 2 cents

Solution 22 - Sql

I'd like to offer another angle to the problem of distance between OO and RDB: history.

Any software has a model of reality that is to some degree an abstraction of reality. No computer program can capture all the complexities of reality, and programs are written just to solve a set of problems from reality. Therefore any software model is a reduction of reality. Sometimes the software model forces reality to reduce itself. Like when you want the car rental company to reserve any car for you as long as it is blue and has alloys, but the operator can't comply because your request won't fit in the computer.

RDB comes from a very old tradition of putting information into tables, called accounting. Accounting was done on paper, then on punch cards, then in computers. But accounting is already a reduction of reality. Accounting has forced people to follow its system so long that it has become accepted reality. That's why it is relatively easy to make computer software for accounting, accounting has had its information model, long before the computer came along.

Given the importance of good accounting systems, and the acceptance you get from any business managers, these systems have become very advanced. The database foundations are now very solid and noone hesitates about keeping vital data in something so trustworthy.

I guess that OO must have come along when people have found that other aspects of reality are harder to model than accounting (which is already a model). OO has become a very successful idea, but persistance of OO data is relatively underdeveloped. RDB/Accounting has had easy wins, but OO is a much larger field (basically everything that isn't accounting).

So many of us have wanted to use OO but we still want safe storage of our data. What can be safer than to store our data the same way as the esteemed accounting system does? It is an enticing prospects, but we all run into the same pitfalls. Very few have taken the trouble to think of OO persistence compared to the massive efforts by the RDB industry, who has had the benefit of accounting's tradition and position.

Prevayler and db4o are some suggestions, I'm sure there are others I haven't heard of, but none have seemed to get half the press as, say, hibernation.

Storing your objects in good old files doesn't even seem to be taken seriously for multiuser applications, and especially web applications.

In my everyday struggle to close the chasm between OO and RDB I use OO as much as possible but try to keep inheritance to a minimum. I don't often use SPs. I'll use the advanced query stuff only in aspects that look like accounting.

I'll be happily supprised when the chasm is closed for good. I think the solution will come when Oracle launches something like "Oracle Object Instance Base". To really catch on, it will have to have a reassuring name.

Solution 23 - Sql

Not a lot of time at the moment, but just off the top of my head...

The entity model lets you give a consistent interface to the database (and other possible systems) even beyond what a stored procedure interface can do. By using enterprise-wide business models you can make sure that all applications affect the data consistently which is a VERY important thing. Otherwise you end up with bad data, which is just plain evil.

If you only have one application then you don't really have an "enterprise" system, regardless of how big that application or your data are. In that case you can use an approach similar to what you talk about. Just be aware of the work that will be needed if you decide to grow your systems in the future.

Here are a few things that you should keep in mind (IMO) though:

  1. Generated SQL code is bad (exceptions to follow). Sorry, I know that a lot of people think that it's a huge time saver, but I've never found a system that could generate more efficient code than what I could write and often the code is just plain horrible. You also often end up generating a ton of SQL code that never gets used. The exception here is very simple patterns, like maybe lookup tables. A lot of people get carried away on it though.
  2. Entities <> Tables (or even logical data model entities necessarily). A data model often has data rules that should be enforced as closely to the database as possible which can include rules around how table rows relate to each other or other similar rules that are too complex for declarative RI. These should be handled in stored procedures. If all of your stored procedures are simple CRUD procs, you can't do that. On top of that, the CRUD model usually creates performance issues because it doesn't minimize round trips across the network to the database. That's often the biggest bottleneck in an enterprise application.

Solution 24 - Sql

Sometimes, your application and data layer are not that tightly coupled. For example, you may have a telephone billing application. You later create a separate application which monitors phone usage to a) better advertise to you b) optimise your phone plan.

These applications have different concerns and data requirements (even the data is coming out of the same database), they would drive different designs. Your code base can end up an absolute mess (in either application) and a nightmare to maintain if you let the database drive the code.

Solution 25 - Sql

Applications that have domain logic separated from the data storage logic are adaptable to any kind of data source (database or otherwise) or UI (web or windows(or linux etc.)) application.

Your pretty much stuck in your database, which isn't bad if your with a company who is satisfied with the current database system your using. However, because databases evolve overtime there might be a new database system that is really neat and new that your company wants to use. What if they wanted to switch to a web services method of data access (like Service Orientated architecture sometime does). You might have to port your stored procedures all over the place.

Also the domain logic abstracts away the UI, which can be more important in large complex systems that have ever evolving UIs (especially when they are constantly searching for more customers).

Also, while I agree that there is no definitive answer to the question of stored procedures versus domain logic. I'm in the domain logic camp (and I think they are winning over time), because I believe that elaborate stored procedures are harder to maintain than elaborate domain logic. But that's a whole other debate

Solution 26 - Sql

I think that you are just used to writing a specific kind of application, and solving a certain kind of problem. You seem to be attacking this from a "database first" perspective. There are lots of developers out there where data is persisted to a DB but performance is not a top priority. In lots of cases putting an abstraction over the persistence layer simplifies code greatly and the performance cost is a non-issue.

Whatever you are doing, it's not OOP. It's not wrong, it's just not OOP, and it doesn't make sense to apply your solutions to every othe problem out there.

Solution 27 - Sql

Interesting question. A couple thoughts:

  1. How would you unit test if all of your business logic was in your database?
  2. Wouldn't changes to your database structure, specifically ones that affect several pages in your app, be a major hassle to change throughout the app?

Solution 28 - Sql

Good Question!

One approach I rather like is to create an iterator/generator object that emits instances of objects that are relevant to a specific context. Usually this object wraps some underlying database access stuff, but I don't need to know that when using it.

For example,

> An AnswerIterator object generates AnswerIterator.Answer objects. Under the hood it's iterating over a SQL Statement to fetch all the answers, and another SQL statement to fetch all related comments. But when using the iterator I just use the Answer object that has the minimum properties for this context. With a little bit of skeleton code this becomes almost trivial to do.

I've found that this works well when I have a huge dataset to work on, and when done right, it gives me small, transient objects that are relatively easy to test.

It's basically a thin veneer over the Database Access stuff, but it still gives me the flexibility of abstracting it when I need to.

Solution 29 - Sql

The objects in my apps tend to relate one-to-one to the database, but I'm finding using Linq To Sql rather than sprocs makes it much easier writing complicated queries, especially being able to build them up using the deferred execution. e.g. from r in Images.User.Ratings where etc. This saves me trying to work out several join statements in sql, and having Skip & Take for paging also simplifies the code rather than having to embed the row_number & 'over' code.

Solution 30 - Sql

Why stop at entity objects? If you don't see the value with entity objects in an enterprise level app, then just do your data access in a purely functional/procedural language and wire it up to a UI. Why not just cut out all the OO "fluff"?

Solution 31 - Sql

I've been speculating whether relational databases driven by SQL aren't a bit at cross-purposes with these frameworks that use the ActiveRecord paradigm. One fundamental problem is that AR (and good OO design, for that matter), drive us to decompose logic; and SQL simply isn't amenable to statement decomposition.

I wonder if using an isam persistence model for the database wouldn't be a better idea; a better impedance match to OO; more agreement on the basic idea of data as tables; more consistent with the conventional artifacts of OO persistence. One good example is that FKs and their associations can be more explicit.

RoR has a rep for being a database slug, and I suspect this issue is a large part of the reason.

Has anyone tried to use an isam database for an ActiveRecord implementation?

Solution 32 - Sql

I'm puzzled about the "lock your database in stone" argument in favor of stored procs. I can take my ActiveRecord model and move it from MySQL to Postgres to SQLite, thank you very much. I couldn't do that with anything stored proc-based unless I wanted to rewrite them all.

I assume you mean that you're locking your database schema in stone. That argument is more interesting. To some extent I think it's argued from the perspective of an application with minimal unit tests and code coverage - the applications where you don't change your code out of sheer fear you're going to break "something."

My experience with stored-proc-based systems is minimal though. I'm curious, in large applications, how do you manage all of the data relations? On one page I show a product with a picture. On another page I show a product and the user who created it. On another page I show a product and the comments about it. On another page I need to show that product with no picture joined with a table of specifications about it.... etc. etc. I have a data model with a lot of relationships. I assume you don't write a stored proc for every combination? The DRY principle is the one that I worry about. How many queries am I writing where I'm re-left-joining (effectively re-coding) my relationships? And, while we're talking about locking the schema, how many stored procs am I going to need to re-write?

Solution 33 - Sql

I think Entity objects are over emphasized in enterprise solution nowadays. They cannot contain business layer functions, since those belong in the Services in the service layer, or UI layer for UI specific functions, etc. Entity objects do allow the designers to think better in terms of designing the application well, but they do not necessarily have to contain all the application logic in them. They can be dumb objects that follow certain rules and interfaces and can be used to build other layers on top of them and act as data carriers between the layers.

Solution 34 - Sql

I don't see what entity objects have to do with scalability, you're probably talking about using ORM tools, in this case I agree with you.

I'm very interested in scalability. Entity objects are never in your way of building a highly scalable application but you have to do it the right way, in other words you need a hand-written DAL, as opposed to a DAL generated using some ORM. Actually this is why I don't like ORMs, there's nothing that beats a hand-written DAL, I also don't use LINQ as I read in many places that it has a big overhead. I tweak every query in my apps and create the needed indexes, I don't let some ORM generate the code for me.

I don't agree with you that Entity objects make the code harder to maintain, actually the whole purpose of this architecture is to make it to easier to maintain and modify your code and this is what I see in practice, I wrote spaghetti code for a long time (didn't use 3-tier or n-tier architectures) so I know what I'm talking about.

Also Entity objects are needed for caching, I wonder how you cache the data in your applications if you don't use Entity objects, do you use datasets or datatables?

Solution 35 - Sql

To be honest, I think if you can get away with data over forms, go for it! But the minute things get sticky, you would be wise to learn how to strucure things to gain some simplicity.

I haven't read all the answers but common points thigns get sticky:

  • Code is repeated Buggy, unstable code
  • HUGE classes loaded with static
    classes
  • Logic is everywhere and
    anywhere ( aspx, static methods, sql, triggers )
  • Interacting with multiple
    objects, sharing common features will proove difficult

As far as domain vs data. I think Data will always win, functionality is ALL that matters to the client. It has to work. I'm a proponent of refactoring when you can if you break a principle to deliver something that works on time.. you can always go back and refactor.

Also a quick word on debugger, complex domain. I have seem many people get scared because they hit interfaces, don't understand all the acrobatics that are possible in very advanced OOP/polymorphic code. I TOTALLY understand, sometimes you can get lost and deterred. This is why they make tools.. I'm less scared of a solution with 1000 files than a humongous method with 1000 lines. And I have seen both believe it or not.

There is a happy medium also if your'e willing to write tests you won't worry so much about the debugger and steppign through code. If you get good tools and find a balance you'll solve all the problems above and also keep things simple enough to get around.

Solution 36 - Sql

Well, I want to thank you for a fascinating discussion. I'm working my way through Stephen Walther's ASP.NET MVC Framework Unleashed, and I'm enjoying it as a sort of philosophical exercise, but I'm somewhat aghast at the amount of plumbing code his approach entails. Now that's not inherent in using an ORM -- Rails prides itself on freeing you from such housekeeping matters, but I'm really wrestling with whether I think it's worth it to have to write and maintain a separate Record class that can be used by the application and an EntityRecord class that maps the Record class to the database.

His gloss on the benefits are that you end up with a testable application where the tests run quickly, but frankly I'd rather trade some testing speed for executing code that's actually in the application. I think by the time you're spending your day slogging along and copying properties around so that your tests can run quickly, the testing tail has begun to wag the programmer dog -- who'd rather be chasing rabbits or having a nap in front of the fire.

The second cited benefit is that you can take your application and run it on a different database. Yeah, OK, maybe if you're writing something like a SalesForce for resale or something, that might be a goal, but for 90% or more of the applications out there, so what? I'm reminded of the neighbor in "It's a Wonderful Life" who gave George a jar of money and said: "I was saving this for a divorce in case I ever got a husband." Don't write it till you need it.

On the other hand, I do have a practical objection to stored procedures. It's not necessarily inherent in their use but more a feature of some of the brain-dead shops I've worked in: they sometimes put a DBA in the way of the code I want to write. I like to think I'm not a cowboy, but on the opposite end I don't like to have to convene a UN committee to add a field to a table.

Solution 37 - Sql

One question: what if your data source were a web service? I write applications using only distributed data via web services. Am I expected to write that using a different paradigm than if my data source were an RDBMS?

I'm not asking what do you do if you switch from RDBMS to web services (because, in an internal shop, that's unlikely), I'm asking what do you do when the data comes from web services from the start?

Is your programming model drastically different than if it'd have been an RDBMS? If it is, you need to consider maintainability. My developers would have an awful time, if every app they jump into is programmed using different paradigms.

Solution 38 - Sql

Some logic such as operations related to sets tend to be better represented in stored procedures. Yet there are times when an algorithm that has many branches and conditions is best represented in programming code. A grammar used for parsing commands that supports a runtime function for scripting actions can not be implemented in stored procedures.

The one weakness I see with stored procedures is that you tend to get a new stored procedure for new list or grid in the application. Or worse, one stored proc to rule them all, 10 parameters and case statements to further define them. In addition, the stored proc's become HUGE and even more difficult to debug.

All that said, I'm with you that an ORM may get in the way many times for the reasons you sited. In the end, it boils down to how you rule the technology.

Solution 39 - Sql

I have just recently stumbled upon this question. Realizing that this question has been pretty old, and that there are many answers, I understand that my response many not be looked at even once. Still, I would like to leave my comments here.

I would look at this question in three aspects. But before that, I have to state: 8 out of 10, a programmer coming from the imperative/OO-design world (C/C++, JAVA, C#, etc.) does not know how to write optimized, efficient SQL code. From my experience, it is rare to have someone who can do well at both application development and SQL development.

With that said, I would like to give three aspects for looking at this question.

First: Seperation of concern not according to program, but organizational hierarchy.

Frankly, there are many kinds of "enterprise" in this world, and each one has its own organizational hierarchy, varied by history and philosophy. In one particular company I have worked with, the programmers cannot modify or develop upon the database. They can read and consume the database API (i.e. stored procedure in SQL server), but not directly reading the database, and cannot write any query or stored procedure.

Any database request (data or functionality) has to be delegated to another role: Data Architect. S/he would be the one dealing with the development, and possibly the maintenance, of the database. (Although, maintenance part should be the job of DB Admin.) In such environment, the stored procedure is only consumable; even the source of the stored procedure in the PROD environment would be encrypted, and programmers are not allowed to see the SP's source.

However, in some other organizations, programmers are expect to do development of all aspects, including the interface, middleware and data storage. This is the majority case.

In these two scenarios (albeit the first one is rather extreme but real), it would affect how you view the author's question. In the first case, I would say the author would agree upon the Data Architect's role, but any non-database programmer in the organization would greatly despise. In the second case, however, because of my previous disclaimer about many developers not knowing how to write good SQL codes (and generally not liking to deal with it either), it is only natural to opt for the simpler approach: ORM.

Second: The role of database: pure data storage up to different interpretations, or provider of predefined schemes of information?

Definition: "data" is raw, while "information" is interpreted.

In many real-world situations, the database is only regarded as a pure data storage. It may contain data logic (e.g. integrity of relational data), but it does not contain business logic (e.g. any formula applied to the data not because of the data's nature, but because of this is how this particular section of business works).

In the aforementioned organization I have worked with, in one database, it stores various financial information of a customer. At first, there is only one formula to calculate an index regarding the customer's financial health, and this formula, along with the customer's status based on the formula, is stored within a stored procedure of the database. As the government kept changing rules in the past few years, however, many more formulas have been created to accommodate with the government.

Hence the problem: each branch in the organization, with its own distinct programming department (and little inter-organizational business between each branch), uses the same set of financial data, but with different formulas and operations.

In this case, the original model of storing the formula in the database brought a maintenance and office politics hell. At first, the Data Architect would create typed stored procedures to accommodate the changes, but soon the organization started to have trouble with this model. The HQ had determined that, each branch would maintain its own set of formulas, and nobody except that branch should know the owned formulas. The Data Architect, in this case, knew all the formulas, and that did not sit well with the HQ's policy. The quick pace in changing the formulas has also brought efficiency problem for testing between each branch, because every formula tweak had to go through the Data Architect.

In this case, the organization faces a rather profound question: should the database serve the interpreted information, or should it only serve the data without any meaning?

That is a good way to jump into the third aspect.

Third: Ideological warfare: single- vs multi-purpose, and monolithic vs modular?

The aforementioned example is a clear demonstration of data being used in a multipurpose fashion. The data, while remaining the same for every branch, had different interpretation and usage in different scenarios. And here is my take.

Does your database store and serve data that, in nature, has multipurpose, and performance is not a big concern?

If yes, then I would say, the database should be reduced to only serving the data, and any logic not related to data integrity should be stored somewhere else. This is more of a modular approach: others can plug whatever operations and interpretations they would like to have, but not in SQL.

If any part of the question is a negative (i.e. it's single purpose, or performance is a big big concern), and assuming that no office politics are in the way, then I would say, monolithic approach of putting the majority stuff into the database is fine. Preferred or not, that becomes an ideological choice.

I have the impression that the author, upon writing and editing the question, supports the opinion of monolithic approach. I do consider this case-by-case, but generally, I am in such approach:

  • Simple CRUD and nothing else: ORM
  • Formulas and Workflows based on data: middleware (like CSLA), not in database (unless performance is a concern)
  • Reporting: definitely in database (for performance reason)

Above is my 2 cents.

Solution 40 - Sql

Software that solves a large problem in a very generic way applicable to lots of real situations necessarily comes with a performance cost in and of itself. It takes code to handle all that genericity and code takes time to run.

Also, descending through a layer of abstraction always reveals that some things cost a little and some things cost a lot and those differences were hidden by the abstraction. The isolation the abstraction gives the developer from what ever is beneath always causes the developer to casually introduce more expensive operations than were necessary.

Whatever else can be said about this question, from a performance only perspective and from a scaling performance only perspective avoiding the double whammy caused by the extra isolation from the realities of ones own database is going to pay off in performance.

I currently spend my working days battling exactly the performance problems caused by these issues and they are terrible monsters to fight.

Solution 41 - Sql

Each approach has strengths. Their appropriateness for a particular problem must be judged on a case-by-case basis.

I wholeheartedly believe that entity- (and hence object oriented-) designs simplify otherwise complex business logic, as others have noted. But in my opinion, the greatest strength of an entity-based designs is modularity via well-defined inputs and outputs, which is easier to achieve outside of the database and with an object oriented model. I'll elaborate below.


I'm a Linux user. One point in the Unix philosophy is that developers should "write programs to handle text streams, because that is a universal interface". This is true of Linux, because it is very text-centric. You can chain unrelated processes together to achieve something new, like grep ^col /var/log/bim-sync | sed 's/.*alt:\([0-9]\{1,\}\).*/\1/g | xargs -I replstr bim-transcoder replstr. These programs are completely ignorant of each other, and yet can be easily joined together to achieve a new purpose. The reason this is possible is because you (the author of the "glue") know the input and output formats of each process.

Now, I don't believe that text streams are appropriate everywhere. Text streams are common, but not universal. My development philosophy is to "write programs with well-defined inputs and outputs". The input/output I'm talking about here is not necessarily standard input/output and not necessarily textual - it could be the arguments to a command line program, raw bytes sent over a network socket, an in memory data structure passed between layers of code, etc, etc. Thinking about software in terms of ye old Input-Process-Output "black boxes" allows you to compose your application like the command-line utilities in Unix - independently with a thin layer of glue that joins them together.

As an example, say you're writing software for the Australian New South Wale's Births, Deaths and Marriages. When a registration of a child's birth comes in, an operator enters the details, scans the signed forms and hits the Submit button. Hitting the Submit button issues a RegisterBirth command. The software validates the details of the command (date and time of birth, hospital, etc), and emits a BirthRegistered event. This event includes many details on the birth, such as the delivering doctor, whether it was a natural birth or by C-section, if it was a C-section, whether it was an emergency, who the biological parents are, etc, etc. A lot of different code can "plug into" this event. For example, one piece of code could issue a simple insert into person... SQL statement, while another piece of code could issue a series Neo4j cypher commands to store the the newborn baby and the relationship to its biological parents. This second piece of code would allow an extremely fast query of hierarchical "family tree" data. Such queries would be slower (and more complicated) in an SQL database, regardless of whether you use adjacency lists or nested sets. Still another piece of code could update the statistics on the number of emergency C-sections that month, which is stored in an XML file for historical reasons.

The modularity doesn't stop with abstracting the persistence mechanism. For example, you could write a FastCGI "glue layer" to web-enable your application: an "input" HTTP request is accepted by your web server, which emits a "output" FastCGI request to your "glue layer". Your FastCGI "glue layer" accepts this as input and transforms it into an output form appropriate to your application. Your application accepts the input command and emits events or errors, which can be picked up by other "glue layers" (such as the SQL and Neo4j examples given above).

The modularity can continue in almost any direction. You could have a command-line interface, or a GUI interface. You could create a comprehensive, automated test suite. You could open your application up to being scripted by third parties. A lot of the concepts here relate to Domain Driven Design, Command Query Responsibility Segregation and Event Sourcing, three inter-related patterns which I've found to be incredible powerful.

When using an entity-based approach, there are many relates architectural. For example, there is the Onion Architecture by Jeffrey Palermo and Ports and Adapters (or Hexagonal Architecute) by Alistair Cockburn. What all these architectures have in common is modularity and abstraction via defined inputs and outputs, regardless of whether those input/output boundaries are within a single program, or whether they span many processes and even networks.

An entity-based approach provides modularity and flexibility. However, there are downsides to this approach, three of which are significant:

  1. Firstly, the initial investment is high. This means that such an approach doesn't make sense for projects that are small in scope.

  2. Secondly, the amount of glue code you have to write can become large. Writing glue code can be tedious, but it can also be rewarding. For example, say your application integrates loosely with PostgreSQL as it's storage backend. When the company directors decide that the application should support Microsoft SQL Server, it's very satisfying (and builds team morale) when the goal is reached before the due date and below budget.

  3. Thirdly, my experience has taught me that an entity-based approach can be worse than a simple SQL solution, depending on the expertise of those implementing it. E.g., if the entity-first approach is full of getters and setters which are nothing more than in-memory representations of the database tables, you can be sure that the problem has not been thought out. It's cases like these which understandably leave developers wondering "Why don't we just write SQL?"


References:

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionEric Z BeardView Question on Stackoverflow
Solution 1 - SqlKristopher JohnsonView Answer on Stackoverflow
Solution 2 - SqlnachojammersView Answer on Stackoverflow
Solution 3 - SqlDanView Answer on Stackoverflow
Solution 4 - SqlWebjediView Answer on Stackoverflow
Solution 5 - SqlNotMeView Answer on Stackoverflow
Solution 6 - SqlJulio CésarView Answer on Stackoverflow
Solution 7 - SqlAshView Answer on Stackoverflow
Solution 8 - SqlEric Z BeardView Answer on Stackoverflow
Solution 9 - SqljdecuyperView Answer on Stackoverflow
Solution 10 - SqlFlySwatView Answer on Stackoverflow
Solution 11 - SqlMark CidadeView Answer on Stackoverflow
Solution 12 - SqljbandiView Answer on Stackoverflow
Solution 13 - SqlPavel FeldmanView Answer on Stackoverflow
Solution 14 - SqlLars TruijensView Answer on Stackoverflow
Solution 15 - SqltgmdbmView Answer on Stackoverflow
Solution 16 - SqlHamish SmithView Answer on Stackoverflow
Solution 17 - SqlRenaud BompuisView Answer on Stackoverflow
Solution 18 - SqlEric Z BeardView Answer on Stackoverflow
Solution 19 - SqlGuy StarbuckView Answer on Stackoverflow
Solution 20 - SqlJon LimjapView Answer on Stackoverflow
Solution 21 - Sqluser17060View Answer on Stackoverflow
Solution 22 - SqlGugeView Answer on Stackoverflow
Solution 23 - SqlTom HView Answer on Stackoverflow
Solution 24 - SqlbillybobView Answer on Stackoverflow
Solution 25 - SqlMark RogersView Answer on Stackoverflow
Solution 26 - SqlOutlaw ProgrammerView Answer on Stackoverflow
Solution 27 - SqlKevin PangView Answer on Stackoverflow
Solution 28 - SqlAllain LalondeView Answer on Stackoverflow
Solution 29 - SqlpeterorumView Answer on Stackoverflow
Solution 30 - SqlPragmatic AgilistView Answer on Stackoverflow
Solution 31 - SqldkretzView Answer on Stackoverflow
Solution 32 - SqlTom LianzaView Answer on Stackoverflow
Solution 33 - SqlAhmadView Answer on Stackoverflow
Solution 34 - SqlWaleed EissaView Answer on Stackoverflow
Solution 35 - Sql5x1llzView Answer on Stackoverflow
Solution 36 - SqlJohn LockwoodView Answer on Stackoverflow
Solution 37 - SqlTravis HesemanView Answer on Stackoverflow
Solution 38 - SqlDavid RobbinsView Answer on Stackoverflow
Solution 39 - SqlTimeSpace TravellerView Answer on Stackoverflow
Solution 40 - Sqluser334911View Answer on Stackoverflow
Solution 41 - SqlmagnusView Answer on Stackoverflow