.Net vs SSIS: What should SSIS be used for?

.NetSql ServerSsis

.Net Problem Overview


If I have the option of using .Net and can do data transformations just fine in .Net, when would I need SSIS? Is there a certain task that SSIS would be better for? Are the added benefits of transparency worth it? Is it just what I am more comfortable with? What are the best practices for determining this?

.Net Solutions


Solution 1 - .Net

good question.

if the amount of data transfer huge? are you processing multiple data files and need transactions (both at file system level and database level)? are you dealing with multiple data sources at different locations (for eg ftp, local file system, database)?

if answers to above are yes then go ahead with ssis. basically .net is cool with small data import / export jobs, but when you have anything more complex, ssis is a definite winner

the other thing which i look at is - is it worth writing .net code when everything is available inside ssis. (dont mistake me - i love coding) however, anything you code, you need to maintain :-)

Solution 2 - .Net

I think project time/budget constraints and the use of a standard tool are some of the biggest arguments for using SSIS. Creating an SSIS package is most of the times way faster than trying to code something similar in .NET.

But with that said, it seems like SSIS have a lot of pain points that sometimes might invalidate this argument. It did for me when developing a solution that needed to run in different environments at many different clients. SSIS simply looked too painful the more I evaluated it for the project. A properly architected .NET solution is easier to deploy, more reliable, more flexible, easier to understand and can also achieve very good performance.

IMHO: consider using SSIS for projects that you only need to deploy to one or maybe two in house SQL Server environments. Otherwise, the .NET approach will quickly become more appealing.

Solution 3 - .Net

My arguments for not using SSIS are:

  • Design greenfield products so that they have RESTful data feeds for reporting and extraction built-in to the project plan and budget, preferably to a standard like OData so that other tools can plug right in.

  • Data feeds should pull and transform from upstream systems and feeds on demand; such that schedule tasks, configuration of scheduled tasks, task runner VMs and staff to run all this unreliable scheduling stuff is negated.

  • RESTful data feeds leverage HTTP caching.

  • Feeds/services/APIs can be moved to elastic-scale cloud easily.

  • SSIS requires finding people with SSIS skills that enjoy doing that stuff for weeks. In my experience, finding and retaining SSIS developers is hard and expensive and the people found tend to be sub-par.

  • SSIS doesn't work well with source control and collaborative work.

  • SSIS doesn't lend itself well to code reuse, unlike microservices and traditional code libraries.

  • SSIS doesn't version easily, unlike a REST service.

  • SSIS doesn't lend itself to modular designs and continuous deployment of many small changes, it tends to be large-batch with scary releases.

  • SSIS promotes the use of stored-procedures which places a lot of demand on SQL which is the hot-spot. Favour designs that place demands on a scaleable, stateless middle tier.

  • The tooling is clunky and unreliable.

  • You're at the mercy of Microsoft's roadmap for SSIS.

  • Consider writing to tables/services that support analysis, reporting and views as soon as the data comes into the application; see Event Sourcing and other application architecture patterns.

  • Never use Excel as a data source; train employees.

  • Code is king.

Ultimately, I see SSIS as a relic of Enterprise IT. I like to ask, "Would Google use SSIS?" How else can the problem be solved? Think outside the box.

Solution 4 - .Net

I guess it depends on what you are doing. SSIS is very powerful, just like old DTS. If you are loading lots of items and expect to have constant change, I would go SSIS all the way. If you are looking to load only a few items and it’s for lots of customers, I would put it in code. I prefer SSIS for in house ETL processes, but I use .Net at client shops when I need to load data from a legacy system into a SQL database. Now as I stated before if you have a lot of transformations and lots of different data silos to load, I think you would be crazy to do this in .Net and I would go SSIS. If you have only a few items to load and it’s for a single application and may be installed as part of an application at various clients, I would go .Net all the way. Just my 2 cents.

Solution 5 - .Net

I have a lot of experience with SSIS from small projects to large, complex ETL. Without going into the details, this is my guidance for you:

  • If you are a DBA and you are not familiar with .NET, or if you are a developer quite familiar with SSIS, then you can use SSIS for small, simple, fairly straightforward extract, transform, load (ETL) tasks.

  • SSIS is very quirky and there are many pitfalls, gotchas, and what might be considered outright bugs. It is extremely powerful if you are intimately familiar.

  • C# now has TPL Dataflow. Simple performance tests put it ahead of SSIS. (eg http://mymemoryleaks.blogspot.cz/2013/10/ssis-vs-tpldataflow.html)

  • If you want to do anything beyond trivial, and if you can use .NET skills, use .NET instead of SSIS.

Solution 6 - .Net

SSIS has many built in ways of doing transformations from different data sources and you can string them together in a way that makes it very customizeable. They have built in optimizations that make them fast.

You can also use .NET to make your own custom transformations to take advantage of the speed and repeatability of an SSIS job.

Solution 7 - .Net

I think main advantage is defining the entire programming construct visually. Any one look at the SSIS package is it pretty much self explainer. The tight integration with the SSIS with SQL allows you to be part of SQL for back up scheduling and huge plus.

As every one explained if you are doing the lot of data manipulation it is good tool. It is free if you have SQL you all set to go and very easy to learn with VS 2008 BIDS

Solution 8 - .Net

Bit late to answer this question but I hope it worth,

SSIS is often misunderstood when compared to programming language. SSIS is a framework whereas C# is a language on .NET Framework. I have extensive experience in handling & developing large data warehousing solution using (MSBI suite) and also had developed large websites (ASP.NET) - so I can't be biased.

SSIS if not used properly can decrease performance by par. SSIS packages have three kind of transformation:

  1. Blocking transformation - which can only pass the data when above transformation is completed fetching all the rows and completed required calculation on it.
  2. Semi-Blocking transformation - Which can pass partial data
  3. Non-Blocking - which process the row as soon as it is ready

SSIS works exceptionally good with non blocking transformation with proper setting on control flow and data flow. I have used it on larger (over 2 TB of data warehouse) and I can guarantee that it was the fastest load experience. You can check Microsoft blog about We Loaded 1TB in 30 Minutes with SSIS, and So Can You

I agree that SSIS degraded performance when dealing with blocking transformation and they should be carried by T-SQL whenever required.

Coming to C#, I accept that SSIS uses .NET framework and data provider to accomplish task. But C#, as a language is bit more logical and must be treated to deal with business logic. For example, If we have to run exe with different parameter based on condition, you can write a package which will consider parameters and then logically decide what parameter needs to be passed to run an exe file. It would be lengthy process to do that in SSIS while I can do that easily in C# because logical thing can be easily done in language instead of a framework.

Now the point here is what is more convenient approach to solve your problem statement. SSIS is a sure winner loading large amount of records loading data from source to destination while C# is perfect for writing logic. Even if you like C#, I won't recommend you to choose for doing ETL (Extract Transform Load) operation on large data warehouse systems.

Solution 9 - .Net

SSIS is generally used for ETL (Extract Transform Load). Specific use cases are the pre-processing of SSAS (SQL Server Analysis Services) cubes; and enhanced extraction using Data Change Capture.

It can do typical automation, including FTP, and email. There is the programming aspect using script-tasks (C# or Visual Basic), so SSIS has functionality beyond it's included controls...

Packages can be programmed to use conditional control-flow path. For example, do a certain task Monday thru Friday, and a different task Saturday & Sundays. Or refuse to perform ETL if certain conditions are not met.

SSIS packages can call other SSIS packages. That keeps the code modular, allowing re-use.

It can work with various Data Sources, and perform simple transformation using the Derived Column control. This is versus doing transformation on the source server (which could be Oracle or Hadoop for example- something you don't have control of with your local SQL Server).

Solution 10 - .Net

As the name suggests, SSIS is an integration system. It can be very difficult in .net to handle connectors to disparate data sources such as excel, teradata, oracle etc and also to live up to the responsibility to gracefully close those connections, garbage collection, handling memory issues.

So, SSIS is out of the box product perfect for scenarios where data not only needs to be pulled from, say, two different sources, but then a series lookups, transformations, merges, derivations and calculations need to be performed before writing it to a target location(be it sql server, a flat file or another db system).

SSIS also has checkpoints where, if the package fails due to any reason, it will pick up from where it left off (it needs to be configured as this is not default behavior).

In addition, SSIS will save you a lot of time because its tasks are reusable and its deployment process is fairly easy to implement and schedule, supported by great event handling.

Solution 11 - .Net

Basically SSIS has many advantages like splitting data transfer from point A to point B in smaller blocks and debug them in individually, able to access SQL Server Tables easily, work on XML data, API calls using c# scripts and saving data on DB, Read DB data and FTP on remote server and many more.
Apart from bunch of already existing BI blocks, you can also create your own customized tasks with its own parameters and outputs.
Hope I was able to add some points to the already existing answers.

Solution 12 - .Net

Day-to-Day Tasks , which are used by a SSIS Deveoper and are relatively easy as compared to .Net can include :

Data Comparison between the tables.

Conditional Splitting,data blocking the data on the basis of some logic.

Data Conversion,look up , merge , unionall , relatively easy to use.

File Handling(Modifying , validations).

Error Handing , Email Alerts.

Containers , FOR/FOReach loops are easy to use.

Posting data on web services is easy using the WebService task.

Checkpointing,Re-runablity of the data loads is easy to handle.

Debugging is easy in ssis - can be done on conatiner lever , package level.

Scripting can also be done , if the task is not available. Also , you can customize your own tasks

Solution 13 - .Net

Whatever folks say in previous answers are correct but I think that the most important aspect of using SSIS instead of coding is to have easy maintenance process and also a reusable product.

Solution 14 - .Net

SSIS is great for BI applications, you can manipulate the data on Stage Table and than make avaiable on DataWarehouse tables to be used for BI.

I can connect on SAP, Oracle to get employee information and make avaiable on PowerBI, QlikView, etc...

Its a nice tool if you know where and why use it. Use ir because its cool you will have troubles.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionjleView Question on Stackoverflow
Solution 1 - .NetRajView Answer on Stackoverflow
Solution 2 - .NetLars FastrupView Answer on Stackoverflow
Solution 3 - .NetLuke PuplettView Answer on Stackoverflow
Solution 4 - .NetJojoView Answer on Stackoverflow
Solution 5 - .NetSentinelView Answer on Stackoverflow
Solution 6 - .NetJeff MartinView Answer on Stackoverflow
Solution 7 - .Netuser171523View Answer on Stackoverflow
Solution 8 - .NetAnuj TripathiView Answer on Stackoverflow
Solution 9 - .NetZorkolotView Answer on Stackoverflow
Solution 10 - .NetJ SidhuView Answer on Stackoverflow
Solution 11 - .NetSSharmaView Answer on Stackoverflow
Solution 12 - .NetSneh BhatView Answer on Stackoverflow
Solution 13 - .NetVahidSHView Answer on Stackoverflow
Solution 14 - .NetThiago JacomettiView Answer on Stackoverflow