What is the best place for storing uploaded images, SQL database or disk file system?

Database, Filesystems, Image Uploading, Binary Files

Database Problem Overview


I'm writing an application that allows users to upload images onto the server. I expect about 20 images per day, all JPEG and probably not edited/resized. (That is really another question: how to resize the images on the server side before storing them. Perhaps someone can drop a .NET resource for that in the comments.) I now wonder what the best place for storing uploaded images is.

  • (a) Store the image as a file in the file system and create a record in a table with the exact path to that image.

  • (b) Store the image itself in a table using an "image" or "binary data" data type of the database server.

I see advantages and disadvantages in both. I like (a) because I can easily relocate the files and just have to change the table entry. On the other hand, I don't like storing business data on the web server, and I don't really want to connect the web server to any other data source that holds business data (for security reasons). I like (b) because all the information is in one place and easily accessible by a query. On the other hand, the database will get very big very soon, and outsourcing that data could become more difficult.
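
For reference, option (a) boils down to something like the following sketch in .NET, assuming SQL Server via the Microsoft.Data.SqlClient package. The table name, column names, and upload directory are placeholder assumptions, not a prescribed schema:

    // Option (a) sketch: write the file to disk, record only its path in the database.
    using System;
    using System.IO;
    using Microsoft.Data.SqlClient;

    public static class ImageStore
    {
        public static void SaveUpload(Stream upload, string originalName, string connectionString)
        {
            // Generate a unique file name so user-supplied names never touch the filesystem.
            string fileName = Guid.NewGuid().ToString("N") + ".jpg";
            string directory = @"D:\app-data\uploads";          // assumed location outside the web root
            Directory.CreateDirectory(directory);
            string fullPath = Path.Combine(directory, fileName);

            using (var file = File.Create(fullPath))
                upload.CopyTo(file);

            // Only the path (and some metadata) goes into the database.
            using var connection = new SqlConnection(connectionString);
            connection.Open();
            using var command = new SqlCommand(
                "INSERT INTO Images (OriginalName, FilePath, UploadedAt) VALUES (@name, @path, @at)",
                connection);
            command.Parameters.AddWithValue("@name", originalName);
            command.Parameters.AddWithValue("@path", fullPath);
            command.Parameters.AddWithValue("@at", DateTime.UtcNow);
            command.ExecuteNonQuery();
        }
    }

Relocating the files later then only requires moving them and updating the stored path column.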

Database Solutions


Solution 1 - Database

I generally store files on the file system, since that's what it's there for, though there are exceptions. For files, the file system is the most flexible and performant solution (usually).

There are a few problems with storing files in a database: files are generally much larger than your average row, so result sets containing many large files will consume a lot of memory. Also, if you use a storage engine that employs table locks for writes (MyISAM, for example), your files table might be locked often, depending on the size and rate of the files you are storing there.

Regarding security - I usually store the files in a directory that is outside of the document root (not accessible through an http request) and serve them through a script that checks for the proper authorization first.
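
A rough sketch of such a serving script, assuming ASP.NET Core minimal APIs and a hypothetical IsAuthorized() check (the route, storage path, and authorization logic are illustrative only):

    // Serve images stored outside the web root, but only to authorized users.
    using Microsoft.AspNetCore.Builder;
    using Microsoft.AspNetCore.Http;
    using System.IO;

    var builder = WebApplication.CreateBuilder(args);
    var app = builder.Build();

    const string storageRoot = "/var/app-data/uploads";   // not reachable by a direct HTTP request

    app.MapGet("/images/{name}", (string name, HttpContext context) =>
    {
        if (!IsAuthorized(context))                        // hypothetical authorization check
            return Results.Unauthorized();

        // Accept only a bare file name to prevent path traversal.
        string fullPath = Path.Combine(storageRoot, Path.GetFileName(name));

        return File.Exists(fullPath)
            ? Results.File(fullPath, "image/jpeg")
            : Results.NotFound();
    });

    static bool IsAuthorized(HttpContext context) =>
        context.User?.Identity?.IsAuthenticated == true;

    app.Run();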

Solution 2 - Database

The only benefit of option (b) is having all the data in one system, but it's a false benefit! You could argue that your code is also a form of data and could therefore be stored in the database as well - would you really want that?

Unless you have some unique case:

  • Business logic belongs in code.
  • Structured data belongs in database (relational or non-relational).
  • Bulk data belongs in storage (filesystem or other).

Files, Code, Data

It is not necessary to use the filesystem to keep files. Instead, you can use cloud storage (such as Amazon S3) or an infrastructure-as-a-service layer on top of it (such as Uploadcare):

https://uploadcare.com/upload-api-cloud-storage-and-cdn/

But storing files in the database is a bad idea.

Solution 3 - Database

I know this is an old post, but many visitors to this page find nothing directly related to the question, especially newbies.

How to upload and store images or files on a website:

For a static website there may be no problem, since the file storage on most shared hosting is still adequate. The problem comes with a dynamic website as it grows. Growth in the database can be handled, but growth in files such as images becomes a problem. There are two types of images on a website:

  1. Images uploaded by the administrator for a dynamic blog. Usually these images have been optimized before upload.

  2. Images from users, in cases where users are allowed to upload images such as avatars, or to create blog content and insert images from a text editor. The size of these images is difficult to predict: users can upload huge images for small content, resizing the displayed size but not the image itself.

Setting aside item no. 1 above, item no. 2 can be temporarily addressed with the following tips if we don't have image-optimizer functionality on our website:

  1. Do not allow users to upload directly from the text editor; redirect them to an image gallery instead. On that page users must upload files in advance before they can be embedded in the content. This approach is usually called a file manager.

  2. Use an image-crop function when users upload images. This limits the image size even if users upload a very big file, since the final image is the cropped result. We can define the size server-side and accept only, for example, 500 KB or less.

Now, that is only temporary. For a final solution, the questions come back:

  • How do we handle large image storage?
  • Should we resize images or change their format?
  • How do medium or large websites and e-commerce sites handle file storage for their images?

What we can do then:

  1. Migrate from shared hosting to a VPS. Not enough? Then go higher by upgrading to a dedicated server.

  2. Create your own server for file storage. Google how to do it; this is not as difficult as you might think. Some sites do this for their websites.

  3. The easy way is to use a CDN file-storage service.

Okay, options 1 and 2 are a little bit expensive, but option 3, I think, is the best solution.

Some CDN services allow you to store as many web files as you want.

The question then is, "How do we upload files to the CDN from our website?"

Don't worry: once you register (usually for free), you will get guidance on how to upload files and link to them from your website. You will get an API and more. It's easy.

Some providers offer a free 14-day trial with limited storage and bandwidth, but that is fine as a starting point. The only real problem is that people never try.

Hope this helps newbies.

Solution 4 - Database

We have had clients insist on option B (database storage) a few times on a few different backends, and we always ended up going back to option A (filesystem storage) eventually.

Large BLOBs like that just have not been handled well enough even by SQL Server 2005, which is the latest one we tried it on.

Specifically, we saw serious bloat and I think maybe locking problems.

One other note: if you are using NTFS-based storage (Windows Server, etc.), you might consider finding a way around putting thousands and thousands of files in one directory. I am not sure why, but sometimes the file system does not cope well with that situation. If anyone knows more about this, I would love to hear it.

But I always try to use subdirectories to break things up a bit. Creation date often works well for this:

Images/2008/12/17/.jpg

...This provides a decent level of separation, and also helps a bit during debugging. Explorer and FTP clients alike can choke a bit when there are truly huge directories.
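
A small sketch of that scheme in .NET (the Images root and GUID-based file names are assumptions):

    // Build an Images/yyyy/MM/dd subdirectory so no single directory
    // accumulates thousands of files.
    using System;
    using System.IO;

    var now = DateTime.UtcNow;
    string directory = Path.Combine("Images",
        now.Year.ToString("0000"), now.Month.ToString("00"), now.Day.ToString("00"));
    Directory.CreateDirectory(directory);            // creates the whole chain if it is missing

    string fullPath = Path.Combine(directory, Guid.NewGuid().ToString("N") + ".jpg");
    Console.WriteLine(fullPath);                     // e.g. Images/2008/12/17/3f2a...c9.jpg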

EDIT: Just a quick note for 2017, in more recent versions of SQL Server, there are new options for handling lots of BLOBs that are supposed to avoid the drawbacks I discussed.

EDIT: Quick note for 2020, Blob Storage in AWS/Azure/etc has also been an option for years now. This is a great fit for many web-based projects since it's cheap and it can often simplify certain issues around deployment, scaling to multiple servers, debugging other environments when necessary, etc.

Solution 5 - Database

I have recently created a PHP/MySQL app which stores PDFs/Word files in a MySQL table (as big as 40MB per file so far).

Pros:

  • Uploaded files are replicated to backup server along with everything else, no separate backup strategy is needed (peace of mind).
  • Setting up the web server is slightly simpler because I don't need to have an uploads/ folder and tell all my applications where it is.
  • I get to use transactions for edits to improve data integrity - I don't have to worry about orphaned or missing files.

Cons:

  • mysqldump now takes a looooong time because there is 500MB of file data in one of the tables.
  • Overall not very memory/cpu efficient when compared to filesystem

I'd call my implementation a success, it takes care of backup requirements and simplifies the layout of the project. The performance is fine for the 20-30 people who use the app.

Solution 6 - Database

Definitely resize the image, and check its format if you can. There have been cases of malicious files being uploaded and served by unwitting hosts - for instance, the GIFAR vulnerability (http://hackaday.com/2008/08/04/the-gifar-image-vulnerability/) allowed you to hide a malicious Java applet in a GIF file, which would then be able to read cookies in the current context and send them to another site for a cross-site scripting attack. Resizing the images usually prevents this, as it munges the embedded code. While this attack has been fixed by JVM patches, naively serving up binary files without scrubbing them opens you up to a whole range of vulnerabilities.
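
A minimal sketch of that scrubbing step, assuming System.Drawing is available (Windows, or the System.Drawing.Common package): decode the upload and re-encode it as a fresh JPEG, so anything that is not pixel data is thrown away.

    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;

    public static class ImageScrubber
    {
        // Re-encode an uploaded image; embedded payloads (GIFAR-style) are not copied over.
        // Throws if the stream is not a decodable image, which doubles as a format check.
        public static void ReEncodeAsJpeg(Stream upload, string destinationPath)
        {
            using var original = Image.FromStream(upload);   // fails on non-image input
            using var clean = new Bitmap(original);          // copies only the decoded pixels
            clean.Save(destinationPath, ImageFormat.Jpeg);   // the original bytes are never written out
        }
    }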

Remember, most virus scanners can only run against the filesystem - if you store your binaries in the DB, you won't be able to run a scanner against them very easily.

Solution 7 - Database

This is basically what I do.

  1. Store the uploaded image in a temporary directory or in memory.
  2. Process that image before permanently storing it:
     2.1. Color corrections
     2.2. Compression
     2.3. Create several copies at different image dimensions
     2.4. Rename them with .xl, .lg, .md, .sm, etc. suffixes
  3. Pack all processed image files (derived from the single upload) into a folder whose name is an id; store that id in the database for the corresponding row/document, along with the image file name (which may just be a random name).
  4. Create a yyyy/mm/dd path folder if it doesn't exist, for example 2016/08/21. Remember that path and store it in the database for the same document/row.
  5. Move the image id folder into the path folder. (The path folder may be located in /var/web-content.)
  6. Flush the memory buffer or delete the temporary file.

When you need to access any image mentioned in a document, you have the path and the id of the folder that contains the images. For example: /var/web-content/{{path}}/{{id}}/image-file-name.sm.jpg

This way, if you have to delete all the processed image files, you just delete the folder and its contents recursively.
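
A rough sketch of that layout in .NET (the web-content root, id scheme, and file names are all assumptions taken from the description above):

    // One folder per image id, under a date-based path, holding the size variants.
    using System;
    using System.IO;

    public static class ImageLayout
    {
        public static string StoreVariants(string webContentRoot, string id,
                                           (string Suffix, byte[] Bytes)[] variants)
        {
            // e.g. 2016/08/21 - this relative path is what gets stored in the database.
            var now = DateTime.UtcNow;
            string relativePath = Path.Combine(
                now.Year.ToString("0000"), now.Month.ToString("00"), now.Day.ToString("00"));

            // /var/web-content/{{path}}/{{id}}/
            string imageFolder = Path.Combine(webContentRoot, relativePath, id);
            Directory.CreateDirectory(imageFolder);

            foreach (var (suffix, bytes) in variants)
                File.WriteAllBytes(Path.Combine(imageFolder, $"image-file-name{suffix}.jpg"), bytes);

            return relativePath;   // store this, plus the id, in the database row/document
        }
    }

Deleting every variant for an image is then just Directory.Delete(imageFolder, recursive: true).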

Solution 8 - Database

I use uploaded images on my website and I would definitely say option a).

One other thing I'd highly recommend is immediately changing the file name from whatever the user has named the photo to something more manageable - for example, something with the date and time to uniquely identify each picture.

It also helps to strip the user's file name of any strange characters to avoid future complications.
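
A small sketch of that renaming step (the exact naming pattern and accepted extensions are just one possibility):

    // Replace the user-supplied file name with a timestamped, sanitized one.
    using System;
    using System.IO;
    using System.Text.RegularExpressions;

    public static class UploadNaming
    {
        public static string MakeSafeName(string userFileName)
        {
            // Keep only the extension from the user's name, and only if it looks sane.
            string extension = Path.GetExtension(userFileName).ToLowerInvariant();
            if (!Regex.IsMatch(extension, @"^\.(jpe?g|png|gif)$"))
                extension = ".jpg";

            // Date and time plus a short random component uniquely identify each picture.
            return DateTime.UtcNow.ToString("yyyyMMdd-HHmmss")
                 + "-" + Guid.NewGuid().ToString("N").Substring(0, 8)
                 + extension;
        }
    }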

Solution 9 - Database

There's sort of a hybrid approach in SQL Server 2008 called the FILESTREAM data type, which was talked about on RunAs Radio #74 and is sort of like the best of both worlds. Most people don't have the 2008 option, but if you do, this option looks pretty cool.

Solution 10 - Database

Most implementations are option A.

With option B, you open a whole big can of whoop4ss when you marshal those bits from the database into something that can be displayed in a browser... Also, if the DB is down, the images are not available.

I don't think that space is too much of an issue... Terabyte drives are a couple hundred bucks now.

We are implementing with option A because we don't have the time or resources to do option B.

Solution 11 - Database

For auto-resizing, try ImageMagick... it is used by many major open source content/photo management systems... and I believe there are some .NET extensions for it.

Solution 12 - Database

We use A. I would put it on a shared drive (unless you don't plan on running more than one server).

If the time comes when this won't scale for you then you can investigate caching mechanisms.

Solution 13 - Database

Absolutely, positively option A. Others have mentioned that databases generally don't deal well with BLOBs, whether they're designed to do so or not. Filesystems, on the other hand, live for this stuff. You have the option of using RAID striping, spreading images across multiple drives, even spreading them across geographically disparate servers.

Another advantage is that your database backups/replication don't become monstrous.

Solution 14 - Database

Option A.

Once the image is loaded you can verify the format and resize it before saving. There are a number of .NET code samples for resizing images on http://www.codeproject.com. For instance: http://www.codeproject.com/KB/cs/Photo_Resize.aspx
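
In case that link goes stale, a minimal resize sketch with System.Drawing (assuming a fixed target width, JPEG output, and placeholder paths) looks roughly like this:

    // Resize an image to a target width, preserving the aspect ratio.
    using System.Drawing;
    using System.Drawing.Drawing2D;
    using System.Drawing.Imaging;

    public static class PhotoResizer
    {
        public static void ResizeToWidth(string sourcePath, string destinationPath, int targetWidth)
        {
            using var source = Image.FromFile(sourcePath);
            int targetHeight = source.Height * targetWidth / source.Width;

            using var resized = new Bitmap(targetWidth, targetHeight);
            using (var graphics = Graphics.FromImage(resized))
            {
                graphics.InterpolationMode = InterpolationMode.HighQualityBicubic;
                graphics.DrawImage(source, 0, 0, targetWidth, targetHeight);
            }
            resized.Save(destinationPath, ImageFormat.Jpeg);
        }
    }

For example, PhotoResizer.ResizeToWidth("upload.jpg", "upload-800.jpg", 800) produces an 800-pixel-wide copy.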

Solution 15 - Database

For security reasons, it is also best practice to avoid problems caused by IE's content sniffing, which can allow attackers to upload JavaScript inside image files that might then get executed in the context of your site. So you might want to transform the images (crop/resize them) somehow before storing them, to prevent this sort of attack. This answer has some other ideas.

Solution 16 - Database

Well, I have a similar project where users upload files onto the server. From my point of view, option (a) is the best solution because it is more flexible. What you should do is store the images in a protected folder, organized into subdirectories. The main directory must be set up by the administrator so that its content cannot run scripts (very important) and is read/write protected so it is not accessible via an HTTP request.

I hope this helps you.

Solution 17 - Database

If they are small files that will not need to be edited, then option B is not a bad option. I prefer this to writing logic to store files and dealing with crazy directory structure issues. Having a lot of files in one directory is bad, emkay?

If the files are large or require constant editing, especially from programs like office, then option A is your best bet.

For most cases it's a matter of preference, but if you go with option A, just make sure the directories don't have too many files in them. If you choose option B, then put the table with the BLOB data in its own database and/or filegroup. This will help with maintenance, especially backups/restores. Your regular data is probably fairly small, while your image data will grow huge over time.

Solution 18 - Database

It depends on your requirements, especially volume, number of users, and frequency of search. But for a small or medium office, the best option is to use an application like Apple Photos or Adobe Lightroom. They are specialized in storing, cataloging, indexing, and organizing this kind of resource. For large organizations with strong storage requirements and a high number of users, I recommend setting up a content management platform with digital asset management, like Nuxeo or Alfresco; both offer very good tools for managing very large volumes of data, with simple methods to retrieve it. And, very important: there is a free (open source) option for both platforms.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Tobias | View Question on Stackoverflow
Solution 1 - Database | Eran Galperin | View Answer on Stackoverflow
Solution 2 - Database | David Avsajanishvili | View Answer on Stackoverflow
Solution 3 - Database | Sulung Nugroho | View Answer on Stackoverflow
Solution 4 - Database | Brian MacKay | View Answer on Stackoverflow
Solution 5 - Database | too much php | View Answer on Stackoverflow
Solution 6 - Database | Tim Howland | View Answer on Stackoverflow
Solution 7 - Database | Uday Hiwarale | View Answer on Stackoverflow
Solution 8 - Database | barfoon | View Answer on Stackoverflow
Solution 9 - Database | Charles Graham | View Answer on Stackoverflow
Solution 10 - Database | mson | View Answer on Stackoverflow
Solution 11 - Database | jle | View Answer on Stackoverflow
Solution 12 - Database | csexton | View Answer on Stackoverflow
Solution 13 - Database | dj_segfault | View Answer on Stackoverflow
Solution 14 - Database | Leah | View Answer on Stackoverflow
Solution 15 - Database | Day | View Answer on Stackoverflow
Solution 16 - Database | domoindal | View Answer on Stackoverflow
Solution 17 - Database | Charles Graham | View Answer on Stackoverflow
Solution 18 - Database | Carlos Camargo | View Answer on Stackoverflow