Only accept a certain file type in FileField, server-side

Tags: Django, File Upload

Django Problem Overview


How can I restrict FileField to only accept a certain type of file (video, audio, pdf, etc.) in an elegant way, server-side?

Django Solutions


Solution 1 - Django

One very easy way is to use a custom validator.

In your app's validators.py:

import os

from django.core.exceptions import ValidationError


def validate_file_extension(value):
    # splitext() returns (path + filename, extension); [1] is the extension.
    ext = os.path.splitext(value.name)[1]
    valid_extensions = ['.pdf', '.doc', '.docx', '.jpg', '.png', '.xlsx', '.xls']
    if ext.lower() not in valid_extensions:
        raise ValidationError('Unsupported file extension.')

Then in your models.py:

from .validators import validate_file_extension

... and use the validator on your model field:

class Document(models.Model):
    file = models.FileField(upload_to="documents/%Y/%m/%d", validators=[validate_file_extension])

See also: https://stackoverflow.com/questions/6460848/in-django-how-does-one-limit-file-types-on-file-uploads-for-modelforms-with-fil.
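To make the `[1]` index concrete, here is a small standalone sketch of what `os.path.splitext` returns for a few made-up file names. The helper name `extension_of` is hypothetical and not part of the answer above. Note that only the last suffix is returned, and a name without a suffix yields an empty string, which the validator would reject.

```python
import os


def extension_of(filename):
    """Return the lower-cased final suffix of a file name, including the dot."""
    return os.path.splitext(filename)[1].lower()


# Made-up file names, chosen to show the edge cases.
for name in ("report.PDF", "archive.tar.gz", "README", ".bashrc"):
    print(f"{name!r} -> {extension_of(name)!r}")
```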

> Warning
>
> For securing your code execution environment from malicious media files:
>
> 1. Use Exif libraries to properly validate the media files.
> 2. Separate your media files from your application code execution environment.
> 3. If possible, use solutions like S3, GCS, Minio or anything similar.
> 4. When loading media files on the client side, use client-native methods (for example, if you are loading the media files non-securely in a browser, it may cause execution of "crafted" JavaScript code).

Solution 2 - Django

Django 1.11 added a FileExtensionValidator that can be used on model fields. The docs are here: https://docs.djangoproject.com/en/dev/ref/validators/#fileextensionvalidator.

An example of how to validate a file extension:

from django.core.validators import FileExtensionValidator
from django.db import models


class MyModel(models.Model):
    pdf_file = models.FileField(
        upload_to="foo/", validators=[FileExtensionValidator(allowed_extensions=["pdf"])]
    )

Note that this method is not safe. To quote the Django docs:

> Don’t rely on validation of the file extension to determine a file’s type. Files can be renamed to have any extension no matter what data they contain.

There is also a new validate_image_file_extension validator (https://docs.djangoproject.com/en/dev/ref/validators/#validate-image-file-extension), which uses Pillow to check that the file name has a valid image extension.
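To illustrate why the extension alone proves nothing, here is a minimal stdlib-only sketch that compares a file name against the first bytes of its content. The `SIGNATURES` table and the `sniff_mime_type` helper are hypothetical and deliberately tiny. This is not how Django or Pillow validate files, and a real detector handles far more formats and edge cases.

```python
# Hand-picked leading byte signatures, for illustration only.
SIGNATURES = {
    "application/pdf": b"%PDF-",
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
}


def sniff_mime_type(head):
    """Return the MIME type whose signature starts `head`, or None if unknown."""
    for mime_type, signature in SIGNATURES.items():
        if head.startswith(signature):
            return mime_type
    return None


# A shell script renamed to "invoice.pdf" (made-up example data).
renamed_name = "invoice.pdf"
renamed_content = b"#!/bin/sh\necho not a pdf\n"

extension_ok = renamed_name.lower().endswith(".pdf")  # the name looks fine
detected = sniff_mime_type(renamed_content)           # the content does not

print(extension_ok, detected)
```

An extension check accepts this file, while the content check does not recognise it as a PDF.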

Solution 3 - Django

A few people have suggested using python-magic to validate that the file actually is of the type you are expecting to receive. This can be incorporated into the validator suggested in the accepted answer:

import os

import magic
from django.core.exceptions import ValidationError


def validate_is_pdf(file):
    # Detect the MIME type from the first bytes of the content.
    valid_mime_types = ['application/pdf']
    file_mime_type = magic.from_buffer(file.read(1024), mime=True)
    file.seek(0)  # rewind so later readers start from the beginning
    if file_mime_type not in valid_mime_types:
        raise ValidationError('Unsupported file type.')
    # Also check the extension of the uploaded file name.
    valid_file_extensions = ['.pdf']
    ext = os.path.splitext(file.name)[1]
    if ext.lower() not in valid_file_extensions:
        raise ValidationError('Unacceptable file extension.')


This example only validates a PDF, but any number of MIME types and file extensions can be added to the lists.

Assuming you saved the above in validators.py you can incorporate this into your model like so:

from django.db import models

from myapp.validators import validate_is_pdf


class PdfFile(models.Model):
    file = models.FileField(upload_to='pdfs/', validators=(validate_is_pdf,))
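One detail worth keeping in mind with validators that call `file.read(...)`: reading advances the file position, so anything that reads the same object afterwards starts where the validator stopped unless the position is reset. A minimal sketch, using `io.BytesIO` as a stand-in for the uploaded file (an assumption: Django's upload objects are file-like but not identical):

```python
import io

# Stand-in for an uploaded file: a PDF-like header followed by filler bytes.
upload = io.BytesIO(b"%PDF-1.7\n" + b"x" * 5000)

head = upload.read(1024)             # what a validator would pass to a type detector
position_after_read = upload.tell()  # the position has moved past the bytes just read

upload.seek(0)                       # rewind so later readers see the whole file
position_after_seek = upload.tell()

print(len(head), position_after_read, position_after_seek)
```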

Solution 4 - Django

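You can use the following to restrict file types in your form. Note that the accept attribute only filters what the browser's file picker offers; it is not server-side validation, so combine it with a validator: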

file = forms.FileField(widget=forms.FileInput(attrs={'accept':'application/pdf'}))

Solution 5 - Django

There's a Django snippet that does this:

import os

from django import forms


class ExtFileField(forms.FileField):
    """
    Same as forms.FileField, but you can specify a file extension whitelist.

    >>> from django.core.files.uploadedfile import SimpleUploadedFile
    >>>
    >>> t = ExtFileField(ext_whitelist=(".pdf", ".txt"))
    >>>
    >>> _ = t.clean(SimpleUploadedFile('filename.pdf', b'Some File Content'))
    >>> _ = t.clean(SimpleUploadedFile('filename.txt', b'Some File Content'))
    >>>
    >>> t.clean(SimpleUploadedFile('filename.exe', b'Some File Content'))  # doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ...
    ValidationError: ['Not allowed filetype!']
    """

    def __init__(self, *args, **kwargs):
        ext_whitelist = kwargs.pop("ext_whitelist")
        self.ext_whitelist = [i.lower() for i in ext_whitelist]

        super().__init__(*args, **kwargs)

    def clean(self, *args, **kwargs):
        data = super().clean(*args, **kwargs)
        filename = data.name
        ext = os.path.splitext(filename)[1]
        ext = ext.lower()
        if ext not in self.ext_whitelist:
            raise forms.ValidationError("Not allowed filetype!")
        return data  # clean() must return the cleaned value

#-------------------------------------------------------------------------

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Solution 6 - Django

First, create a file named formatChecker.py inside the app that contains the model whose FileField should accept only certain file types.

This is your formatChecker.py:

from django import forms
from django.db.models import FileField
from django.template.defaultfilters import filesizeformat
from django.utils.translation import gettext_lazy as _


class ContentTypeRestrictedFileField(FileField):
    """
    Same as FileField, but you can specify:
        * content_types - list containing allowed content_types. Example: ['application/pdf', 'image/jpeg']
        * max_upload_size - a number indicating the maximum file size allowed for upload, in bytes.
            2.5MB - 2621440
            5MB - 5242880
            10MB - 10485760
            20MB - 20971520
            50MB - 52428800
            100MB - 104857600
            250MB - 262144000
            500MB - 524288000
    """

    def __init__(self, *args, **kwargs):
        # Defaults of None keep the field constructible when Django re-creates
        # it without these arguments (for example, while building migrations).
        self.content_types = kwargs.pop("content_types", None)
        self.max_upload_size = kwargs.pop("max_upload_size", None)

        super().__init__(*args, **kwargs)

    def clean(self, *args, **kwargs):
        data = super().clean(*args, **kwargs)

        file = data.file
        try:
            # content_type exists only on freshly uploaded files and is
            # reported by the client, so it is a hint rather than proof.
            content_type = file.content_type
            if self.content_types is not None and content_type not in self.content_types:
                raise forms.ValidationError(_('Filetype not supported.'))
            if self.max_upload_size is not None and file.size > self.max_upload_size:
                raise forms.ValidationError(_('Please keep filesize under %s. Current filesize %s') % (filesizeformat(self.max_upload_size), filesizeformat(file.size)))
        except AttributeError:
            pass

        return data

Second. In your models.py, add this:

from .formatChecker import ContentTypeRestrictedFileField

Then instead of using 'FileField', use this 'ContentTypeRestrictedFileField'.

Example:

class Stuff(models.Model):
    title = models.CharField(max_length=245)
    handout = ContentTypeRestrictedFileField(upload_to='uploads/', content_types=['video/x-msvideo', 'application/pdf', 'video/mp4', 'audio/mpeg'], max_upload_size=5242880, blank=True, null=True)

Those are the things you have to do when you want to accept only a certain file type in a FileField.
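The max_upload_size values are binary megabytes (1 MB = 1024 × 1024 bytes), so 5 MB is 5242880 bytes. Rather than typing such numbers by hand, they can be computed. The helper name `megabytes` below is hypothetical and not part of the answer; it is only a sketch of the arithmetic.

```python
def megabytes(n):
    """Convert binary megabytes (1 MB = 1024 * 1024 bytes) to a whole number of bytes."""
    return int(n * 1024 * 1024)


for size in (2.5, 5, 10, 20, 100):
    print(f"{size}MB - {megabytes(size)}")
```

With such a helper, the field argument could be written as `max_upload_size=megabytes(5)`.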

Solution 7 - Django

After checking the accepted answer, I decided to share a tip based on the Django documentation. There is already a built-in validator for file extensions, so you don't need to write your own custom function to check whether a file extension is allowed.

https://docs.djangoproject.com/en/3.0/ref/validators/#fileextensionvalidator

> Warning
>
> Don’t rely on validation of the file extension to determine a file’s type. Files can be renamed to have any extension no matter what data they contain.

Solution 8 - Django

I think your best option is to combine the ExtFileField that Dominic Rodger specified in his answer with python-magic, as Daniel Quinn mentioned. If someone is smart enough to change the extension, at least you will catch them with the file headers.

Solution 9 - Django

You can define a list of accepted mime types in settings and then define a validator which uses python-magic to detect the mime-type and raises ValidationError if the mime-type is not accepted. Set that validator on the file form field.

The only problem is that sometimes the detected MIME type is application/octet-stream, which could correspond to many different file formats. Has anyone overcome this issue?
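One possible way to handle the generic type is to make the policy explicit: either reject application/octet-stream outright, or fall back to the weaker hint given by the file name. The sketch below covers only that decision, with stdlib code. The function name `is_accepted` and its `strict` parameter are hypothetical, the detected type is assumed to come from a detector such as python-magic, and wiring the result into a Django validator is left out.

```python
import mimetypes

GENERIC_MIME_TYPE = "application/octet-stream"


def is_accepted(detected_mime, filename, accepted_mime_types, strict=True):
    """Decide whether an upload is acceptable given its detected MIME type.

    When the detector only reports the generic octet-stream type, strict mode
    rejects the file; otherwise the file name is used as a weaker fallback hint.
    """
    if detected_mime != GENERIC_MIME_TYPE:
        return detected_mime in accepted_mime_types
    if strict:
        return False
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed in accepted_mime_types


accepted = {"application/pdf"}  # in the answer above, this list would live in settings
print(is_accepted("application/pdf", "report.pdf", accepted))
print(is_accepted(GENERIC_MIME_TYPE, "report.pdf", accepted))
print(is_accepted(GENERIC_MIME_TYPE, "report.pdf", accepted, strict=False))
```

The non-strict fallback trusts the extension again, so it gives up the protection that content detection was meant to add. Rejecting is the safer default.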

Solution 10 - Django

Additionally, I will extend this class with some extra behaviour.

class ContentTypeRestrictedFileField(forms.FileField):
    ...
    widget = None
    ...
    def __init__(self, *args, **kwargs):
        ...
        self.widget = forms.ClearableFileInput(attrs={'accept':kwargs.pop('accept', None)})
        super(ContentTypeRestrictedFileField, self).__init__(*args, **kwargs)

When we create an instance with the parameter accept=".pdf,.txt", the browser's file picker dialog shows only files with the given extensions by default.

Solution 11 - Django

Just a minor tweak to @Thismatters' answer, since I can't comment. According to the README of python-magic:

> recommend using at least the first 2048 bytes, as less can produce incorrect identification

So reading 2048 bytes of the file instead of 1024 and detecting the MIME type from that should give a more accurate result, hence:

import os

import magic
from django.core.exceptions import ValidationError


def validate_extension(file):
    valid_mime_types = ["application/pdf", "image/jpeg", "image/png", "image/jpg"]
    file_mime_type = magic.from_buffer(file.read(2048), mime=True)  # Changed this from 1024 to 2048
    file.seek(0)  # rewind so later readers start from the beginning

    if file_mime_type not in valid_mime_types:
        raise ValidationError("Unsupported file type.")

    valid_file_extensions = [".pdf", ".jpeg", ".png", ".jpg"]
    ext = os.path.splitext(file.name)[1]

    if ext.lower() not in valid_file_extensions:
        raise ValidationError("Unacceptable file extension.")

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | maroxe | View Question on Stackoverflow |
| Solution 1 - Django | SaeX | View Answer on Stackoverflow |
| Solution 2 - Django | illagrenan | View Answer on Stackoverflow |
| Solution 3 - Django | Thismatters | View Answer on Stackoverflow |
| Solution 4 - Django | savp | View Answer on Stackoverflow |
| Solution 5 - Django | Dominic Rodger | View Answer on Stackoverflow |
| Solution 6 - Django | Amazing Angelo | View Answer on Stackoverflow |
| Solution 7 - Django | Oğuzhan | View Answer on Stackoverflow |
| Solution 8 - Django | man2xxl | View Answer on Stackoverflow |
| Solution 9 - Django | sabrina | View Answer on Stackoverflow |
| Solution 10 - Django | gauee | View Answer on Stackoverflow |
| Solution 11 - Django | PrynsTag | View Answer on Stackoverflow |