Exclude multiple folders using AWS S3 sync


Amazon Web-Services Problem Overview


How do I exclude multiple folders when using aws s3 sync?

I tried :

    # aws s3 sync s3://inksedge-app-file-storage-bucket-prod-env \
                  s3://inksedge-app-file-storage-bucket-test-env \
                  --exclude 'reportTemplate/* orders/* customers/*'

But it still syncs the "customers" folder.

Output :

    copy: s3://inksedge-app-file-storage-bucket-prod-env/customers/116/miniimages/IMG_4800.jpg
       to s3://inksedge-app-file-storage-bucket-test-env/customers/116/miniimages/IMG_4800.jpg

    copy: s3://inksedge-app-file-storage-bucket-prod-env/customers/116/miniimages/DSC_0358.JPG
       to s3://inksedge-app-file-storage-bucket-test-env/customers/116/miniimages/DSC_0358.JPG

Amazon Web-Services Solutions


Solution 1 - Amazon Web-Services

At last this worked for me:

    aws s3 sync s3://my-bucket s3://my-other-bucket \
                --exclude 'customers/*' \
                --exclude 'orders/*' \
                --exclude 'reportTemplate/*'
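When the folder list is long or comes from a variable, the repeated flags could be generated rather than typed by hand. The following is a minimal POSIX sh sketch, not part of the original answer: the function name `sync_excluding` and the `AWS_CMD` override (which lets the command be swapped for `echo` to inspect the arguments) are hypothetical, and `--dryrun` is left in so that nothing is copied until it is removed.

```shell
# Hypothetical helper: sync_excluding SRC DST FOLDER...
# Emits one "--exclude 'FOLDER/*'" per folder, as in the working command.
sync_excluding() {
    src=$1
    dst=$2
    shift 2

    # Replace each remaining argument (a folder name) with the pair
    # "--exclude" "folder/*": append the pair, then drop the original.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" --exclude "$1/*"
        shift
        n=$((n - 1))
    done

    # AWS_CMD is an assumption for inspection; it defaults to the real CLI.
    # The quoted "$@" keeps each pattern as a single, unexpanded argument.
    ${AWS_CMD:-aws} s3 sync "$src" "$dst" "$@" --dryrun
}
```

For example, `sync_excluding s3://my-bucket s3://my-other-bucket customers orders reportTemplate` would run the same sync as above in dry-run mode.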

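Hint: enclose wildcards and special characters in single or double quotes so that the shell passes them to the CLI unexpanded. Each pattern also needs its own --exclude flag: a single quoted string containing several space-separated patterns is treated as one pattern, which is why the original command did not exclude anything. The supported pattern symbols are listed below. For more information on S3 commands, see the AWS CLI documentation.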

*: Matches everything
?: Matches any single character
[sequence]: Matches any character in sequence
[!sequence]: Matches any character not in sequence
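The effect of combining patterns in one string can be checked locally. The sketch below is only an approximation: it uses the shell's own `case` pattern matching, which supports the same four symbols, rather than the AWS CLI's matcher, and the helper name `matches` is hypothetical.

```shell
# Approximation only: shell `case` patterns, not the AWS CLI's own matcher.
matches() {
    # $1 = object key, $2 = pattern; the unquoted $2 is interpreted as a glob
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

key='customers/116/miniimages/IMG_4800.jpg'

# One string holding three space-separated patterns is ONE pattern.
if matches "$key" 'reportTemplate/* orders/* customers/*'; then
    echo "combined pattern: excluded"
else
    echo "combined pattern: not excluded"
fi

# The same pattern on its own matches the key.
if matches "$key" 'customers/*'; then
    echo "separate pattern: excluded"
else
    echo "separate pattern: not excluded"
fi
```

The combined string only matches keys that begin with `reportTemplate/` and literally contain ` orders/` and ` customers/` further on, so the `customers/...` key slips through.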

Solution 2 - Amazon Web-Services

For those looking to sync a subfolder within a bucket: the exclude filter is applied to the files and folders inside the folder being synced, not to the path relative to the bucket. For example:

    aws s3 sync s3://bucket1/bootstrap/ s3://bucket2/bootstrap --exclude '*' --include 'css/*'

would sync the folder bootstrap/css but neither bootstrap/js nor bootstrap/fonts in the following folder tree:

bootstrap/
├── css/
│   ├── bootstrap.css
│   ├── bootstrap.min.css
│   ├── bootstrap-theme.css
│   └── bootstrap-theme.min.css
├── js/
│   ├── bootstrap.js
│   └── bootstrap.min.js
└── fonts/
    ├── glyphicons-halflings-regular.eot
    ├── glyphicons-halflings-regular.svg
    ├── glyphicons-halflings-regular.ttf
    └── glyphicons-halflings-regular.woff

That is, the filter is 'css/*', not 'bootstrap/css/*'.

More at https://docs.aws.amazon.com/cli/latest/reference/s3/index.html#use-of-exclude-and-include-filters
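The relative-path rule and the ordering of filters (a later filter takes precedence over an earlier one) can be sketched locally. This is a rough approximation using shell `case` patterns, not the AWS CLI's implementation; the function name `is_synced` is hypothetical, and keys are written relative to the source prefix s3://bucket1/bootstrap/.

```shell
# Approximation of: --exclude '*' --include 'css/*'
# $1 = object key relative to the source prefix
is_synced() {
    result=included                                # default: everything is included
    case $1 in *)     result=excluded ;; esac      # --exclude '*'
    case $1 in css/*) result=included ;; esac      # --include 'css/*' (later filter wins)
    printf '%s\n' "$result"
}

is_synced css/bootstrap.css             # prints "included"
is_synced js/bootstrap.js               # prints "excluded"
is_synced bootstrap/css/bootstrap.css   # prints "excluded": bucket-relative path does not match
```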

Solution 3 - Amazon Web-Services

From a Windows command prompt, only double quotes work, so use " " around wildcards, e.g.:

    aws s3 sync s3://bucket-1/ . --exclude "reportTemplate/*" --exclude "orders/*"

Single quotes do not work (as tested with the --dryrun option on Windows 10).

Solution 4 - Amazon Web-Services

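I took a slightly different approach for a folder structure with multiple levels: use '**' with --include. Note that all files are included by default, so --include only changes the result when it follows an --exclude filter and re-includes files that filter removed.

Command:

    aws s3 sync s3://$SOURCE_BUCKET/dir1/dir2/ s3://$TARGET_BUCKET/dir1/dir2/ --include "**/**"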

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Ashish Karpe | View Question on Stackoverflow
Solution 1 - Amazon Web-Services | Ashish Karpe | View Answer on Stackoverflow
Solution 2 - Amazon Web-Services | Raphael Fernandes | View Answer on Stackoverflow
Solution 3 - Amazon Web-Services | InnocentBystander | View Answer on Stackoverflow
Solution 4 - Amazon Web-Services | Dharmesh Purohit | View Answer on Stackoverflow