How can I mount an S3 bucket to an EC2 instance and write to it with PHP?

PhpAmazon S3FuseS3fs

Php Problem Overview


I'm working on a project that is being hosted on Amazon Web Services. The server setup consists of two EC2 instances, one Elastic Load Balancer and an extra Elastic Block Store on which the web application resides. The project is supposed to use S3 for storage of files that users upload. For the sake of this question, I'll call the S3 bucket static.example.com

I have tried using s3fs (https://code.google.com/p/s3fs/wiki/FuseOverAmazon), RioFS (https://github.com/skoobe/riofs) and s3ql (https://code.google.com/p/s3ql/). s3fs will mount the filesystem but won't let me write to the bucket (I asked this question on SO: How can I mount an S3 volume with proper permissions using FUSE). RioFS will mount the filesystem and will let me write to the bucket from the shell, but files that are saved using PHP don't appear in the bucket (I opened an issue with the project on GitHub). s3ql will mount the bucket, but none of the files that are already in the bucket appear in the filesystem.

These are the mount commands I used:

s3fs static.example.com -ouse_cache=/tmp,allow_other /mnt/static.example.com
riofs -o allow_other http://s3.amazonaws.com static.example.com /mnt/static.example.com
s3ql mount.s3ql s3://static.example.com /mnt/static.example.com

I've also tried using this S3 class: https://github.com/tpyo/amazon-s3-php-class/ and this FuelPHP specific S3 package: https://github.com/tomschlick/fuel-s3. I was able to get the FuelPHP package to list the available buckets and files, but saving files to the bucket failed (but did not error).

Have you ever mounted an S3 bucket on a local linux filesystem and used PHP to write a file to the bucket successfully? What tool(s) did you use? If you used one of the above mentioned tools, what version did you use?

EDIT I have been informed that the issue I opened with RioFS on GitHub has been resolved. Although I decided to use the S3 REST API rather than attempting to mount a bucket as a volume, it seems that RioFS may be a viable option these days.

Php Solutions


Solution 1 - Php

> Have you ever mounted an S3 bucket on a local linux filesystem?

No. It's fun for testing, but I wouldn't let it near a production system. It's much better to use a library to communicate with S3. Here's why:

  1. It won't hide errors. A filesystem only has a few errors codes it can send you to indicate a problem. An S3 library will give you the exact error message from Amazon so you understand what's going on, log it, handle corner cases, etc.
  2. A library will use less memory. Filesystems layers will cache lots of random stuff that you many never use again. A library puts you in control to decide what to cache and not to cache.
  3. Expansion. If you ever need to do anything fancy (set an ACL on a file, generate a signed link, versioning, lifecycle, change durability, etc), then you'll have to dump your filesystem abstraction and use a library anyway.
  4. Timing and retries. Some fraction of requests randomly error out and can be retried. Sometimes you may want to retry a lot, sometimes you would rather error out quickly. A filesystem doesn't give you granular control, but a library will.

The bottom line is that S3 under FUSE is a leaky abstraction. S3 doesn't have (or need) directories. Filesystems weren't built for billions of files. Their permissions models are incompatible. You are wasting a lot of the power of S3 by trying to shoehorn it into a filesystem.

Two random PHP libraries for talking to S3:

https://github.com/KnpLabs/Gaufrette

https://aws.amazon.com/sdkforphp/ - this one is useful if you expand beyond just using S3, or if you need to do any of the fancy requests mentioned above.

Solution 2 - Php

Quite often, it is advantageous to write files to the EBS volume, then force subsequent public requests for the file(s) to route through CloudFront CDN.

In that way, if the app must do any transformations to the file, it's much easier to do on the local drive & system, then force requests for the transformed files to pull from the origin via CloudFront.

e.g. if your user is uploading an image for an avatar, and the avatar image needs several iterations for size & crop, your app can create these on the local volume, but all public requests for the file will take place through a cloudfront origin-pull request. In that way, you have maximum flexibility to keep the original file (or an optimized version of the file), and any subsequent user requests can either pull an existing version from cloud front edge, or cloud front will route the request back to the app and create any necessary iterations.

An elementary example of the above would be WordPress, which creates multiple sized/cropped versions of any graphic image uploaded, in addition to keeping the original (subject to file size restrictions, and/or plugin transformations). CDN-capable WordPress plugins such as W3 Total Cache rewrite requests to pull through CDN, so the app only needs to create unique first-request iterations. Adding browser caching URL versioning (http://domain.tld/file.php?x123) further refines and leverages CDN functionality.

If you are concerned about rapid expansion of EBS volume file size or inodes, you can automate a pruning process for seldom-requested files, or aged files.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionBen HaroldView Question on Stackoverflow
Solution 1 - PhpBraveNewCurrencyView Answer on Stackoverflow
Solution 2 - PhpPlayGodView Answer on Stackoverflow