Working with Amazon AWS S3

As part of migrating off a personal hosting service and onto Amazon AWS and GitHub, I am moving various configuration to S3. I plan to store configuration for some small scripts that run in AWS Lambda. I am also piping emails to my personal domain into an S3 bucket. In summary, I would like to accomplish the following:

  1. Read files from my emails bucket
  2. Publish code to an S3 bucket (from CLI or npm)
  3. Publish configuration to a separate S3 bucket (from CLI or npm)
  4. Read configuration from S3 bucket in node-based lambda script

It was not immediately clear to me how I would read those emails. I could see the bucket, and the names of the files, but couldn’t view the content through the AWS S3 web interface.

I started trying to do this in node, but soon fell back to the CLI, as it seems better documented and easier to experiment with while I learn the S3 model. After following the setup steps, I had some initial success with aws s3 ls:

└─ $ ▶ aws s3 ls
2016-09-10 16:43:26 emails-bucket
2016-02-02 11:05:51 config-bucket
2016-07-23 11:27:30 code-bucket

After looking through the list of S3 commands I realized S3 is incredibly simple. It’s like a stripped down remote file share. You can create buckets, which contain files. You can copy files in and out of buckets, delete them, and move them, but that’s it. There is no reading from S3, or piping the contents elsewhere. There is only copy in and copy out. I suppose the closest you get to directly interacting with S3 is the presign command, which gives you a time-limited URL from which the resource can be accessed.
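For example, generating a time-limited URL for one of the email files might look like this (the object name comes from the listing above; the one-hour expiry is an arbitrary choice):

```shell
# Emit a URL granting read access to one object for the next hour (3600 s)
aws s3 presign s3://emails-bucket/file1 --expires-in 3600
```

The command prints a signed HTTPS URL that anyone can fetch until it expires.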

Reading files from an S3 bucket

Reading the contents of my emails bucket looks like this:

└─ $ ▶ aws s3 sync s3://emails-bucket .
download: s3://emails-bucket/file1 to ./file1
download: s3://emails-bucket/file2 to ./file2

The sync command is like a primitive rsync or a sophisticated cp; it copies everything new from the source to the destination.
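Publishing in the other direction (goals 2 and 3 above) should be just as simple with cp; a sketch, with illustrative file and key names:

```shell
# Push a zipped build and its config up to their respective buckets
aws s3 cp build/function.zip s3://code-bucket/function.zip
aws s3 cp config.json s3://config-bucket/config.json
```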

Reading configuration from S3 bucket in a node-based lambda script

If I want to separate code and configuration into different S3 buckets and access configuration dynamically from the config bucket, I can either copy the files locally or make them temporarily accessible via a presign URL. Alternatively, I could store config and code in the same S3 bucket. That is probably better. Ideally I’d like to keep canonical configuration outside of the code repository. Perhaps I could have a dedicated canonical S3 bucket for configuration, but merge that into the code during the build step.
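For reading the config bucket from inside the function itself, the aws-sdk module that Lambda’s Node runtime ships with can fetch an object directly. A minimal sketch, assuming a hypothetical config-bucket/config.json object:

```javascript
// Sketch: load JSON config from S3 at invocation time.
// Bucket and key names here are assumptions, not real resources.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = function (event, context, callback) {
  s3.getObject({ Bucket: 'config-bucket', Key: 'config.json' }, function (err, data) {
    if (err) return callback(err);
    // data.Body is a Buffer containing the raw object bytes
    const config = JSON.parse(data.Body.toString('utf8'));
    // ...use config to build and send the email...
    callback(null, config);
  });
};
```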

My main goal in keeping config and code separate is to avoid having sensitive config (e.g. friends’ email addresses) in my source repository, and to allow updates to configuration without needing to redeploy code. The easiest solution that gets me most of those things is to have a separate bucket for configuration, and a deploy script that fetches config from S3 and replaces the fake config included in the repo.
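A deploy script along those lines might look something like this; the paths, bucket names, and function name are placeholders, and aws lambda update-function-code is the CLI call that pushes a new zip:

```shell
#!/bin/sh
set -e

# Replace the fake config checked into the repo with the real one from S3
aws s3 cp s3://config-bucket/config.json ./config.json

# Bundle code + config and push the new version to Lambda
zip -r function.zip index.js config.json node_modules
aws lambda update-function-code \
  --function-name sermon-reminder \
  --zip-file fileb://function.zip
```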

Bummer. Lambda requires its code as a zip (uploaded directly or from S3), and it still seems to require an explicit upload every time the code changes. Ah well.

Migrating to AWS Lambda

For a number of years I’ve had my own hosted website, which I use to serve a WordPress site, host files and private git repos, and run cron jobs. My hosting provider (hostmonster) isn’t awesome, but is adequate: I can ssh into my box and do stuff, though I have to set up cron jobs through the web portal, which is painful. I just successfully migrated from WordPress to Jekyll / GitHub Pages, and I’ve moved my domain hosting to a different provider. The only remaining functionality I need to replicate is a regular reminder email containing the sermon passage for the upcoming week.

I’ve already proved that AWS SES can send the email more securely (encrypted) than my private hosting could. However, I need to run some small piece of code to generate the email and send it to a recipient list every morning. It looks like I can use AWS Lambda for this. Lambda natively supports Node.js, Java, and Python. Sadly, no love for Ruby (although people have made it work). Since I’m trying to up my JS chops, it’s probably a worthwhile exercise to rewrite my Ruby program in JS anyway.

One concern is whether my Lambda function will have access to a few static files (the list of upcoming passages, the list of recipients) in addition to the code. It appears you can give Lambda a zip file or an S3 location, so I should be able to bundle a few static files alongside the code. You can also trigger Lambda functions as scheduled cron-style jobs. It should have everything I need!

I’ll probably use nodemailer to send the email. It looks like it shouldn’t be a problem to pull in that npm package as well.
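A rough sketch of what that might look like (untested; the addresses and region are placeholders, and newer nodemailer versions accept an aws-sdk SES client directly as the transport):

```javascript
// Sketch: send the weekly reminder via SES through nodemailer.
// Sender, recipient, and region are placeholder assumptions.
const nodemailer = require('nodemailer');
const aws = require('aws-sdk');

const transporter = nodemailer.createTransport({
  SES: new aws.SES({ region: 'us-east-1' })
});

transporter.sendMail({
  from: 'reminders@example.com',
  to: 'friend@example.com',
  subject: 'Passage for this Sunday',
  text: 'This week we are reading ...'
}, function (err, info) {
  if (err) return console.error(err);
  console.log('Sent:', info.messageId);
});
```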