
AWS – CloudFront

CloudFront is a Content Delivery Network (CDN) that delivers content to Edge Locations around the world.

Origin: the location the content originates from. It can be one of several places:

  • EC2
  • ELB, with EC2 instances behind it
  • S3 bucket
  • Route 53 - e.g. if the actual source is a Microsoft Azure VM.

 

 

Edge Locations effectively store cached content.

Distribution: the name for a given CloudFront resource. You can create two types of distributions:

  • Web distribution - most commonly used, e.g. cdn for static website
  • RTMP - Used for media streaming.

 

You can restrict access to your distributed content using one of the following CloudFront features:

  • signed urls
  • signed cookies
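As a rough sketch, a signed URL can be generated in Python with botocore's CloudFrontSigner. The distribution domain, key pair ID and private key file below are made-up placeholders:

```python
# Sketch: generating a CloudFront signed URL with botocore's CloudFrontSigner.
# The domain, key pair ID and key file are placeholder values.
from datetime import datetime, timedelta

import rsa  # third-party 'rsa' package
from botocore.signers import CloudFrontSigner


def rsa_signer(message):
    # Sign the policy with the private key of a CloudFront key pair (placeholder file).
    with open('private_key.pem', 'rb') as f:
        private_key = rsa.PrivateKey.load_pkcs1(f.read())
    return rsa.sign(message, private_key, 'SHA-1')


signer = CloudFrontSigner('KEYPAIRID123', rsa_signer)  # placeholder key pair ID

# URL is valid for one hour from now.
signed_url = signer.generate_presigned_url(
    'https://d111111abcdef8.cloudfront.net/private/report.pdf',
    date_less_than=datetime.utcnow() + timedelta(hours=1),
)
print(signed_url)
```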

Objects in an Edge Location are cached for the life of the TTL (Time To Live). You can remove them sooner by invalidating them, but you will be charged for this.

Edge Locations are normally the first responders to requests. If an Edge Location doesn't have the required content cached, it forwards the request to the origin and receives the requested content back. The Edge Location then caches the content (for future requests) and forwards it on to the requester.

Advantages:

  • faster response times - hence a better experience for the data requesters
  • less strain on the content origins

Edge Locations are not limited to reads. You can also write to them, and the Edge Location will forward the write on to the origin.

 

Edge Locations will use the cached version until either:

  • We rename the source content's filename and reference the new filename in our code. E.g. if we update the CSS file, we rename it and then update our HTML code to reference the new name. This works as long as our HTML code isn't cached but our CSS files are. A similar approach works for jpegs, pngs, etc.
  • We tell the Edge Location to stop using the cached version - this is done by invalidating the file. CloudFront lets you invalidate up to 1,000 paths per month for free. Then it starts costing. If you need to go far over this limit, it can be more cost effective to create a new CloudFront distro containing the refreshed content, delete the old one, and then update your Route 53 entry (discussed below) to point to the new distro.
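As a rough sketch, an invalidation can also be issued programmatically with boto3; the distribution ID and path below are placeholders:

```python
# Sketch: invalidating a cached object so Edge Locations fetch a fresh copy.
# The distribution ID and path are placeholder values.
import time

import boto3

cloudfront = boto3.client('cloudfront')

response = cloudfront.create_invalidation(
    DistributionId='E1ABCDEF234567',        # placeholder distribution ID
    InvalidationBatch={
        'Paths': {
            'Quantity': 1,
            'Items': ['/css/site.css'],     # each path counts towards the monthly free quota
        },
        'CallerReference': str(time.time()),  # any unique string, makes the request idempotent
    },
)
print(response['Invalidation']['Status'])   # e.g. 'InProgress'
```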

 

We can attach expiration dates to cached content, or disable caching for particular content.
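One way to do this (assuming an S3 origin) is to set Cache-Control headers on the origin objects themselves, which CloudFront respects when deciding how long to keep a cached copy. A sketch with placeholder bucket and key names:

```python
# Sketch: controlling edge caching via Cache-Control headers on S3 origin objects.
# Bucket name, keys and local files are placeholder values.
import boto3

s3 = boto3.client('s3')

# Cache this stylesheet at the edge (and in browsers) for one day.
s3.put_object(
    Bucket='example-bucket',
    Key='css/site.css',
    Body=open('site.css', 'rb'),
    ContentType='text/css',
    CacheControl='max-age=86400',
)

# Effectively disable caching for frequently changing content.
s3.put_object(
    Bucket='example-bucket',
    Key='data/live.json',
    Body=open('live.json', 'rb'),
    ContentType='application/json',
    CacheControl='no-cache',
)
```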

 

With CloudFront you can also create private (long) URLs that have expiration dates. This is great for sharing content privately by sharing the URLs, a bit like how GitHub's gist system works.

 

When using CloudFront, you get a unique long CloudFront URL for each distribution. If you want, you can create a CNAME record set in Route 53 to route traffic to CloudFront under a friendlier name. This might be handy:

  • if you want to take your content offline temporarily to do maintenance work.
  • if you have written apps that request content from CloudFront. In that case you don't want to hard-code the long CloudFront URL; instead you want a user-friendly URL which can easily be repointed elsewhere as and when needed.
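A rough sketch of creating such a CNAME record with boto3 (the hosted zone ID, record name and CloudFront domain are placeholders):

```python
# Sketch: pointing a friendly CNAME at the long CloudFront domain name.
# Hosted zone ID, record name and CloudFront domain are placeholder values.
import boto3

route53 = boto3.client('route53')

route53.change_resource_record_sets(
    HostedZoneId='Z1EXAMPLE',  # placeholder hosted zone ID
    ChangeBatch={
        'Comment': 'Point cdn.example.com at the CloudFront distribution',
        'Changes': [{
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': 'cdn.example.com',
                'Type': 'CNAME',
                'TTL': 300,
                'ResourceRecords': [
                    {'Value': 'd111111abcdef8.cloudfront.net'},
                ],
            },
        }],
    },
)
```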

 

CloudFront AWS console walkthrough

To use CloudFront, we need to create a CloudFront resource, which is referred to as a "distribution":

Next we have two options; in most cases the web option is the way to go:

 

Then we have:

The first field, the origin domain name, is actually a drop-down list where you can choose available ELBs or S3 buckets:

 

 

Further down we have:

Here you can specify how far you want your CDN to reach, e.g. US only, US + Europe, US + Europe + Asia, or all Edge Locations, which is the most expensive option.

 

You also have to specify the alternate domain name (the Route 53 alias) that will represent your S3 bucket or ELB.
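The same thing can be done programmatically. Below is a minimal, hedged sketch of creating a web distribution with an S3 origin via boto3; the bucket name, comment, price class and cache settings are placeholder assumptions:

```python
# Sketch: creating a web distribution with an S3 origin.
# Bucket name, comment, price class and TTL settings are placeholder values.
import time

import boto3

cloudfront = boto3.client('cloudfront')

response = cloudfront.create_distribution(
    DistributionConfig={
        'CallerReference': str(time.time()),   # unique string per creation request
        'Comment': 'CDN for static website',
        'Enabled': True,
        'PriceClass': 'PriceClass_100',        # cheapest price class (fewest edge regions)
        'Origins': {
            'Quantity': 1,
            'Items': [{
                'Id': 'my-s3-origin',
                'DomainName': 'example-bucket.s3.amazonaws.com',
                'S3OriginConfig': {'OriginAccessIdentity': ''},
            }],
        },
        'DefaultCacheBehavior': {
            'TargetOriginId': 'my-s3-origin',
            'ViewerProtocolPolicy': 'redirect-to-https',
            'ForwardedValues': {
                'QueryString': False,
                'Cookies': {'Forward': 'none'},
            },
            'TrustedSigners': {'Enabled': False, 'Quantity': 0},
            'MinTTL': 0,
        },
    },
)
print(response['Distribution']['DomainName'])  # the long *.cloudfront.net URL
```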

 

After you have created your distribution, you can then monitor various statistics for your distro:

  • cache statistics, e.g.:
    • number of objects that are currently cached
    • hits metric: number of requests served from the Edge Location's cache
    • misses metric: number of objects being delivered from source (and then subsequently cached for next time)
    • errors metric: number of 404/500 error messages being delivered
    • geographic origin of requests, a bit like google analytics
  • You can set alarms to be triggered on certain conditions, e.g. if too many objects are being delivered, which makes CloudFront expensive (see the sketch after this list).
  • popular objects in ranked order
  • usage info: e.g.
    • number of http requests
    • number of https requests
    • data transfers in terms of MB, GB,..etc.
  • statistics about requesting devices, e.g.
    • what os, mac, windows
    • what browser, chrome, firefox
    • what device, desktop, tablet, phone
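For the alarms mentioned above, here's a hedged sketch of creating one via CloudWatch (CloudFront metrics are reported in us-east-1 under the AWS/CloudFront namespace); the distribution ID, threshold and SNS topic are placeholders:

```python
# Sketch: alarm when a distribution serves too many requests in a 5 minute window.
# Distribution ID, threshold and SNS topic ARN are placeholder values.
import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

cloudwatch.put_metric_alarm(
    AlarmName='cloudfront-too-many-requests',
    Namespace='AWS/CloudFront',
    MetricName='Requests',
    Dimensions=[
        {'Name': 'DistributionId', 'Value': 'E1ABCDEF234567'},
        {'Name': 'Region', 'Value': 'Global'},
    ],
    Statistic='Sum',
    Period=300,                     # 5 minute windows
    EvaluationPeriods=1,
    Threshold=100000,               # alarm if more than 100k requests in a window
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:billing-alerts'],
)
```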

 

 

When adding an alias in Route 53 for your CloudFront distro, you need to name it with the same value you gave in the 'Alternate Domain Names' field when creating the distro. In the alias target section, you can select your CloudFront domain name from the drop-down list, so that it all matches up.
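A rough sketch of the equivalent alias record via boto3; the hosted zone ID and record name are placeholders, and Z2FDTNDATAQYW2 is the fixed hosted zone ID that CloudFront alias targets use:

```python
# Sketch: an alias A record whose name matches the distribution's 'Alternate Domain Names'.
# Your hosted zone ID and record name are placeholder values.
import boto3

route53 = boto3.client('route53')

route53.change_resource_record_sets(
    HostedZoneId='Z1EXAMPLE',                  # placeholder: your own hosted zone
    ChangeBatch={
        'Changes': [{
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': 'cdn.example.com',     # must match the Alternate Domain Name
                'Type': 'A',
                'AliasTarget': {
                    'HostedZoneId': 'Z2FDTNDATAQYW2',   # CloudFront's fixed alias zone ID
                    'DNSName': 'd111111abcdef8.cloudfront.net',
                    'EvaluateTargetHealth': False,
                },
            },
        }],
    },
)
```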

If the origin you are using is S3, then you also need to make sure the S3 permissions for shared files are changed to 'public'. Otherwise you will get a permission denied message. Alternatively you can set CloudFront permissions to override S3 bucket permissions, as described further down.
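A minimal sketch of marking a shared object public with boto3 (the bucket and key are placeholders, and the bucket must allow public ACLs):

```python
# Sketch: making an individual shared object publicly readable.
# Bucket and key are placeholder values; the bucket must not block public ACLs.
import boto3

s3 = boto3.client('s3')

s3.put_object_acl(
    Bucket='example-bucket',
    Key='images/logo.png',
    ACL='public-read',
)
```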

 

 

Customizing your cloudfront distribution

Once you have created your CloudFront resource, you can then edit it. This includes:

  • you can add additional origins to your CloudFront distro. You can then specify which origins to use under the "Behaviours" tab.
  • you can set permissions so that CloudFront permissions override S3 bucket permissions, for read access. To do this you enable your distro's "Restrict Bucket Access" setting and specify (or let CloudFront create) an origin access identity. CloudFront then retrieves the files from S3 via that identity, rather than over public HTTP.
  • under the behaviours tab, you can specify wildcard expressions (path patterns for the requesting URL), e.g. *.png or images/*, and map these expressions to an origin (selectable via a drop-down list). You can also specify what protocols to deliver content over, e.g. HTTP or HTTPS, as well as caching retention rules.
  • you can write your own custom 404/500 error pages that your distro can serve up on a failed request.
  • you can set geo based restrictions, e.g. make your content inaccessible to people in a certain country - see the sketch below.
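For the geo restriction, a hedged sketch: the restriction block lives inside the distribution config, and updating it via the API means fetching the current config and its ETag first. The distribution ID and country codes are placeholders:

```python
# Sketch: blocking viewers from particular countries on an existing distribution.
# Distribution ID and country list are placeholder values.
import boto3

cloudfront = boto3.client('cloudfront')

dist_id = 'E1ABCDEF234567'

# Fetch the current config and its ETag (required for the update call).
current = cloudfront.get_distribution_config(Id=dist_id)
config = current['DistributionConfig']

# Deny requests coming from the listed countries (example ISO 3166 codes).
config['Restrictions'] = {
    'GeoRestriction': {
        'RestrictionType': 'blacklist',
        'Quantity': 2,
        'Items': ['AU', 'NZ'],
    },
}

cloudfront.update_distribution(
    Id=dist_id,
    IfMatch=current['ETag'],
    DistributionConfig=config,
)
```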