Posts in Category: aws

AWS – Amazon ElastiCache

Databases often receive the same query over and over, and each time they return the same output. This repetition can slow down the db’s performance.

To overcome this, you can use a caching service. This service intercepts the repetitive queries and sends back the cached output. The database keeps the caching service up to date with the latest data.
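The idea above can be sketched in a few lines of plain Python. This is a toy stand-in, not the ElastiCache API: the `cache` dict plays the role of Redis/Memcached, and `slow_query` simulates a database round trip (both names are made up for illustration).

```python
import time

cache = {}  # stand-in for the caching service (e.g. Redis/Memcached)

def slow_query(sql):
    """Stand-in for a real database round trip."""
    time.sleep(0.01)  # simulate query latency
    return f"result of {sql}"

def cached_query(sql):
    # Repeated queries are intercepted and served from the cache,
    # so the database only does the work once per distinct query.
    if sql not in cache:
        cache[sql] = slow_query(sql)
    return cache[sql]

print(cached_query("SELECT * FROM users"))  # first call hits the "database"
print(cached_query("SELECT * FROM users"))  # second call is served from cache
```

In a real setup the application would also invalidate or refresh cache entries when the underlying data changes, which is the part ElastiCache leaves to you.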


There are 2 main caching engines: Redis and Memcached.

Amazon ElastiCache is a fully managed in-memory cache engine. Once again, you can’t access the underlying OS that the cache engine runs on. You can set up Amazon ElastiCache to run using either of the above 2 caching engines only.

Note: your application needs to be written in a way that supports Redis/Memcached.

AWS – Amazon DynamoDB

“DynamoDB” is a “NoSQL” database rather than a relational db.

A relational db stores data about an object across a number of tables to avoid duplicating data. In NoSQL, by contrast, each object is stored in its own document. This means NoSQL performs faster than a relational db, since it doesn’t need to construct a “joined table”. A SQL db’s performance gets worse the more data it houses, whereas NoSQL can scale more easily. However, there is a lot of data duplication in NoSQL, so a NoSQL db requires more disk space than a SQL db to store the same information.

Note: MongoDB is an example of a NoSQL database.
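The contrast above can be sketched with plain Python structures. All table and field names here are invented for illustration; this is a rough picture of the two layouts, not any particular database's API.

```python
# Relational style: data about one order is split across "tables" and
# joined back together at query time.
customers = {1: {"name": "Alice"}}
orders = [{"id": 100, "customer_id": 1, "item": "book"}]

def order_with_customer(order_id):
    order = next(o for o in orders if o["id"] == order_id)
    return {**order, "customer": customers[order["customer_id"]]}  # the "join"

# NoSQL / document style: the whole object lives in one document,
# duplicating the customer data but needing no join to read it back.
order_documents = {
    100: {"id": 100, "item": "book", "customer": {"name": "Alice"}},
}

print(order_with_customer(100)["customer"])  # built via a join
print(order_documents[100]["customer"])      # read straight from the document
```

The document version reads in one lookup, at the cost of storing Alice's details again in every order she places.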


“Amazon DynamoDB” is a NoSQL-based fully managed db service. As mentioned earlier, fully managed covers:

  • auto-scaling up and down
  • underlying OS maintenance done by AWS

AWS – Amazon Elastic MapReduce (EMR)

Companies often want to analyse huge amounts of data to obtain statistics/metrics, identify trends, or find meaningful insights. These can then be used to influence business decisions.

All this data crunching will require a lot of computing resources. One way to do this is to use software like Hadoop to manage it for you. Hadoop will distribute the data-crunching task evenly across a cluster of servers. However, setting up a Hadoop cluster can be difficult, time-consuming and expensive. That’s because you need to:

  • buy the servers to make up your cluster
  • install your OS on them
  • install, configure, and tune Hadoop in your cluster

You need to do all this before you can start doing any data crunching.

However, with Amazon Elastic MapReduce (EMR), you get a fully managed Hadoop service out of the box.

AWS – Amazon Simple Workflow Service (SWF)

In companies, business processes are often documented in the form of swim-lane diagrams.

Some lanes require human interaction (e.g. clicking the “approve” button), whereas other lanes are automated.

For example, you go on a business trip to another city and want to claim your hotel/travel expenses. You submit the claim online. It then goes to your manager for approval, then to HR for approval as well, and finally an automated system credits your bank account with the refund. All this can span a couple of weeks.

SWF is designed to automate this as far as possible. SWF is a long-running workflow-processing solution.
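The expense-claim example above can be sketched as an ordered list of steps. This is only a toy model: in SWF the human steps would be real “activity tasks” waiting on a person, whereas here every step is just a function with made-up names and a canned decision.

```python
def manager_approves(claim):
    # Human task: approve anything under a (hypothetical) 1000 limit.
    return claim["amount"] < 1000

def hr_approves(claim):
    # Human task: HR rubber-stamps in this toy version.
    return True

def credit_account(claim):
    # Automated task: issue the refund.
    claim["status"] = "refunded"
    return True

def run_workflow(claim):
    # Run each step in order; any rejection stops the workflow.
    for step in (manager_approves, hr_approves, credit_account):
        if not step(claim):
            claim["status"] = "rejected"
            return claim
    return claim

claim = {"amount": 450, "status": "submitted"}
print(run_workflow(claim)["status"])  # → refunded
```

What SWF adds on top of a loop like this is durability: the workflow state survives for weeks while it waits on the human steps.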

SWF has built-in auto-scaling up/down (i.e. it auto-creates/destroys EC2 instances as and when necessary).

SWF can be configured to use onsite servers rather than EC2 instances.

SWF guarantees execution.

AWS – Amazon Simple Queue Service (SQS)

If you have incoming work requests, you can queue them up when the existing EC2 instances are pre-occupied, rather than auto-scaling up your EC2 resources.

This service works by creating a new message (in the form of a JSON file) for each incoming work request. This JSON file gets sent to the SQS service. We then configure all our EC2 instances to poll SQS when they are idle, and process the next JSON file (if any) in the queue.
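The polling pattern above can be sketched locally with Python's standard-library queue standing in for SQS. The message fields (`video_id`) and function names are invented for illustration; a real worker would call the SQS API instead.

```python
import json
import queue

sqs = queue.Queue()  # local stand-in for an SQS queue

def publish(work):
    # Each incoming work request becomes a JSON message on the queue.
    sqs.put(json.dumps(work))

def poll_and_process():
    # An idle worker polls for the next message, if any.
    try:
        message = sqs.get_nowait()
    except queue.Empty:
        return None
    work = json.loads(message)
    return f"processed video {work['video_id']}"

publish({"video_id": "abc123"})
print(poll_and_process())  # → processed video abc123
print(poll_and_process())  # → None (queue is empty)
```

Note that real SQS workers typically use long polling so they don't busy-wait on an empty queue.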

E.g. when a user uploads a video to YouTube, that video then needs to be transcoded. The YouTube website could upload the video to S3 and then “publish a message” to a “topic” in SNS. SNS in turn sends “push” notifications to all subscribers.

AWS – Elastic Load Balancer (ELB) overview

ELB is used for distributing traffic across a group of servers. The group shares the load so that no single server gets overwhelmed.

It commonly uses the round-robin approach to distribute work amongst the servers. However, this approach can be adjusted by enabling “sticky sessions”. You can add EC2 instances to an ELB. An ELB receives traffic via Route 53 domain aliases. An ELB can actually distribute traffic across EC2 instances that are in different availability zones, which maximises redundancy. Hence if an instance (or AZ) goes down for whatever reason, the ELB simply stops sending requests to that instance (or AZ).
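Round robin itself is very simple: requests just rotate through the instances in turn. A minimal sketch (instance names are placeholders, and this ignores the health checks a real ELB performs):

```python
import itertools

instances = ["i-aaa", "i-bbb", "i-ccc"]
rotation = itertools.cycle(instances)  # endless round-robin rotation

def route(request):
    # Each incoming request goes to the next instance in the rotation.
    return next(rotation)

print([route(r) for r in range(5)])
# → ['i-aaa', 'i-bbb', 'i-ccc', 'i-aaa', 'i-bbb']
```

Sticky sessions change this by pinning a given client to the same instance instead of rotating.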

ELB combined with autoscaling gives the following benefits:

  • fault tolerance (i.e. prevents an instance from burning out)
  • scalability
  • elasticity
  • high availability (if one AZ goes down then the remaining AZs jump in)

AWS – Amazon Elastic Cloud Compute (EC2) overview

EC2 is arguably the most important service available on the AWS platform. EC2 is used for creating VMs on the AWS platform; these VMs are referred to as EC2 “instances“. There are three pricing models available under EC2:



  • On-Demand Instances: This lets you pay for computing capacity by the hour with no long-term commitments.
  • Reserved Instances: This reserves an instance’s resources exclusively for your use. You get charged even if you don’t use them, but it guarantees that this resource is available in the given AZ. You can reserve an instance for a term of one year up to 3 years. You can also sell the reservation if you don’t need it anymore. You also have an option to ensure this VM runs on the same physical machine throughout.

AWS – About this course

AWS CSA (associate level) – Amazon Web Services Certified Solutions Architect

You can book this exam at:

After getting the AWS CSA (associate level) certification, you can then move on to the AWS CSA (professional level) certification.

Core Architecture Best Practices

You want to design your product along with the infrastructure they sit on, so that they are:

  • Scalable – i.e. your app isn’t designed to run on only one VM; it can run on a cluster of VMs but still appear as a single service.
  • Elastic – i.e. your system recognises when it is under heavy load and spins up accordingly. Similarly, it reduces in size when needed.
  • Fault Tolerant – e.g. if a VM fails then the ELB identifies this and stops sending jobs to the failing VM, redistributing them to the remaining VMs instead.
  • Self Healing – e.g. if an ec2

AWS – The CSA exam (Associate level)

Syllabus –