Improving app performance with Amazon ElastiCache
Life has definitely gotten in the way recently, but I'm pleased to report that I passed the AWS Developer Associate exam on July 6th with a scaled score of 959 (out of 1000)! I was a little nervous about my level of preparation going into the exam, but those worries were unfounded.
Just before I was originally scheduled to take my exam, a new ACloudGuru #CloudChallenge was posted, and I was immediately excited to try it out and get some experience with a new-to-me service, ElastiCache. The premise is simple - deploy the provided application (in this case, Flask) and use it to query a PostgreSQL RDS database. Then, modify the application to query an ElastiCache Redis cluster first and see the increase in data retrieval speed. The database was set up with a 5-second delay to imitate a complex query.
Once I was finally ready to tackle this challenge, I spent the first couple of hours researching and fretting about trying to deploy some or most of the needed resources as IaC (infrastructure as code). I eventually came to the conclusion that the best course of action would be to keep things simple by deploying and modifying things manually, with the option to return to the challenge later and use CloudFormation or Terraform for resource provisioning. While I had originally hoped to use this challenge as rationale for learning Terraform (and I intend to do so later), I put IaC on hold in the name of making progress.
I started out by launching a t2.micro PostgreSQL instance on Amazon RDS (Relational Database Service), creating a database on the instance, and using the provided instructions to create a procedure on the database to be called by the application. I then launched an EC2 instance to host the application, installed some Python modules, and ran right into the first roadblock: how to install an nginx server on an Amazon Linux AMI. Because the nginx tutorial called for apt rather than yum, I replaced my instance with one launched from an Ubuntu AMI. I imagine there are instructions on how to install and configure an nginx server on the default Amazon Linux AMI, but I wanted to follow the 'official' instructions.
Once the Ubuntu instance was up and running, I was able to breeze through the nginx instructions, and after one brief hiccup, I had a proxy server up and running. I deployed the provided application code to the EC2 instance, configured the necessary database details in the appropriate .ini file, and had no issues connecting to the RDS database. Success! (Partially.)
Amazon ElastiCache was a new service to me; I understood the high-level concept of an in-memory cache, but had never played around with one. I beelined for the documentation, and followed the instructions there to create a subnet group and deploy a Redis cluster to my default VPC. This turned out to be very straightforward, and I made sure to create a special security group for the cluster that only allowed inbound traffic from the security group being used by my EC2 instance.
I was stymied for a few minutes at this point, trying to figure out how I would access the cluster, and then I realized that the Python redis module was the key. Connecting to the cluster was simple with the help of the module's documentation. I spent several minutes reading through the Flask app to understand what was going on in the code so I could decide on a plan of attack. Fortunately, I spent the past winter and early spring learning Python via an online bootcamp, so I didn't have much difficulty with the read-through or the Flask structure (this site is a Flask app, after all).
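For anyone who gets stuck at the same point, here is a rough sketch of what connecting with the redis module can look like. It is not the challenge's code: the [redis] section, the placeholder hostname, and the helper names load_redis_settings and connect are all assumptions made up for illustration. It assumes the cluster endpoint is kept in an .ini file, the same way the database details were.

```python
import configparser

# Hypothetical config contents; the hostname is a placeholder, not a real endpoint.
SAMPLE_INI = """
[redis]
host = redis.example.internal
port = 6379
"""


def load_redis_settings(ini_text, section="redis"):
    """Parse connection settings for the cache out of .ini-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {
        "host": parser.get(section, "host"),
        "port": parser.getint(section, "port"),
        # Ask redis-py to return str instead of bytes.
        "decode_responses": True,
    }


def connect(settings):
    """Create a redis-py client from the parsed settings."""
    # Imported here so the parsing helper works without the package installed.
    import redis  # third-party: pip install redis

    return redis.Redis(**settings)
```

Calling connect(load_redis_settings(SAMPLE_INI)) only returns a working client when run from somewhere the cluster's security group allows, such as the EC2 instance hosting the app.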
I decided to tackle the cache query inside the view function decorated with @app.route("/"). After testing the cache response in the console, I refactored the code to only query the database if the desired data was not stored in the cache. After fixing a couple of small bugs, here were my results:
Page load elapsed time, database query: 5.02+ seconds
Page load elapsed time, ElastiCache query: 0.002 to 0.02 seconds
Challenge complete! Python code on GitHub
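If you'd rather not click through, the cache-first idea can be sketched roughly like this. This is a simplified illustration, not the code from the repo: FakeCache is an in-memory stand-in so the example runs without a cluster, slow_query is a hypothetical substitute for the delayed database procedure (with a much shorter delay), and fetch_rows and timed are names invented for the sketch. A redis-py client offers get and setex methods with the same call shape, so it could be passed in place of FakeCache.

```python
import json
import time


class FakeCache:
    """In-memory stand-in exposing the two redis-py methods used below."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def setex(self, key, ttl_seconds, value):
        # A real Redis server would expire the key after ttl_seconds;
        # this stand-in ignores the TTL.
        self._store[key] = value


def slow_query(delay=0.05):
    """Hypothetical stand-in for the deliberately slow database call."""
    time.sleep(delay)
    return [{"id": 1, "name": "example"}]


def fetch_rows(cache, key="rows", ttl_seconds=60, query=slow_query):
    """Return (rows, source), checking the cache before the database."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached), "cache"
    rows = query()
    # Store the result as JSON so the next request can skip the database.
    cache.setex(key, ttl_seconds, json.dumps(rows))
    return rows, "database"


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

Calling timed(fetch_rows, cache) twice with the same cache object shows the pattern: the first call pays for the slow query, and the second is served from the cache.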
I resisted the temptation to look at other challenge participants' code and blog posts, in order to force myself to come up with my own solution. I did take a peek at several other solutions after I had successfully deployed my modified app, and it was interesting to see that many other people modified one or more functions outside of the Flask route decorators. In retrospect, this does make sense, and I'd guess many people would argue it is better coding practice to refactor the code that way, instead of using my method. I still consider myself fairly novice at coding, so I appreciate being able to read other people's code to analyze different solutions and think about why those choices were made.
I also came away with some good hands-on experience, particularly with EC2 and ElastiCache. After a few other high-priority projects (like job applications!), I do want to revisit resource deployment at the very least, and perhaps also see if I can refactor my Python code differently to get the same result.
Onward and upward!