- Pramod Kumar
Microservices vs. monolith is a never-ending war among software engineers. My stand has always been to do what works for you. No single solution works for every problem, and both have pros and cons.
Nevertheless, we decided long ago to move away from our Java and Tornado monolith servers to AWS Lambdas. I would say the decision was worth it. We could build many niche tools that took our business to the next level, because we did not have to spend so much time maintaining and scaling servers ourselves.
Fast forward to today: we have a few systems built in-house that act like independent SaaS solutions. We want to capitalize on this opportunity and offer these in-house solutions as SaaS to other fintechs.
There is one particular service, consisting of at least 10 AWS Lambdas, that we would like to take out of our system and deploy in a completely isolated environment. It sounds simple, but it isn’t. The reasons are:
- Multiple mini teams building their services at their own pace
- Services and tools built for AWS Lambdas
- Taking out a few services from a mesh of ~100 services
- AWS dependent
Along with handling the above-mentioned challenges, we wanted the solution to be:
- Cloud provider agnostic
- Easy to maintain
- Most importantly, development should continue exactly as it does today
To start with, we had one solution in mind: Kubernetes! Ta-da! We immediately started exploring it. We started writing custom scripts that would take out the services and convert them into K8s services/pods and then… it did not end. Slowly we understood how complicated this solution is, yet we appreciate the problems K8s solves. It gives you a secure, production-grade setup.
Even after working on it for a couple of months, alongside the regular work of course (which product or business team understands engineering concerns? XD), there are still many nuances that we are figuring out.
We started exploring whether there was an easier way to deploy our SaaS, in parallel with the K8s solution. We needed a quick solution because there are other unknowns from the business point of view after deployment.
Taking a step back: had this been just a regular monolith server, it would have been easy to deploy on any platform of our choice. We knew the issues with monolith servers, as that is where we started. We were okay with this approach because it was only an experiment to help us figure out the non-tech aspects of the SaaS. On the other hand, this solution should not take long to implement; otherwise, K8s would be the better option. We wanted to give it a shot! We made the requirements clear:
- Make a monolithic Flask server out of 10 microservices
- Nothing should be altered in the services themselves, i.e. the teams developing these services should not even know about this setup
- Cloud agnostic
- Just to experiment with the SaaS offering, not to scale at all
- Should be deployed as a Docker image
- Should be done in not more than a week
Fortunately, all our services are built in one programming language, i.e. Python. To start with, all we had to do was copy all the services into one folder and have a server.py that starts the server and mounts all the endpoints from all the services. Wait, it is not that simple, by the way. There were many challenges throughout the process.
We needed a Flask server that accepts any path (/*) and lets us do what we want with it. That means there is only one route mounted on the Flask server. We built a custom routing layer that calls a particular service's route based on the URL (GET /service-a/user -> GET of /user from service-a).
As I already mentioned, our services are natively built for AWS Lambda. That means each service has a main function that takes an event and does the job. At a high level, the routing layer was just about constructing the event object in such a way that it invokes the desired function. It was pretty straightforward.
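A minimal sketch of such a routing layer might look like the following. It is illustrative only: the handler names, the HANDLERS table, and the exact event fields are assumptions (the event is loosely modeled on the API Gateway proxy format), and the Flask wiring is left out so the snippet runs with the standard library alone.

```python
from typing import Any, Callable, Dict, Optional

Event = Dict[str, Any]
Handler = Callable[[Event, Any], Dict[str, Any]]


# Hypothetical Lambda-style entry points. In a real build these would be the
# main functions imported from each copied service.
def service_a_handler(event: Event, context: Any) -> Dict[str, Any]:
    return {"statusCode": 200,
            "body": f"service-a {event['httpMethod']} {event['path']}"}


def service_b_handler(event: Event, context: Any) -> Dict[str, Any]:
    return {"statusCode": 200,
            "body": f"service-b {event['httpMethod']} {event['path']}"}


# URL prefix -> handler. The prefixes are assumptions for this example.
HANDLERS: Dict[str, Handler] = {
    "service-a": service_a_handler,
    "service-b": service_b_handler,
}


def build_event(method: str, path: str, query: Optional[dict] = None,
                headers: Optional[dict] = None,
                body: Optional[str] = None) -> Event:
    """Build a minimal proxy-style event for a Lambda handler."""
    return {
        "httpMethod": method,
        "path": path,
        "queryStringParameters": query or None,
        "headers": headers or {},
        "body": body,
    }


def dispatch(method: str, full_path: str, query: Optional[dict] = None,
             headers: Optional[dict] = None,
             body: Optional[str] = None) -> Dict[str, Any]:
    """Route /<service>/<rest> to that service's handler as /<rest>."""
    service, _, rest = full_path.strip("/").partition("/")
    handler = HANDLERS.get(service)
    if handler is None:
        return {"statusCode": 404, "body": f"unknown service: {service}"}
    return handler(build_event(method, "/" + rest, query, headers, body), None)
```

The single catch-all Flask route would then only need to pass the request method, path, query, headers, and body to dispatch and turn the returned dict into an HTTP response.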
When all the services are copied into a folder next to each other, we could not simply import the routes files because of the way the services are built. There were import issues. We wrote a custom script that goes through each Python file and fixes the imports to match the new file structure. Once all the imports across the services were fixed, we were able to import the main file (which mounts the routes) from each service and mount them on the monolithic Flask server. Voilà!
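The import fix can be sketched roughly as below. This is an assumption about how such a script could work, not the real one: it supposes each service becomes a package (service_a here is a made-up name) and that its own top-level modules are known up front. It only handles simple single-line imports.

```python
import re
from typing import Set


def rewrite_imports(source: str, package: str, local_modules: Set[str]) -> str:
    """Prefix imports of a service's own modules with its new package name.

    package and local_modules are hypothetical inputs: the folder name the
    service was copied into and the names of its top-level modules.
    """
    if not local_modules:
        return source
    names = "|".join(sorted(re.escape(m) for m in local_modules))

    # "from utils import x" / "from utils.db import x"
    #   -> "from service_a.utils import x" / "from service_a.utils.db import x"
    from_pat = re.compile(
        rf"^([ \t]*)from\s+({names})(\.[\w.]+)?\s+import\s+", re.MULTILINE)
    # "import utils" -> "from service_a import utils"
    # (the local name "utils" stays bound, so existing usages keep working)
    import_pat = re.compile(rf"^([ \t]*)import\s+({names})[ \t]*$", re.MULTILINE)

    source = from_pat.sub(
        lambda m: f"{m.group(1)}from {package}.{m.group(2)}{m.group(3) or ''} import ",
        source)
    source = import_pat.sub(
        lambda m: f"{m.group(1)}from {package} import {m.group(2)}",
        source)
    return source
```

A regex pass like this is enough for an experiment; a more careful version would parse each file with the ast module to cope with multi-line and aliased imports.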
As I already mentioned, all these services are developed by independent teams. Every team has the freedom to choose its dependencies. That means there can be conflicts between the services in terms of dependency versions. Once again, we wrote a script to read the requirements.txt from all the services and come up with a single set of requirements.
We took the liberty of assuming that even though the versions of the libraries we use (pandas, NumPy, requests, etc.) vary across the services, the code in the services would work with the latest one. Fortunately, that was true.
The script just reads the requirements.txt of every service and builds a common requirements.txt with the union of the packages, keeping the highest version when a package appears more than once.
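A rough sketch of that merge is shown below. It assumes every requirement is an exact pin of the form name==1.2.3 with purely numeric version parts; anything else (ranges, extras, URLs) is skipped. The function names are made up for the example.

```python
import re
from typing import Dict, Iterable, Tuple

# Matches exact pins such as "pandas==1.5.3". Other specifiers are ignored
# in this sketch.
PIN = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*==\s*([0-9]+(?:\.[0-9]+)*)\s*$")


def version_key(version: str) -> Tuple[int, ...]:
    """Turn "1.10.0" into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def merge_requirements(files: Iterable[str]) -> str:
    """Union the pinned packages, keeping the highest version of each."""
    best: Dict[str, Tuple[str, str]] = {}  # normalized name -> (name, version)
    for text in files:
        for line in text.splitlines():
            match = PIN.match(line.split("#", 1)[0])  # drop trailing comments
            if not match:
                continue
            name, version = match.group(1), match.group(2)
            key = name.lower().replace("_", "-")
            if key not in best or version_key(version) > version_key(best[key][1]):
                best[key] = (name, version)
    return "\n".join(f"{n}=={v}" for _, (n, v) in sorted(best.items())) + "\n"
```

Comparing versions as integer tuples avoids the string-comparison trap where "1.9.0" would sort above "1.10.0". For pre-releases and other real-world version strings, the packaging library's Version class would be the safer choice.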
#4 Inter-service communication
This is one of the important pieces that we were particular about. In the microservice architecture, these services are deployed as AWS Lambdas and communicate with each other over the public internet (with AWS API Gateway in between). That would make no sense once they become a monolith. The services should be able to communicate with each other internally, just like a function call.
On the other hand, we could not change the HTTP calls to something else, because that would mean changing the actual code of the services, which we did not want to do in the first place. So we had to do something that changes this behavior during the monolith build process.
To give a little background, we have a common interface called ApiClient, which internally uses Python’s requests library to make the inter-service HTTP calls. We took inspiration from testing mocks and wrote a custom InternalApiClient that is used in place of ApiClient. This way, nothing changes in the actual service; ApiClient is simply replaced with InternalApiClient at build time.
We had to play around with the way Python’s import cache works to mock it at a high level. The InternalApiClient just reads the URL and invokes the matching handler in the app. No HTTP call!
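One way this kind of swap can be done is by pre-seeding sys.modules, Python's import cache, before any service code is imported. The sketch below is a guess at the general shape, not the real implementation: the module name api_client, the get/request methods, the URL layout, and the handler table are all assumptions.

```python
import sys
import types
from urllib.parse import urlparse


# Hypothetical in-process Lambda-style handler for a "users" service.
def users_handler(event, context):
    return {"statusCode": 200,
            "body": {"method": event["httpMethod"], "path": event["path"]}}


HANDLERS = {"users": users_handler}


class InternalApiClient:
    """Stand-in for the HTTP-based ApiClient: same calls, no network."""

    def request(self, method, url, body=None):
        # The first path segment picks the service; the rest is the route.
        service, _, rest = urlparse(url).path.strip("/").partition("/")
        event = {"httpMethod": method, "path": "/" + rest, "body": body}
        return HANDLERS[service](event, None)

    def get(self, url):
        return self.request("GET", url)


def install(module_name="api_client"):
    """Register a fake module so "from api_client import ApiClient"
    resolves to InternalApiClient. Must run before the services import it."""
    module = types.ModuleType(module_name)
    module.ApiClient = InternalApiClient
    sys.modules[module_name] = module
```

After install() has run, service code that does "from api_client import ApiClient" gets the in-process client without any change to the service itself, which is the same trick test suites use when they patch a module.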
Again, the authentication and authorization part was common to all the services. We wanted to keep this solution independent of the authentication part. This too was possible with a custom layer applied only at build time.
This lets us make the app auth-less. To start with, we wanted to deploy this app without any login so that people could access it and play around. This was possible because the authentication part is customized without affecting the actual behavior of the services.
#6 AWS Stuff
We were using SQS almost everywhere. We certainly did not want to bring in an alternative like RabbitMQ or Kafka just to complicate things. We were clear that this was just an experiment, so we used Python's built-in threading to replicate the behavior of SQS. It worked fine for our purpose.
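As a rough illustration of the idea, an in-process queue with a background worker thread could stand in for a single SQS queue and its consumer. The class and method names below are invented for the example, and pairing threading with queue.Queue is an assumption about one reasonable way to do it. There is no persistence, retry, or visibility timeout, which is acceptable only because this is an experiment.

```python
import queue
import threading


class InProcessQueue:
    """Tiny stand-in for one SQS queue plus the consumer that drains it."""

    def __init__(self, handler):
        self._queue = queue.Queue()
        self._handler = handler
        # Daemon thread: it will not keep the server alive on shutdown.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def send_message(self, body):
        """Enqueue a message and return immediately, like an async publish."""
        self._queue.put(body)

    def _run(self):
        while True:
            body = self._queue.get()
            try:
                self._handler(body)
            except Exception as exc:
                # A failed message is dropped; real SQS would redeliver it.
                print(f"message failed: {exc!r}")
            finally:
                self._queue.task_done()

    def join(self):
        """Block until every queued message has been processed."""
        self._queue.join()
```

With a single worker thread, messages are handled in the order they were sent, and the producer never waits for the consumer to finish.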
In general, we serve the UI through AWS CloudFront, but we wanted a simple solution yet again. We served the UI (build files) as static files from the Flask server itself. It worked like a charm!
#8 Package & Deployment
The entire process described above was coded in Python itself. The script:
- Clones the repositories
- Starts copying files to a /temp folder
- Applies the mocks discussed above
- Some more magic
- Finally, builds a Docker image
After doing all this, S3 was still there, but we were okay with that as it is so common these days. After a few changes to the configs and environments, this was working fine locally as a Docker container.
As a final step, we wanted to deploy this Docker image on some cloud to test it remotely. We ended up trying AWS Lightsail. It was simple and easy to set up. All we had to do was push the Docker image to AWS ECR and run the deployment on Lightsail. After a bit of trial and error, it was up on a public, secured URL provided by Lightsail itself.
It was magical to see ~10 independently built microservices stitched together, without altering the actual services, communicating internally, with custom auth!
After using this for a week, our product team identified a few gaps we need to close to offer it as a SaaS platform, and we have already started working on them. We are still using this setup to check whether the features we are building for our own requirements are generic enough to be part of a SaaS. Separately, the K8s solution is being developed and will be used for actual serving.
I purposefully did not show any of our actual code, as I wanted this to be a high-level explanation. Feel free to reach me at [@pramodk73](https://twitter.com/pramodk73). Cheers!