I've been working on a web shop integration project on and off for the last six months. The use case is that there are several separate webshop instances (slaves) and one master shop. All product information should be integrated from the master shop to the slaves at regular intervals. Some business rules are also applied to the products during integration.
The integration is set to run at a certain time of day. It is a relatively long process because it goes through all the products in all shops. That's fine though, since there are no requirements on how quickly the integration should finish. It should rather be a resource-constrained process so that it does not affect the people using the web shops.
Originally I thought this would be a perfect use case for a serverless application running on AWS Lambda, for example. I also tried out OpenWhisk from IBM, which can be run as a self-hosted serverless platform. While it would have been interesting to try out those technologies, I ended up using my DigitalOcean virtual server purely to save costs.
I wrote the app in Scala. I wanted to get to know the language better and also wanted to see how Akka works in the language it is built in (if all you have is a hammer...). The application is really simple: you trigger the integration task, and it connects to the webshops to fetch all their product information, caches that in RocksDB, runs the transformation rules and then updates the slave webshops.
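To make the flow a bit more concrete, here is a minimal sketch of the fetch, cache, transform and update steps. It is not the actual application code: the `Product` fields, the `applyRules` rule and the `integrate` function are hypothetical, the shops are plain in-memory collections instead of REST APIs, and a mutable map stands in for RocksDB.

```scala
import scala.collection.mutable

object IntegrationSketch {
  // Hypothetical product shape; the real shops expose far more fields.
  final case class Product(sku: String, name: String, price: BigDecimal)

  // Hypothetical business rule: tidy up the product name.
  def applyRules(p: Product): Product = p.copy(name = p.name.trim)

  // Takes the master products and the current slave catalogue,
  // returns the slave catalogue after integration.
  def integrate(master: Seq[Product], slave: Map[String, Product]): Map[String, Product] = {
    // Stand-in for the RocksDB cache (assumption: keyed by SKU).
    val cache = mutable.Map.empty[String, Product]
    master.foreach(p => cache.put(p.sku, p))

    // Run the rules on everything in the cache.
    val transformed = cache.values.map(applyRules)

    // Insert or overwrite each transformed product in the slave catalogue.
    transformed.foldLeft(slave)((shop, p) => shop.updated(p.sku, p))
  }
}
```

Calling `IntegrationSketch.integrate(masterProducts, Map.empty)` would give a slave catalogue containing every master product after the rules have been applied.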
The application has a configurable number of workers which do the operations (fetch product data, store it...). The workers are essentially actors which do a lot of blocking operations. This is usually a problem, especially if Akka is running on its default thread pool. The operations that go through the shops' REST APIs are not exactly fast, so the blocked workers can quickly exhaust the available Akka threads.
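The effect of blocking calls on a bounded pool can be shown without Akka at all. The sketch below uses plain Scala `Future`s on a fixed-size thread pool, not actors, and `fetchProduct` is a hypothetical stand-in for a slow REST call. With `poolSize` threads, at most `poolSize` fetches run at once and the rest wait in the queue, which is the same thing that happens when blocking actors occupy every dispatcher thread.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingPoolSketch {
  // Hypothetical stand-in for a slow, blocking REST call to a shop.
  def fetchProduct(id: Int): String = {
    Thread.sleep(50)
    s"product-$id"
  }

  // Runs all fetches on a pool with a fixed number of threads.
  def fetchAll(ids: Seq[Int], poolSize: Int): Seq[String] = {
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Each Future blocks one pool thread for the duration of the call.
      val futures = ids.map(id => Future(fetchProduct(id)))
      // Future.sequence keeps the results in the order of the input ids.
      Await.result(Future.sequence(futures), 30.seconds)
    } finally pool.shutdown()
  }
}
```

For example, `BlockingPoolSketch.fetchAll(1 to 8, poolSize = 2)` fetches eight products but never has more than two calls in flight.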
The Akka default pool can be configured to have a fixed number of threads on startup. The number of workers is derived from the maximum number of threads available. Each shop interface gets a set of workers, sized roughly as the number of Akka threads divided by the number of shops, minus a constant that leaves some processing time for the aggregate actors. This ends up working pretty well: it limits the load on the webshops and also keeps the integration application itself from hogging all the resources of my 1-core virtual machine.
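As a rough illustration of that calculation, here is a small sketch. The function name, the parameter names and the floor of one worker per shop are my assumptions for the example, and it reads the rule as "threads divided by shops, then minus the constant".

```scala
object WorkerAllocation {
  // Hypothetical helper: workers per shop = threads / shops - reserved.
  // The floor of 1 is an assumption so that every shop gets at least one worker.
  def workersPerShop(totalThreads: Int, shopCount: Int, reserved: Int): Int = {
    require(totalThreads > 0, "totalThreads must be positive")
    require(shopCount > 0, "shopCount must be positive")
    require(reserved >= 0, "reserved must not be negative")
    // Integer division rounds down, so the result never overshoots the pool.
    math.max(1, totalThreads / shopCount - reserved)
  }
}
```

With 16 threads, 3 shops and 1 thread's worth reserved for the aggregate actors, `WorkerAllocation.workersPerShop(16, 3, 1)` gives 4 workers per shop.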