Tuesday, October 24, 2017

Communicate or die

From web services to micro-services

All the world talks about the "Cloud". Along with this came the buzzword "micro-services". Now the emphasis is on Machine Learning and Artificial Intelligence. We will leave those for later. I will not go into an argument about whether micro-services are the new SOA, SOA done right, or a special case of component-based architecture. Those topics have been done to death by everyone and their dog. Don't even get me started on the "Simple Object Access Protocol". In this post I would like to think aloud about a specific communication pattern between services.

Make no mistake about it, there are no silver bullets in the world. My brother is a civil engineer. He designs and builds things that I can only marvel at. I was fortunate in being able to work through his textbooks on hydrodynamics, strength of materials, bridge design and other esoteric concepts. One point that all of those design-related works emphasized was that what you gain on the straight-aways you lose on the roundabouts. In the technical architecture world this is much more evident.

The idea behind micro-services is to have a modular service that may be looked up, is fault-tolerant and can be set up in a cluster. Services are not without side-effects. Side-effects occurring in one service may need to be communicated to another service (or a group of services). How can these services know about each other? If one service stores information about another service, then the services are no longer independent. We may have yet another service that stores information about the services. So when any service needs to communicate some information to another service, all it has to do is consult the 'locator' service. But then this 'locator' service becomes a bottleneck and a single point of failure. Luckily there are industry-standard 'locator' services that you may weave into your product.
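To make the 'locator' idea concrete, here is a minimal in-memory sketch. The class name `Locator`, its methods and the addresses are all made up for illustration; a real locator would be a separate, replicated service rather than an object living in one process.

```python
from collections import defaultdict


class Locator:
    """Hypothetical in-memory 'locator': maps a service name to its known addresses."""

    def __init__(self):
        self._addresses = defaultdict(set)

    def register(self, service, address):
        # A service instance announces where it can be reached.
        self._addresses[service].add(address)

    def deregister(self, service, address):
        # Called when an instance shuts down or is found to be dead.
        self._addresses[service].discard(address)

    def lookup(self, service):
        # Callers only need the service name, never a hard-coded address.
        # Sorted so that callers see a stable order.
        return sorted(self._addresses.get(service, ()))


locator = Locator()
locator.register("billing", "10.0.0.5:8080")
locator.register("billing", "10.0.0.6:8080")
billing_addresses = locator.lookup("billing")
```

The caller knows only the name "billing"; everything else is the locator's problem, which is exactly why the locator itself must not be allowed to fall over.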

Alright, 'where' is solved

'Locator' services help with looking up services. But a service might expose multiple endpoints, and these endpoints can change as and when needed. This means that either the 'locator' service or the calling service needs to carry this information. In either case, whoever holds the information has to be 'flushed' whenever an endpoint changes. This also amounts to a somewhat unacceptable level of inter-service knowledge. Think also about the case where a state change in one service needs to be communicated to more than one service, without the source service having to know which services need a given piece of information at a given point in time.

Eventing subsystems to the rescue

Almost all programming languages have the concept of events and event listeners. They allow the source of state changes to supply events that are then handled, in their own way, by interested listeners. Listeners have the responsibility of registering with the source of the events that they would like to be informed about. This model can be extended to the world of micro-services. If we set up a registration-based system for services, listener services can sign up to receive events. The service that accepts the registration should also accept information about the listener endpoint. So now we have the 'locator' service supplying the location of the target service and an 'event transmitter' service that holds the target service's endpoint information. This 'event transmitter' service can be made as simple or as complex as your service habitat demands. To give some examples, it can support complex filtering rules that determine whether an event needs to be supplied to a registered target service, chunk events whose size exceeds a certain limit, and securely sign events. It can also be an 'active' or 'passive' (or both) component. Operating in the 'active' mode, the 'event transmitter' would actively try to deliver incoming events to targets. In 'passive' mode the service would wait till a registered service asks for events.
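Here is a rough sketch of such an 'event transmitter', supporting registration, a filtering rule, and both the 'active' and 'passive' modes. Everything in it is an assumption made for illustration: the names `EventTransmitter`, `register`, `publish` and `poll`, the event type "order.created", and the use of a plain Python callable as a stand-in for a listener's network endpoint.

```python
from collections import defaultdict, deque


class EventTransmitter:
    """Hypothetical in-process stand-in for an 'event transmitter' service."""

    def __init__(self):
        self._listeners = defaultdict(list)   # event type -> [(name, deliver, accepts)]
        self._mailboxes = defaultdict(deque)  # listener name -> events waiting to be polled

    def register(self, event_type, name, deliver=None, accepts=lambda event: True):
        # 'deliver' stands in for the listener endpoint; None means the listener
        # prefers passive mode and will poll. 'accepts' is the filtering rule.
        self._listeners[event_type].append((name, deliver, accepts))

    def publish(self, event_type, event):
        for name, deliver, accepts in self._listeners[event_type]:
            if not accepts(event):
                continue
            if deliver is not None:
                deliver(event)                       # active mode: push to the target
            else:
                self._mailboxes[name].append(event)  # passive mode: hold until asked

    def poll(self, name):
        # Passive mode: the listener asks for whatever has piled up.
        mailbox = self._mailboxes[name]
        events = list(mailbox)
        mailbox.clear()
        return events


transmitter = EventTransmitter()
received = []
transmitter.register("order.created", "billing", deliver=received.append)
transmitter.register("order.created", "audit", accepts=lambda e: e["total"] > 100)

transmitter.publish("order.created", {"id": 1, "total": 50})
transmitter.publish("order.created", {"id": 2, "total": 250})
audit_events = transmitter.poll("audit")
```

Note that the publisher of "order.created" never learns who is listening. The 'billing' listener is pushed both events, while 'audit' polls and sees only the one that passes its filter.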

But you never told me

What happens when the target service has gone down or is not reachable? Your active 'event transmitter' will still try to pass on the events, but they will not reach the intended recipient. If you are using a 'passive' mode and there is an expiration time set on the event, then it may happen that the target service does not ask for events before the expiration time has passed. When building component systems we always have to design for failure.
One solution for this 'failure in delivery' is to have a dedicated 'delivery error' event stream maintained by the 'event transmitter'. Any event that could not be delivered to a target is dumped into this error stream along with information about which target was unable to receive this event. The error stream may be made persistent.
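A 'delivery error' stream could be sketched along these lines. The names `DeliveryErrorStream` and `deliver` are hypothetical, endpoints are again plain callables, and I am assuming that an unreachable target shows up as a `ConnectionError`. The stream here lives in memory; a persistent one would write the same records to durable storage.

```python
import time


class DeliveryErrorStream:
    """Hypothetical error stream: remembers which event failed to reach which target."""

    def __init__(self):
        self._records = []

    def record(self, target, event, reason):
        self._records.append(
            {"target": target, "event": event, "reason": reason, "at": time.time()}
        )

    def for_target(self, target):
        return [r for r in self._records if r["target"] == target]


def deliver(targets, event, errors):
    # 'targets' maps a target name to a callable standing in for its endpoint.
    for name, send in targets.items():
        try:
            send(event)
        except ConnectionError as exc:
            # Design for failure: do not drop the event, park it in the error stream.
            errors.record(name, event, str(exc))


def unreachable(event):
    raise ConnectionError("billing endpoint not reachable")


errors = DeliveryErrorStream()
seen = []
deliver({"billing": unreachable, "audit": seen.append}, {"id": 7}, errors)
```

One failing target does not stop delivery to the others, and the failed event is kept together with the name of the target that missed it.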

And so on .....

This 'error stream' can be put to uses that are limited only by your imagination. It can be used by your 'monitoring service' to track endpoints going out of reach. It can be used by the target service to recover lost events when it comes back up.
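Both uses can be sketched over a plain list of error records. The record shape (a dict with "target" and "event" keys), the function names and the failure threshold are assumptions of mine, not part of any particular product.

```python
from collections import Counter


def unreachable_targets(records, threshold=3):
    # Monitoring use: flag targets that have missed at least 'threshold' events.
    failures = Counter(r["target"] for r in records)
    return sorted(t for t, n in failures.items() if n >= threshold)


def replay(records, target, handle):
    # Recovery use: hand each event that 'target' missed to 'handle',
    # and return the records that still belong to other targets.
    remaining = []
    for r in records:
        if r["target"] == target:
            handle(r["event"])
        else:
            remaining.append(r)
    return remaining


records = [
    {"target": "billing", "event": {"id": 1}},
    {"target": "billing", "event": {"id": 2}},
    {"target": "audit", "event": {"id": 2}},
    {"target": "billing", "event": {"id": 3}},
]

flagged = unreachable_targets(records)
recovered = []
records = replay(records, "billing", recovered.append)
```

After the replay, 'billing' has caught up on the three events it missed, in their original order, and only the 'audit' record is left in the stream.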
Hopefully you now have an inkling of the role of events in a world of interacting services and how the rough edges may be smoothed out. The rest is left as exercises in ingenuity.
