Since each service has a single responsibility and is independent, failure is contained and isolated within one service. If one service goes down, it won't affect the rest of the site.
Failure is treated as a first-class citizen.
- Redundancy for machine/cluster/data center failures
- Load-balancing and flow control for service invocations
- Client protects itself with standard failure management patterns: timeouts, retries, circuit breakers.
Resiliency has to be tested too. Resiliency tests or game days should be executed on the production system, making sure the servers respond well under failure.
API gateway (Netflix) / API facades (500px)
The API gateway/facade is owned by the web team and is basically a proxy to the various backend systems. Since it's owned by the web team, there's no idle time waiting for a service team to build the service. It also abstracts away concurrency: it can pull data from various services and flatten the results.
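The fan-out-and-flatten behavior of a gateway can be sketched with concurrent async calls. This is a minimal illustration, not Netflix's or 500px's actual code; `fetch_user` and `fetch_orders` are hypothetical stand-ins for backend service calls.

```python
import asyncio

# Hypothetical stand-ins for calls to backend services.
async def fetch_user(user_id):
    return {"id": user_id, "name": "alice"}

async def fetch_orders(user_id):
    return [{"order_id": 1}, {"order_id": 2}]

async def profile_view(user_id):
    # The gateway fans out to several services concurrently
    # and flattens the results into one response for the client.
    user, orders = await asyncio.gather(fetch_user(user_id),
                                        fetch_orders(user_id))
    return {**user, "orders": orders}

print(asyncio.run(profile_view(123)))
```

The client makes one round trip; the fan-out happens server-side where latency between services is low.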
Persistence should not be shared between services. Sharing it breaks the independence of both development and deployment of a service. However, this is one of the aspects of microservices that practitioners struggle with most, especially those with an existing monolith whose services share the same database.
Use a smart client-side load balancer. Netflix has open-sourced Ribbon for this purpose. It can avoid a whole zone if that zone goes bad.
Microservices do not automatically mean better availability. If a service fails, a consuming service can go into a loop and soon all available threads will be consumed. Employ circuit breakers, retries, bulkheading and fallbacks. Hystrix from Netflix provides support for these patterns.
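The circuit-breaker pattern can be sketched in a few lines. This is not the Hystrix API, just a minimal illustration of the idea; the class name, thresholds, and fallback mechanism are made up for the example.

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker sketch: after `max_failures` consecutive
    failures the circuit opens and calls fail fast (returning the
    fallback) until `reset_timeout` seconds pass, then one trial call
    is allowed through (half-open)."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                return fallback()      # open: fail fast, don't touch the service
            self.opened_at = None      # half-open: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0              # success closes the circuit again
        return result
```

The point is that a failing dependency costs the caller almost nothing once the circuit opens, so threads are not consumed waiting on a dead service.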
Test service for resiliency
Test for resiliency, as failures in production will happen. Need to test for latency/errors (Simian Army), dependency service unavailability and network errors. The system should be optimized for responding to failures.
Rest.li + Deco
Rest.li is the RESTful services framework from LinkedIn, which also gives type-safe bindings and async, non-blocking I/O. It also provides dynamic service discovery and client-side load balancing using ZooKeeper.
All domain models at LinkedIn are normalized; the link between two domain models is a URN, a fully qualified foreign key (e.g. urn:li:member:123). Deco is a URN resolution library. Services tell Deco what data they need (not how to get it), and Deco resolves the URNs and fetches the data for the service.
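A toy URN resolver illustrates the idea (this is not Deco's real API; the parsing and the fetcher table are invented for the sketch, and the fetcher would be a service call in practice):

```python
# Toy URN resolver in the spirit of Deco (not the real API):
# a URN like "urn:li:member:123" is a fully qualified foreign key
# naming the entity type and id; the resolver maps the type to a
# fetcher and returns the referenced data.

def parse_urn(urn):
    scheme, namespace, entity, entity_id = urn.split(":")
    assert scheme == "urn"
    return namespace, entity, entity_id

# Hypothetical per-entity fetchers (service calls in practice).
FETCHERS = {
    "member": lambda mid: {"id": mid, "name": "member-" + mid},
}

def resolve(urn):
    _, entity, entity_id = parse_urn(urn)
    return FETCHERS[entity](entity_id)

print(resolve("urn:li:member:123"))
```

Because the caller only states *what* it needs (the URN), the resolver is free to batch, cache, or route the fetches however it likes.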
Dynamic load balancer
Mailgun implemented a dynamic load balancer which combines service registration/discovery, load balancing, circuit breaking (retries) and canary deployment in one. A new service can simply start and register itself with the load balancer, and other services can start consuming it. Multiple versions of a service can run at the same time, enabling canary deployment: without disrupting existing services, we can start a new version of a service and then start consumers that use the new version, so the old version can be retired gracefully.
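The registration/versioning idea can be sketched with a toy in-memory registry (this is not Mailgun's implementation; all names and addresses here are invented):

```python
import random

class Registry:
    """Toy service registry sketch: instances register under a
    (name, version) key; consumers pick an instance for the version
    they want, so a canary version can run alongside the old one."""

    def __init__(self):
        self.instances = {}  # (name, version) -> [addresses]

    def register(self, name, version, address):
        self.instances.setdefault((name, version), []).append(address)

    def pick(self, name, version):
        # Trivial load balancing: random choice among instances.
        return random.choice(self.instances[(name, version)])

registry = Registry()
registry.register("billing", "v1", "10.0.0.1:8080")
registry.register("billing", "v2", "10.0.0.2:8080")  # canary beside v1
```

Consumers of `v1` are undisturbed while a `v2` consumer is brought up; once all traffic is on `v2`, the `v1` instances can be deregistered and retired.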
Goal of an elastic system is to:
- Scale up and down service instances according to load
- Gracefully handle spiky workloads
- Predictive and reactive scaling
Contention happens when there is a conflict over access to a shared resource.
- Message-driven/async. A service should only hold on to a resource while it's actually using it. Traditional blocking I/O holds the thread while waiting for an I/O operation, so the resource is not released and can cause contention.
- Reactive all the way down. However, in some cases that's not possible. The best example is JDBC: there is no async driver (except experimental drivers for Oracle and MySQL). This is also one of the reasons for moving to NoSQL. One possible way to deal with it is to use the Actor model.
- Besides idle threads, shared/distributed state also constitutes a contention point. It should be moved to the edge, like the client or an external datastore, to reduce contention.
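The difference between holding a thread during I/O and releasing it can be shown with a small async sketch (the `fake_io` coroutine is a stand-in for a non-blocking driver):

```python
import asyncio
import time

# With blocking I/O, each call would hold its thread for the full
# wait. With async I/O, a single thread interleaves both waits:
# the thread is returned to the event loop while waiting, so it is
# not a contention point.

async def fake_io(label, delay):
    await asyncio.sleep(delay)  # stands in for a non-blocking driver
    return label

async def main():
    start = time.monotonic()
    results = await asyncio.gather(fake_io("a", 0.1), fake_io("b", 0.1))
    return results, time.monotonic() - start

results, elapsed = asyncio.run(main())
print(results, round(elapsed, 2))  # waits overlap: ~0.1s, not 0.2s
```

With two blocking calls the elapsed time would be the sum of the waits; here the waits overlap on one thread.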
One of the challenges of implementing microservices is fan-out, both in the UI and among services. Fan-out in the UI can be resolved with an API gateway, which pushes the fan-out problem to the server (where network latency is lower) by returning a materialized view instead. Server-side caching can reduce the load on the services, and instead of caching each service's response, we can cache the composite/materialized view.
Services like a user service might be required by multiple services in the same flow; such services become hotspots of the system. To reduce the dependency, we can pass the data through a header instead of having every individual service depend on the hotspot service.
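A sketch of the header-propagation idea (the `X-User-Context` header name and the encoding are invented for illustration; real systems often use a signed token instead):

```python
import base64
import json

# Sketch: the edge service resolves the user once and forwards it in
# a header, so downstream services read the header instead of each
# calling the hotspot user service.

def encode_user_header(user):
    return base64.b64encode(json.dumps(user).encode()).decode()

def decode_user_header(header):
    return json.loads(base64.b64decode(header))

# Edge service: one call to the user service, then propagate.
user = {"id": 123, "name": "alice"}  # fetched once at the edge
headers = {"X-User-Context": encode_user_header(user)}

# Downstream service: read the header, no extra service call.
print(decode_user_header(headers["X-User-Context"]))
```

One fetch at the edge replaces N fetches deeper in the call graph.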
- Asynchronous message-passing over synchronous request-response
- Often custom protocol over TCP/UDP or WebSockets over HTTP
An example of backpressure is a user uploading a file. Traditionally, the web server saves it to a local file and then uploads it to storage (Express does this). With streaming, the web server can pipe the file to storage while the user is still uploading it. But what if the user uploads faster than the server can save to storage? Backpressure is needed to control the flow: the stream can throttle the upload from the user so it matches the speed at which the file is stored. So backpressure is:
- Sensible behavior on resource contention
- Avoids unnecessary buffering
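One simple way to get both properties is a bounded queue between producer and consumer. This is a generic sketch (not Node's stream implementation): a fast producer blocks once the buffer is full, so it is throttled to the consumer's pace instead of buffering without bound.

```python
import queue
import threading

buf = queue.Queue(maxsize=4)  # small bound => backpressure kicks in
consumed = []

def producer():
    for chunk in range(20):
        buf.put(chunk)        # blocks while the queue is full
    buf.put(None)             # sentinel: end of stream

def consumer():
    while True:
        chunk = buf.get()
        if chunk is None:
            break
        consumed.append(chunk)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(len(consumed))  # 20
```

The bound on the queue is exactly the "avoids unnecessary buffering" property: memory use is capped at four chunks no matter how fast the producer is.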
Back to Rx. There are two kinds of observables, hot and cold.
Typically we'll use a temporal operator: sample (take a sample of the stream in a time window), throttleFirst (take the first value in each time window), debounce (wait until the stream pauses a bit), buffer (based on time) or window (by time or count). Or combine buffer and debounce.
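The selection logic of two of these operators can be sketched offline over `(timestamp, value)` events. Real Rx operators are push-based; these standalone functions are only illustrations of which values each operator would emit.

```python
def debounce(events, quiet):
    """Emit a value only if no newer event arrives within `quiet`."""
    out = []
    for i, (t, v) in enumerate(events):
        nxt = events[i + 1][0] if i + 1 < len(events) else float("inf")
        if nxt - t >= quiet:
            out.append(v)
    return out

def sample(events, period):
    """Emit the latest value seen in each `period`-sized window."""
    out, window, latest = [], 0, None
    for t, v in events:
        while t >= (window + 1) * period:
            if latest is not None:
                out.append(latest)
                latest = None
            window += 1
        latest = v
    if latest is not None:
        out.append(latest)
    return out

events = [(0.0, "a"), (0.1, "b"), (0.9, "c"), (2.0, "d")]
print(debounce(events, quiet=0.5))  # ['b', 'c', 'd']
print(sample(events, period=1.0))   # ['c', 'd']
```

Note how debounce drops "a" (because "b" follows within the quiet period) while sample keeps only the last value of each one-second window.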
Dynamic push-pull: push (reactive) while the consumer keeps up with the producer, and switch to pull (interactive) when the consumer is slow. Bound all queues vertically, not horizontally; backpressure applies to a given stream, not to all streams together.
Cold streams support pull by definition, since we can control the speed. Hot streams need a signal to support pull.
So on backpressure we can use the backpressure signal to choose among strategies: buffer in memory, in Kafka or on disk; drop or sample the stream; or scale horizontally (Netflix can stand up a new server on Mesos).