Here are a few more details about the implementation.
Each plug-in runs as a Go process, and Infrakit discovers them through Unix sockets.
The plug-ins communicate with each other over RPC.
Now, I will talk about the flow of using Infrakit.
First, you send a cluster configuration to Infrakit, and it is passed to the group plug-in.
In this example, we are using the default group plug-in, the AWS instance plug-in, and the Docker Swarm flavor plug-in.
In the configuration, you define the desired state of your cluster.
Here, it is 2 masters and 4 workers for Docker Swarm.
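
As a rough illustration of such a configuration, the sketch below parses a small JSON document into Go structs. The field names (role, size, instance, flavor) and the JSON layout are assumptions made for this example and do not mirror Infrakit's real schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GroupSpec is a hypothetical shape for one group in the configuration.
// The field names are illustrative only.
type GroupSpec struct {
	Role     string `json:"role"`     // e.g. "master" or "worker"
	Size     int    `json:"size"`     // desired number of nodes
	Instance string `json:"instance"` // which instance plug-in to use
	Flavor   string `json:"flavor"`   // which flavor plug-in to use
}

// config describes the desired state from the talk: 2 masters and 4 workers.
const config = `[
  {"role": "master", "size": 2, "instance": "aws", "flavor": "swarm"},
  {"role": "worker", "size": 4, "instance": "aws", "flavor": "swarm"}
]`

// parseConfig decodes the raw JSON into a list of group specs.
func parseConfig(raw string) ([]GroupSpec, error) {
	var groups []GroupSpec
	if err := json.Unmarshal([]byte(raw), &groups); err != nil {
		return nil, err
	}
	return groups, nil
}

func main() {
	groups, err := parseConfig(config)
	if err != nil {
		panic(err)
	}
	for _, g := range groups {
		fmt.Printf("%s: want %d (instance=%s, flavor=%s)\n",
			g.Role, g.Size, g.Instance, g.Flavor)
	}
}
```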
Then the group plug-in checks the state of your cluster through the flavor and instance plug-ins.
Right now, there are no instances.
The group plug-in sends the required number of instances to the instance and flavor plug-ins over RPC.
It needs to deploy 2 masters and 4 workers now.
Then the instance plug-in gathers the information needed to deploy the instances from the flavor plug-in and deploys them through the cloud provider’s API.
Now, suppose a failure occurs in your cluster.
One of your workers dies unexpectedly.
The group plug-in has been polling the state of your cluster.
How many instances are running? Are all nodes healthy?
So the group plug-in can notice that a node has gone down, because the instance plug-in reports only the number of healthy nodes.
In this case, the instance plug-in reports that 2 masters and only 3 workers are running.
Then the group plug-in asks the instance and flavor plug-ins to restore your desired state.
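
The polling and healing steps can be sketched as a simple reconcile pass. This is a simplified model, not Infrakit's code: fakeInstances is a hypothetical in-memory stand-in for the instance plug-in, and reconcile is an assumed name for one polling pass of the group plug-in.

```go
package main

import "fmt"

// fakeInstances stands in for an instance plug-in.
// It maps a role to the number of healthy nodes.
type fakeInstances map[string]int

// Healthy reports only the healthy nodes for a role.
func (f fakeInstances) Healthy(role string) int { return f[role] }

// Provision pretends to start n new nodes for a role.
func (f fakeInstances) Provision(role string, n int) { f[role] += n }

// reconcile performs one polling pass: it compares the desired size of
// each role with the healthy count and provisions whatever is missing.
// It returns the number of nodes created per role.
func reconcile(desired map[string]int, inst fakeInstances) map[string]int {
	created := map[string]int{}
	for role, want := range desired {
		if missing := want - inst.Healthy(role); missing > 0 {
			inst.Provision(role, missing)
			created[role] = missing
		}
	}
	return created
}

func main() {
	desired := map[string]int{"master": 2, "worker": 4}
	inst := fakeInstances{}

	// First pass: the cluster is empty, so everything is provisioned.
	fmt.Println("created:", reconcile(desired, inst))

	// One worker dies unexpectedly.
	inst["worker"]--

	// Next pass: the missing worker is detected and replaced.
	fmt.Println("created:", reconcile(desired, inst))

	// A further pass finds nothing to do.
	fmt.Println("created:", reconcile(desired, inst))
}
```

A real controller would run this pass on a timer and would also have to deal with nodes that are still starting up, which this sketch ignores.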
This is the basic provisioning and auto-healing behavior of Infrakit.