On 16/03/16 22:06, Sam Geeraerts wrote:
> If we can assume that the version with the latest timestamp is the one
> that should be active, then it doesn’t matter if a server is down. When
> a server comes up it can negotiate service timestamps with the other
> servers and then get the latest version of everything.
> Is there a good reason to not make that assumption?
You are right, except that currently what you deploy is not actually a
service but a Python module, and there is no notion of synchronizing
based on a timestamp - though that could definitely be used.
The problematic scenario I had in mind was:
- There is a cluster of s1, s2 and s3
- s2 and s3 are down
- s1 receives m1.py in v1
- s1 goes down, s2 is up
- s2 receives m1.py in v2
- s3 boots up - what should it synchronize to now?
As things stand today, without timestamps, there is no good answer to
that question.
But let’s assume that s3 gets m1.py v2. Now s1, on startup, should also
check whether its copy of m1.py is still current, since it may have
changed in the meantime, and if it has, m1.py v2 should be deployed.
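Under that timestamp assumption, the reconciliation step could be sketched roughly as below. This is a minimal illustration only - the Deployment type and pick_latest helper are hypothetical, not part of Zato:

```python
from dataclasses import dataclass

@dataclass
class Deployment:
    module: str       # e.g. 'm1.py'
    version: str      # e.g. 'v1'
    timestamp: float  # Unix time when this version was deployed

def pick_latest(views):
    """Given each server's view of the same module, return the one
    with the newest timestamp (last write wins)."""
    return max(views, key=lambda d: d.timestamp)

# The scenario above: s1 saw v1 first, s2 later received v2
views = [
    Deployment('m1.py', 'v1', 100.0),  # what s1 has
    Deployment('m1.py', 'v2', 200.0),  # what s2 has
]

latest = pick_latest(views)
print(latest.version)  # v2 - what s3 (and s1 on restart) should deploy
```

With such a rule in place, s3 booting up and s1 restarting both resolve to the same answer: m1.py v2.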
All of this is nice and can be added, but it needs to wait for the next
major release, which will introduce quite a few changes to the internal
architecture to better control the process of server startup.
Right now we are using gunicorn and it works very nicely, but we need a
much tighter grip on how server processes are started, how they report
it to the coordinator (the process which spawns the actual server
processes), upon which events, and so on. This means we will have to use
our own process controller instead of gunicorn - gevent will stay,
though; it is a great library.
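As a toy illustration of the kind of control meant here - a coordinator that spawns worker processes and waits for each one to report that it has started. All names are hypothetical; this is not how Zato is implemented today:

```python
import multiprocessing as mp

def server_process(name, report_queue):
    """A worker: perform startup tasks, then report readiness to the coordinator."""
    # ... in a real server: load services, open sockets, etc. ...
    report_queue.put(('ready', name))

def coordinator(names):
    """Spawn one process per name and block until every one reports 'ready'."""
    queue = mp.Queue()
    procs = [mp.Process(target=server_process, args=(n, queue)) for n in names]
    for p in procs:
        p.start()
    ready = set()
    while len(ready) < len(procs):
        event, name = queue.get()
        if event == 'ready':
            ready.add(name)
    for p in procs:
        p.join()
    return ready

if __name__ == '__main__':
    print(sorted(coordinator(['server1.worker1', 'server1.worker2'])))
```

Because the coordinator sees every startup event, it becomes the natural place to decide which work only needs to happen once across all processes.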
This is needed anyway to cut down on the time needed to start servers.
Right now if you have 2 servers, each with 2 processes, then you perform
a bunch of actions 4 times, once for each process.
For instance, the same service is read and parsed 4 times from the file
system. That can be cached and made reusable after the first process
does what it needs to do. This can definitely save a lot - consider
that there are 250+ internal services alone.
Then you will likely have something akin to:
$ zato start /path/to/server1 /path/to/server2 /path/to/server3
And now Zato will know that anything that was accomplished by server1
should carry over to server2 and server3.