Hey guys, first, thanks for the awesome tool. Zato was exactly what I needed, and its quick learning curve let me create a proof of concept for my management, which is working exceptionally well. Now I am deploying it in a pre-production environment, with a focus on high availability.
My current scenario is 3 bare metal RHEL6 machines on a local network, behind an external physical load balancer (not haproxy).
My priority is to make Zato as HA as possible. My services usually depend on data which is saved between executions (so they are not exactly stateless). This led me to try an architecture with a single Zato server on each machine, all in the same Zato cluster.
The Zato Redis HA page was perfect and got me to a working Redis HA setup. This is what is missing:
1) ODB HA
I had a huge pain when dealing with the ODB HA. If I choose SQLite, I understand I won’t be able to have a single database visible to all 3 servers (since sharing it over NFS is not recommended).
Choosing Postgres or MySQL is also a pain, since the native configuration for both solutions only covers replication. You need additional tools to manage the service VIP and failover.
It seems too much of a hassle for such a simple environment. Am I missing something? Is there another Zato architecture I can use to avoid dealing with the ODB HA? If the current architecture is the best one for my case, can someone recommend a couple of tools which are easier/simpler to install/manage to achieve this?
For now I have a single MySQL server, no replication/failover at all.
2) Zato load balancer
If I only have one server per machine, do I really need the load balancer (HAProxy) component? I saw some functions of the web GUI stop working when I don’t have the LB active, but other than that, can I just shut it off permanently (or even remove it) for my environment and perform the balancing on my real LB?
3) Zato number of servers
Do I gain something by creating more servers in a single machine except performance? Since it’s a bare metal machine, isn’t it the same as configuring a single server with more workers?
I think that’s it for now. Thanks again for this exceptional tool.
You are saying that you are sharing state between invocations of services, so from that point alone it would follow that you need some persistent storage, and it looks like you will not get away from using a database of some sort.
I understand the desire to cut down on the number of components needed, and I can report that there is, for instance, ongoing work to make Zato not depend on Redis in the next release (support yes, depend no).
As for the SQL database - we simply need to keep our configuration and internal state somewhere.
Some server architectures use clusters with internal communication, which means no external database. On the face of it this sounds very useful, but it also has its drawbacks.
In most such scenarios, servers need to communicate with each other directly, and this is not always possible. With 3 of them supporting a transactional business system this will not be an issue, but consider a monitoring solution with a few dozen servers, each dedicated to a different customer, where it is not allowed to open this kind of traffic, perhaps for legal reasons.
In the current architecture, with 3 servers for instance, the cluster will continue to work even if there is only 1 server available. But if we were to keep configuration in servers directly, we would need to employ a consensus algorithm, such as Raft, to make sure the state is consistent across the cluster. This would do away with the SQL ODB but in turn would require the majority of servers to be always up and running. With 3 servers, 2 of them would always have to be available; with only 1, the whole cluster would refuse to work.
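To make the arithmetic concrete, here is a minimal Python sketch of the majority rule described above. The function names are made up for illustration and are not part of Zato or of any Raft library; the shared-ODB branch also assumes that the ODB itself is reachable.

```python
def raft_quorum(cluster_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return cluster_size - raft_quorum(cluster_size)


def cluster_is_available(cluster_size: int, servers_up: int, uses_consensus: bool) -> bool:
    """Compare the two designs discussed above (hypothetical helper)."""
    if uses_consensus:
        # Consensus-based state: a majority of servers must be running.
        return servers_up >= raft_quorum(cluster_size)
    # Shared external ODB: one live server is enough,
    # assuming the ODB itself is still reachable.
    return servers_up >= 1


for size in (3, 5):
    print(size, "servers -> quorum", raft_quorum(size),
          "/ tolerated failures", tolerated_failures(size))
```

With 3 servers the quorum is 2, so only 1 failure is tolerated, whereas with a shared ODB the cluster keeps serving requests with a single server left.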
Raft, as great as it is, is an algorithm that deals only with distributing the state in the cluster and in particular - it assumes there is a notion of the main node (leader) that accepts all requests and coordinates updates with the remaining nodes (followers). This does not play nicely with load-balancers whose job is, after all, to distribute the traffic evenly across all servers. This also does not fit nicely in scenarios where your clients are, for instance, thousands of small IoT devices connected to Zato through long-running WebSocket connections, each to a specific server.
Raft is not the only such algorithm of course, we could as well think of using a three-phase commit with its own upsides and limitations, but the point is that there are ramifications to everything and when you take it all into account, you, as an architect, will need to think about HA for the configuration and state DB anyhow in one way or another.
That all said, I am in fact considering the idea of using one of the consensus algorithms in Zato at some point, but that would surely be in addition to the SQL ODB, not instead of it, i.e. with users being able to pick what they prefer with an understanding of the implications.
As for using your own load-balancer - this will work, with the caveat that the LB-related pages in web-admin will not work, for obvious reasons: you cannot update the configuration of HAProxy if there is no HAProxy in the first place. Apart from that, web-admin as such will continue to work; you will just need to point it to your LB instead of the default one in order for it to find the servers.
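For illustration only, the job the external LB takes over from HAProxy can be sketched in a few lines of Python: rotate over the servers and skip any that fail a health check. This is a toy model, not how any particular LB product is configured; the server names and the `is_healthy` callback are hypothetical placeholders for whatever check your LB supports.

```python
from itertools import cycle
from typing import Callable, Iterable, Iterator


def healthy_round_robin(
    servers: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Iterator[str]:
    """Yield servers in round-robin order, skipping unhealthy ones."""
    pool = list(servers)
    if not pool:
        raise ValueError("at least one server is required")
    ring = cycle(pool)
    while True:
        # Try each server at most once per pick.
        for _ in range(len(pool)):
            candidate = next(ring)
            if is_healthy(candidate):
                yield candidate
                break
        else:
            raise RuntimeError("no healthy server available")


# Hypothetical server names; "zato-2" is pretended to be down.
down = {"zato-2"}
picker = healthy_round_robin(
    ["zato-1", "zato-2", "zato-3"],
    lambda server: server not in down,
)
print([next(picker) for _ in range(4)])
```

As long as at least 1 of the 3 servers passes the check, requests keep being served, which matches the behaviour of the cluster with a shared ODB.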
The number of servers per OS doesn’t matter as far as performance goes as long as you follow the rule that there is at most 1 worker process for each CPU. Let’s say you have 4 CPUs in each OS. If you start 1 server with 4 processes or 4 servers each with 1 process, there will be no difference in performance assuming that there is an external load-balancer to fairly distribute requests to either that one server or 4 of them.
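The 4-CPU example can be written down as a small Python check. The helper names are invented for this sketch and do not correspond to any Zato setting; it only encodes the rule of at most 1 worker process per CPU.

```python
def total_workers(servers: int, workers_per_server: int) -> int:
    """Total worker processes started on one OS."""
    return servers * workers_per_server


def follows_cpu_rule(cpus: int, servers: int, workers_per_server: int) -> bool:
    """True if there is at most 1 worker process for each CPU."""
    return total_workers(servers, workers_per_server) <= cpus


cpus = 4
layouts = {
    "1 server x 4 processes": (1, 4),
    "4 servers x 1 process": (4, 1),
    "2 servers x 4 processes": (2, 4),  # oversubscribed: 8 workers on 4 CPUs
}
for name, (servers, workers) in layouts.items():
    print(name, "->", total_workers(servers, workers), "workers,",
          "ok" if follows_cpu_rule(cpus, servers, workers) else "too many")
```

The first two layouts both give 4 workers on 4 CPUs, which is why they perform the same behind a fair external load-balancer.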
About the only difference is in maintenance: if there is more than 1 server in the system, you can shut servers down one by one without significantly impacting operations. But you are using 3 computers, so this will not be a real concern. I can also think of edge cases, such as a need to bind each server to a separate IP address, but this is something that you would know best yourself, and even then it wouldn’t influence performance in terms of transactions/s.
Thanks for the excellent and detailed response, @dsuch. I really appreciate it.
I understand all your points and agree with them. I persist my data using Redis itself, so I would not need an additional database if not for the ODB. Dropping it would simplify my topology a lot.
I was asked by my tech manager why we need both a relational and a non-relational DB for this architecture, and this made me think maybe I was overcomplicating things. If this is not the case, no problem. I will just think and research harder about my options for achieving ODB HA in the simplest way for my environment. I understand this is out of scope for Zato, but I’m open to suggestions on easier ways to achieve this, if you can point me in the right direction, since MHA seems to work but requires a lot of manual steps after a failover. Galera for MySQL/MariaDB should work, right?
On the LB I am already working directly with the server without any issue, thanks for the support on that.
New question: multiple web-admins. Today I have a single web-admin installed. Is there any problem for me installing additional web-admins (one per machine), so I can achieve HA on this front as well, in case my main machine goes down?
I would need to research the options myself before answering - as it happens, most of our (Zato Source) projects took place using Amazon infrastructure where many things are built in.
As for multiple web-admins, yes, this will work, there is nothing preventing you from installing redundant instances.
If you are looking for DB HA, you should look in the direction of distributed database solutions like Apache Cassandra (already supported in Zato) or CoreOS etcd, which you could also use through its REST API and which would fit fine into Zato or your architecture.
I don’t think you need HA for the Zato internal ODB, since it is mostly a configuration database, unless you are using it with a third-party app to “dynamically” change ODB state or you need to query for Zato data, which you could also get using the Zato services.
Just my 2 cents