Creating a New cluster with 2 Physical Nodes

I should note the current version:

zato --version
Zato 3.1+rev.8532cda-py3.6.8

Hi @dalord,

yes, naturally, it is possible to run Zato servers on any number of nodes, physical or not.

Can you please reformat your message to include tracebacks in the ``` blocks? That is three backticks followed by a traceback and then three backticks again, like in Markdown.

Also, please send in server.log files from both servers.

Thank you.

I fixed the formatting. Here are the links to the logs:

Server 1:
https://pastebin.com/zeEdidRb

Server 2:
https://pastebin.com/7VtNvxuG

OK, but what were the IPs that the servers were starting under? I have not checked everything under these circumstances yet. Have you edited out anything else?

For security reasons I had to remove the IP addresses, but server 1 starts up on .35.50 and server 2 on .35.48. The only other thing I changed was removing the hostname from the server names, so they are really named something like zato01.example.com-server1 and zato02.example.com-server2.

Hello @dalord,

can you tell me what happens if you start server2 when server1 is not running? Does it also happen?

Can you also send across the output from this command, for both servers?

ls -la /path/to/server/config/repo

Same error when 01 is not running. Here is the output:

[(test) zato@zato0 logs]$ ls -la /opt/zato/cluster/server1/config/repo/
total 176
drwxrwxr-x 9 zato zato   4096 Jun 15 07:46 .
drwxrwxr-x 3 zato zato     18 Jun 12 10:54 ..
drwxrwxr-x 8 zato zato    166 Jun 15 07:46 .git
-rw-rw-r-- 1 zato zato 104608 Jun 12 10:55 internal-cache.dat
-rw-rw-r-- 1 zato zato   5597 Jun 12 10:54 logging.conf
drwxrwxr-x 4 zato zato     34 Jun 12 10:54 lua
-rw-rw-r-- 1 zato zato    857 Jun 12 10:54 pickup.conf
drwxrwxr-x 3 zato zato     18 Jun 12 10:54 schema
-rw-rw-r-- 1 zato zato    788 Jun 12 10:54 secrets.conf
-rw-rw-r-- 1 zato zato  10884 Jun 15 07:46 server.conf
-rw-rw-r-- 1 zato zato    484 Jun 12 10:54 service-sources.txt
drwxrwxr-x 3 zato zato     21 Jun 12 10:54 sftp
-rw-rw-r-- 1 zato zato    273 Jun 12 10:54 simple-io.conf
-rw-rw-r-- 1 zato zato    629 Jun 12 10:54 sql.conf
-rw-rw-r-- 1 zato zato   1802 Jun 12 10:54 sso.conf
drwxrwxr-x 4 zato zato     36 Jun 12 10:54 static
drwxrwxr-x 4 zato zato     40 Jun 12 10:54 tls
drwxrwxr-x 2 zato zato      6 Jun 12 10:54 user-conf
-rw-rw-r-- 1 zato zato     63 Jun 12 10:54 user.conf
-rw-rw-r-- 1 zato zato   6351 Jun 12 10:54 zato-server-ca-certs.pem
-rw-rw-r-- 1 zato zato   1976 Jun 12 10:54 zato-server-cert.pem
-rw-rw-r-- 1 zato zato   1675 Jun 12 10:54 zato-server-priv-key.pem
-rw-rw-r-- 1 zato zato    451 Jun 12 10:54 zato-server-pub-key.pem

And for server2

[(test) zato@zato02 logs]$ ls -la /opt/zato/cluster/server2/config/repo/
total 72
drwxrwxr-x 9 zato zato  4096 Jun 17 12:32 .
drwxrwxr-x 3 zato zato    18 Jun 12 11:00 ..
drwxrwxr-x 8 zato zato   166 Jun 17 12:32 .git
-rw-rw-r-- 1 zato zato     0 Jun 17 12:32 internal-cache.dat
-rw-rw-r-- 1 zato zato  5597 Jun 12 11:00 logging.conf
drwxrwxr-x 4 zato zato    34 Jun 12 11:00 lua
-rw-rw-r-- 1 zato zato   857 Jun 12 11:00 pickup.conf
drwxrwxr-x 3 zato zato    18 Jun 12 11:00 schema
-rw-rw-r-- 1 zato zato   788 Jun 12 11:00 secrets.conf
-rw-rw-r-- 1 zato zato 10885 Jun 12 13:46 server.conf
-rw-rw-r-- 1 zato zato   484 Jun 12 11:00 service-sources.txt
drwxrwxr-x 3 zato zato    21 Jun 12 11:00 sftp
-rw-rw-r-- 1 zato zato   273 Jun 12 11:00 simple-io.conf
-rw-rw-r-- 1 zato zato   629 Jun 12 11:00 sql.conf
-rw-rw-r-- 1 zato zato  1802 Jun 12 11:00 sso.conf
drwxrwxr-x 4 zato zato    36 Jun 12 11:00 static
drwxrwxr-x 4 zato zato    40 Jun 12 11:00 tls
drwxrwxr-x 2 zato zato     6 Jun 12 11:00 user-conf
-rw-rw-r-- 1 zato zato    63 Jun 12 11:00 user.conf
-rw-rw-r-- 1 zato zato  6351 Jun 12 11:00 zato-server-ca-certs.pem
-rw-rw-r-- 1 zato zato  1976 Jun 12 11:00 zato-server-cert.pem
-rw-rw-r-- 1 zato zato  1675 Jun 12 11:00 zato-server-priv-key.pem
-rw-rw-r-- 1 zato zato   451 Jun 12 11:00 zato-server-pub-key.pem

It should be noted that I have removed the internal-cache.dat file several times while diagnosing this, so that the server tries to rebuild it. I only did this after the server had failed to come up the first couple of times.

Hm… The EOFError that you reported initially seems to be related to the fact that internal-cache.dat is of length zero - the code assumes that this file either does not exist or, when it does, that it contains what it ought to.
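
For illustration, here is a minimal sketch of that failure mode. It uses the standard library `pickle` directly rather than `dill` (the traceback shows `dill` delegating to `pickle.py`), and the file is a temporary stand-in, not the real internal-cache.dat:

```python
import os
import pickle
import tempfile

# Create a zero-length file, standing in for an empty internal-cache.dat
fd, empty_path = tempfile.mkstemp(suffix='.dat')
os.close(fd)

try:
    with open(empty_path, 'rb') as f:
        # There is no data to read, so the unpickler runs out of input
        pickle.load(f)
except EOFError:
    raised_eof = True
else:
    raised_eof = False

print('EOFError raised:', raised_eof)
```

A file that does not exist at all would fail earlier, in `open`, which is why code can tell "no cache yet" apart from "cache present but empty".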

Just so I confirm it - when you delete the file it is created back but as an empty one? Is that right?

That is correct. It gets recreated and if I fire up the server again I get:

2020-06-18 07:07:08,685 - INFO - 14909:MainThread - zato:0 - Starting Zato 3.1+rev.8532cda-py3.6.8
2020-06-18 07:07:08,695 - INFO - 14909:MainThread - zato:0 - Listening at: http://0.0.0.0:17010 (14909)
2020-06-18 07:07:08,695 - INFO - 14909:MainThread - zato:0 - Using worker: gevent
2020-06-18 07:07:08,701 - INFO - 14933:MainThread - zato:0 - Booting worker with pid: 14933
2020-06-18 07:07:09,573 - INFO - 14933:MainThread - zato.server.base.parallel:0 - Preferred address of `server2@test` (pid: 14933) is `http://<ip>.35.48:17010`
2020-06-18 07:07:11,500 - INFO - 14933:MainThread - zato.server.service.store:0 - Deploying cached internal services (server2)
2020-06-18 07:07:11,504 - ERROR - 14933:MainThread - zato:0 - Exception in worker process
Traceback (most recent call last):
  File "/opt/zato/zato/code/zato-server/src/zato/server/ext/zunicorn/arbiter.py", line 616, in spawn_worker
    self.cfg.post_fork(self, worker)
  File "/opt/zato/zato/code/zato-server/src/zato/server/base/parallel/__init__.py", line 1038, in post_fork
    ParallelServer.start_server(worker.app.zato_wsgi_app, arbiter.zato_deployment_key)
  File "/opt/zato/zato/code/zato-server/src/zato/server/base/parallel/__init__.py", line 584, in start_server
    is_first, locally_deployed = self._after_init_common(server)
  File "/opt/zato/zato/code/zato-server/src/zato/server/base/parallel/__init__.py", line 439, in _after_init_common
    is_first, locally_deployed = self.maybe_on_first_worker(server, self.kvdb.conn)
  File "/opt/zato/zato/code/zato-server/src/zato/server/base/parallel/__init__.py", line 361, in maybe_on_first_worker
    locally_deployed = import_initial_services_jobs(is_first)
  File "/opt/zato/zato/code/zato-server/src/zato/server/base/parallel/__init__.py", line 318, in import_initial_services_jobs
    internal_service_modules, self.base_dir, self.sync_internal, is_first))
  File "/opt/zato/zato/code/zato-server/src/zato/server/service/store.py", line 637, in import_internal_services
    dill_items = dill_load(f)
  File "/opt/zato/current/lib/python3.6/site-packages/dill/dill.py", line 250, in load
    obj = pik.load()
  File "/usr/lib64/python3.6/pickle.py", line 1048, in load
    raise EOFError
EOFError
2020-06-18 07:07:11,508 - INFO - 14933:MainThread - zato:0 - Worker exiting (pid: 14933)
2020-06-18 07:07:11,509 - INFO - 14933:MainThread - zato:0 - Closing IPC (/pubsub/pid)
2020-06-18 07:07:11,510 - INFO - 14933:MainThread - zato:0 - Closing IPC (/connector/config)
2020-06-18 07:07:11,510 - INFO - 14933:MainThread - zato.server.base.parallel:0 - Stopping server process (zato02.test.ostk.com-server2:14933) (14933)

Thanks. I cannot reproduce it but I think I should be able to push a change that will cover such a case - I will let you know soon.

You are on Zato 3.1, installed from a .deb or an .rpm, right? This is not an installation from source? I just want to ensure the change goes to correct git branches.

I couldn’t get the RPM install to work for me so I installed from source (git pull).

Can you say more about your inability to install from an RPM? Which RPM was it? Is there a GitHub ticket for it?

Also, what git branch have you installed it from?

Thanks.

I don’t have a GitHub ticket for it as I thought it may have been the internal image that we are building on (we don’t have access to outside repos due to security). But I was simply pulling it from the ones listed in the documentation.

Here is the clone command I used to install the latest version:
git clone https://github.com/zatosource/zato

Here are the results of the one we had added to our internal repo:

Name        : zato
Arch        : x86_64
Version     : 3.1.0
Release     : python3.el7
Size        : 152 M
Repo        : zato-3.1
Summary     : The next generation ESB and application server. Open-source. In Python
URL         : http://zato.io
License     : LGPL
Description : Zato - ESB, SOA, REST, APIs and Cloud Integrations in Python

I am working on that change as we speak - I will let you know as soon as possible.

When you go to the directory that you installed Zato from source to, can you please tell me what “git status && git log -1” return?

Yes, installing Zato without access to the public Internet is certainly possible.

We work with clients covered by the commercial support on such customisations or installation procedures and there is no problem with having this kind of setup.

One more question.

What I believe is happening is this:

  • The server starts
  • It tries to deploy internal services
  • Having done that, it tries to cache the deployed internal services so that the next time it starts up, it will take less time to deploy them
  • It opens file “internal-cache.dat” for writing
  • There is an error in writing out data to that file
  • The file is created with size zero
  • Later on, when the file is accessed for reading, the EOFError is raised because it is of size zero
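
The sequence above can be sketched roughly as follows. This is only an assumption about the mechanism, not Zato's actual code: `write_cache` and `Unserialisable` are hypothetical names, and plain `pickle` stands in for `dill`.

```python
import os
import pickle
import tempfile

class Unserialisable:
    """Hypothetical stand-in for an object that cannot be pickled."""
    def __reduce__(self):
        raise RuntimeError('cannot serialise this object')

def write_cache(path, items):
    # Opening in 'wb' mode creates (or truncates) the file immediately ..
    with open(path, 'wb') as f:
        # .. so if serialisation fails here, before anything is written,
        # a zero-byte file is left behind on disk.
        f.write(pickle.dumps(items))

cache_path = os.path.join(tempfile.mkdtemp(), 'internal-cache.dat')

try:
    write_cache(cache_path, [Unserialisable()])
except RuntimeError:
    write_failed = True
else:
    write_failed = False

size_after_failure = os.path.getsize(cache_path)
print('write failed:', write_failed, '- file size:', size_after_failure)
```

If this is what happens, the original exception is only visible on the run that creates the file. Later runs find the empty file first and fail while reading it.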

However, I do not understand one part - is EOFError the only exception that you see now?

I mean, there should still be some kind of an error raised, like the one that you saw initially: “AttributeError: module ‘gevent._greenlet’ has no attribute”.

# On branch main
nothing to commit, working directory clean
commit 8532cda8d1a4e8c9fc00db53eee947629fc2a863
Author: Dariusz Suchojad <dsuch-github@m.zato.io>
Date:   Wed Jun 3 11:13:39 2020 +0200

    GH #1052 - Adding a dark theme to API documentation generator.

I only receive this the first time it fires up without the internal-cache.dat file. Every time after that, the output is as shown a couple of posts above.

I have pushed a commit to delete the file if it is empty on server startup - can you please git pull the latest changes, try it out and send in the contents of server.log afterwards? Thanks.
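
The idea, as a rough sketch only (this is not the actual commit; `remove_if_empty` and `load_cache_or_none` are hypothetical helper names and plain `pickle` stands in for `dill`):

```python
import os
import pickle
import tempfile

def remove_if_empty(path):
    """Delete `path` if it exists with size zero; return True if it was deleted."""
    if os.path.isfile(path) and os.path.getsize(path) == 0:
        os.remove(path)
        return True
    return False

def load_cache_or_none(path):
    """Return the cached object, or None if there is no usable cache file."""
    remove_if_empty(path)
    if not os.path.exists(path):
        # No cache: the caller would deploy from scratch and rebuild it
        return None
    with open(path, 'rb') as f:
        return pickle.load(f)

cache_dir = tempfile.mkdtemp()

# An empty cache file is removed and treated as "no cache"
empty_cache = os.path.join(cache_dir, 'internal-cache.dat')
open(empty_cache, 'wb').close()
result_for_empty = load_cache_or_none(empty_cache)

# A valid cache file is loaded as usual
valid_cache = os.path.join(cache_dir, 'valid-cache.dat')
with open(valid_cache, 'wb') as f:
    pickle.dump({'services': ['demo.service']}, f)
result_for_valid = load_cache_or_none(valid_cache)
```

With a check like this, an empty file behaves the same as a missing one, so startup no longer stops on EOFError and whatever prevented the cache from being written can show up in server.log again.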

https://pastebin.com/sfbsZi6U