The second server does not start


#1

Good day,
I cannot start two servers.
At first everything starts, but once I begin adding services the second server goes down and no longer comes back up.
The errors vary, but everything revolves around the service deployment function deploy_missing_services (zato-server/src/zato/server/base/parallel/__init__.py).
The errors are as follows (in very rare cases the second server even manages to start):

  • TimeoutError: QueuePool limit of size 1 overflow 10 reached, connection timed out, timeout 30
    Increasing the pool_size parameter to 100 seems to stop it from appearing, but perhaps the real cause is something else (see the sketch after this list)
  • InvalidRequestError: This session is in ‘prepared’ state; No further SQL can be emitted within this transaction.
  • ResourceClosedError: This Connection is closed
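
For reference, here is a minimal sketch (not Zato's actual engine setup; the connection URL and values are purely illustrative) of where the numbers in the first error come from: SQLAlchemy's QueuePool reports its configured pool_size, max_overflow and pool_timeout when a connection checkout times out, so the "size 1", "overflow 10" and "timeout 30" in the message map directly to these parameters:

    # Minimal sketch, assuming a PostgreSQL ODB; the URL is hypothetical and the
    # values simply mirror the ones reported in the error message.
    from sqlalchemy import create_engine

    engine = create_engine(
        'postgresql://zato:password@localhost/zato_db',  # hypothetical connection string
        pool_size=1,      # permanent connections kept in the pool -> "size 1"
        max_overflow=10,  # extra connections allowed beyond pool_size -> "overflow 10"
        pool_timeout=30,  # seconds to wait for a free connection -> "timeout 30", then TimeoutError
    )

Note that raising pool_size only widens the pool; if connections are checked out during deployment and never returned, a larger pool merely postpones the same timeout.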

Zato 3.0.0rc1+rev.4d6e14d6


#2

Hello,

I am getting the above-mentioned error as well. Whenever I add services, one of the servers shuts down saying 'QueuePool limit of size 1 overflow 10 reached'. I have observed the ResourceClosedError as well.

Stack trace is as below:

File "/home/zato/zato/code/eggs/gunicorn-18.0-py2.7.egg/gunicorn/arbiter.py", line 495, in spawn_worker
    self.cfg.post_fork(self, worker)
  File "/home/zato/zato/code/zato-server/src/zato/server/base/parallel/__init__.py", line 538, in post_fork
    ParallelServer.start_server(worker.app.zato_wsgi_app, arbiter.zato_deployment_key)
  File "/home/zato/zato/code/zato-server/src/zato/server/base/parallel/__init__.py", line 401, in start_server
    self._after_init_accepted(locally_deployed)
  File "/home/zato/zato/code/zato-server/src/zato/server/base/parallel/config.py", line 369, in _after_init_accepted
    self.deploy_missing_services(locally_deployed)
  File "/home/zato/zato/code/zato-server/src/zato/server/base/parallel/__init__.py", line 173, in deploy_missing_services
    msg.package_id = hot_deploy(self, file_name, full_path, notify=False)
  File "/home/zato/zato/code/zato-common/src/zato/common/util.py", line 618, in hot_deploy
    now, di, file_name, open(path, 'rb').read(), parallel_server.id)
  File "/home/zato/zato/code/zato-common/src/zato/common/odb/api.py", line 563, in hot_deploy
    filter(Cluster.id == self.server.cluster_id).\
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/query.py", line 2398, in one
    ret = list(self)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/query.py", line 2440, in __iter__
    self.session._autoflush()
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1264, in _autoflush
    self.flush()
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1985, in flush
    self._flush(objects)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 2103, in _flush
    transaction.rollback(_capture_exception=True)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/util/langhelpers.py", line 60, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 2067, in _flush
    flush_context.execute()
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/unitofwork.py", line 372, in execute
    rec.execute(self)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/unitofwork.py", line 526, in execute
    uow
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/persistence.py", line 46, in save_obj
    uowtransaction)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/persistence.py", line 141, in _organize_states_for_save
    states):
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/persistence.py", line 849, in _connections_for_states
    base_mapper)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 232, in connection
    return self._connection_for_bind(bind, execution_options)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 315, in _connection_for_bind
    conn = self._parent._connection_for_bind(bind, execution_options)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 326, in _connection_for_bind
    conn = bind.contextual_connect()
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1910, in contextual_connect
    self.pool.connect(),
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/pool.py", line 338, in connect
    return _ConnectionFairy._checkout(self)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/pool.py", line 645, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/pool.py", line 440, in checkout
    rec = pool._do_get()
  File "/home/zato/zato/code/eggs/SQLAlchemy-0.9.9-py2.7-linux-x86_64.egg/sqlalchemy/pool.py", line 960, in _do_get
    (self.size(), self.overflow(), self._timeout))
TimeoutError: QueuePool limit of size 1 overflow 10 reached, connection timed out, timeout 30
2017-10-12 12:45:45,321 - INFO - 19586:MainThread - gunicorn.main:176 - Worker exiting (pid: 19586)

Zato 3.0.0rc1+rev.4f095ca
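
As a purely illustrative aside (an assumption about the failure mode, not a confirmed diagnosis): seeing pool exhaustion together with "This session is in 'prepared' state" and ResourceClosedError is a pattern commonly associated with a single SQLAlchemy Session being used by several concurrent workers at once. A minimal sketch of the per-thread session pattern that avoids such sharing, with a hypothetical connection URL and mapped object:

    # Sketch only: illustrates SQLAlchemy's thread-local session registry,
    # not Zato's internals. The engine URL and the mapped object are hypothetical.
    from sqlalchemy import create_engine
    from sqlalchemy.orm import scoped_session, sessionmaker

    engine = create_engine('postgresql://zato:password@localhost/zato_db')
    Session = scoped_session(sessionmaker(bind=engine))

    def record_deployment(deployed_service):
        # Each calling thread transparently gets its own Session from the registry.
        session = Session()
        try:
            session.add(deployed_service)  # 'deployed_service' is any mapped object
            session.commit()
        except Exception:
            session.rollback()
            raise
        finally:
            Session.remove()  # close the session and return its connection to the pool

With one session per worker, a flush in progress in one worker cannot leave another worker's transaction unable to emit further SQL, and connections are returned to the pool promptly instead of piling up until the QueuePool timeout shown above.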


#3

Is what you observe a new phenomenon? A sudden change?


#4

Hi @dsuch,

No, this issue was observed with version Zato 3.0.0rc1+rev.gc59b848 as well.

The steps to reproduce this issue, as per my observations, are:

  1. Create a new Zato 3 cluster with two servers and start Zato
  2. Upload more than one service through web-admin
  3. Restart the Zato cluster

#5

I am facing the same issue on version Zato 3.0.0rc1+rev.


#6

Please describe exactly which SQL database it is, how many servers there are, which OS it is, and how you deploy code, e.g. through web-admin only or from pickup files too. Is it enough to deploy only from web-admin? How many times?

Thanks.


#7

@dsuch the OP can probably answer for their own setup, but here is ours:

  • CentOS 7
  • PostgreSQL 9.4

Services have been deployed via web-admin.
The same approach works with Zato 2.8 with no issues.


#8

Thanks, I have just confirmed it with MySQL 5.1.16 and the latest Zato 3.0 (one of the development branches). I will investigate it under ticket #800 on GitHub.


#9

Hello,

this is done and has been merged to main. Can you please try it out?

Thanks.