I have been detecting that many times either of the servers is down after starting the container. Sometimes it’s server1 (most commonly), sometimes it’s server2. Sometimes both are up. I’ve never seen both down upon start-up.
So I am going to leave here a number of cases and see if, when put together, something makes sense.
jsabater@kasumi:~$ docker start --interactive 511a3594b3dc
2018-10-17 07:59:30,224 CRIT Set uid to user 0
2018-10-17 07:59:30,233 INFO supervisord started with pid 7
2018-10-17 07:59:31,235 INFO spawned: 'postgresql' with pid 10
2018-10-17 07:59:31,238 INFO spawned: 'zato-web-admin' with pid 11
2018-10-17 07:59:31,239 INFO spawned: 'redis' with pid 12
2018-10-17 07:59:31,240 INFO spawned: 'zato-server1' with pid 13
2018-10-17 07:59:31,241 INFO spawned: 'zato-server2' with pid 15
2018-10-17 07:59:31,242 INFO spawned: 'ssh' with pid 17
2018-10-17 07:59:31,244 INFO spawned: 'zato-scheduler' with pid 18
2018-10-17 07:59:31,245 INFO spawned: 'zato-load-balancer' with pid 19
2018-10-17 07:59:31,248 DEBG fd 6 closed, stopped monitoring <POutputDispatcher at 140619366388944 for <Subprocess at 140619366735528 with name postgresql in state STARTING> (stdout)>
2018-10-17 07:59:31,248 DEBG 'postgresql' stderr output:
postgresql: unrecognized service
2018-10-17 07:59:31,248 DEBG fd 8 closed, stopped monitoring <POutputDispatcher at 140619366388584 for <Subprocess at 140619366735528 with name postgresql in state STARTING> (stderr)>
2018-10-17 07:59:31,248 INFO exited: postgresql (exit status 1; not expected)
2018-10-17 07:59:31,249 DEBG received SIGCLD indicating a child quit
2018-10-17 07:59:31,253 DEBG 'redis' stdout output:
12:C 17 Oct 07:59:31.252 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
2018-10-17 07:59:31,254 DEBG 'redis' stdout output:
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 3.0.6 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in standalone mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 6379
| `-._ `._ / _.-' | PID: 12
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
12:M 17 Oct 07:59:31.254 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
12:M 17 Oct 07:59:31.254 # Server started, Redis version 3.0.6
12:M 17 Oct 07:59:31.254 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
12:M 17 Oct 07:59:31.254 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
2018-10-17 07:59:31,264 DEBG 'redis' stdout output:
12:M 17 Oct 07:59:31.264 * DB loaded from disk: 0.010 seconds
12:M 17 Oct 07:59:31.264 * The server is now ready to accept connections on port 6379
2018-10-17 07:59:32,266 INFO spawned: 'postgresql' with pid 44
2018-10-17 07:59:32,267 INFO success: zato-web-admin entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-17 07:59:32,267 INFO success: redis entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-17 07:59:32,267 INFO success: zato-server1 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-17 07:59:32,267 INFO success: zato-server2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-17 07:59:32,267 INFO success: ssh entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-17 07:59:32,267 INFO success: zato-scheduler entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-17 07:59:32,267 INFO success: zato-load-balancer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-17 07:59:32,294 DEBG 'postgresql' stderr output:
postgresql: unrecognized service
2018-10-17 07:59:32,294 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 140619366388584 for <Subprocess at 140619366735528 with name postgresql in state STARTING> (stderr)>
2018-10-17 07:59:32,294 DEBG fd 6 closed, stopped monitoring <POutputDispatcher at 140619366388800 for <Subprocess at 140619366735528 with name postgresql in state STARTING> (stdout)>
2018-10-17 07:59:32,294 INFO exited: postgresql (exit status 1; not expected)
2018-10-17 07:59:32,294 DEBG received SIGCLD indicating a child quit
2018-10-17 07:59:34,307 INFO spawned: 'postgresql' with pid 98
2018-10-17 07:59:34,342 DEBG 'postgresql' stderr output:
postgresql: unrecognized service
2018-10-17 07:59:34,342 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 140619366388584 for <Subprocess at 140619366735528 with name postgresql in state STARTING> (stderr)>
2018-10-17 07:59:34,342 DEBG fd 6 closed, stopped monitoring <POutputDispatcher at 140619366389088 for <Subprocess at 140619366735528 with name postgresql in state STARTING> (stdout)>
2018-10-17 07:59:34,342 INFO exited: postgresql (exit status 1; not expected)
2018-10-17 07:59:34,342 DEBG received SIGCLD indicating a child quit
2018-10-17 07:59:35,642 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,642 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_add_remove_server of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [add_remove_server]
2018-10-17 07:59:35,643 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,643 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_execute_command of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [execute_command]
2018-10-17 07:59:35,643 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,643 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_get_config of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [get_config]
2018-10-17 07:59:35,643 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,643 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_get_config_source_code of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [get_config_source_code]
2018-10-17 07:59:35,643 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,643 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_get_server_data_dict of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [get_server_data_dict]
2018-10-17 07:59:35,644 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,643 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_get_servers_state of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [get_servers_state]
2018-10-17 07:59:35,644 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,644 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_get_uptime_info of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [get_uptime_info]
2018-10-17 07:59:35,644 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,644 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_get_work_config of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [get_work_config]
2018-10-17 07:59:35,644 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,644 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_haproxy_version_info of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [haproxy_version_info]
2018-10-17 07:59:35,644 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,644 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_is_haproxy_alive of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [is_haproxy_alive]
2018-10-17 07:59:35,645 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,644 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_ping of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [ping]
2018-10-17 07:59:35,645 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,645 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_rename_server of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [rename_server]
2018-10-17 07:59:35,645 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,645 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_validate_save of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [validate_save]
2018-10-17 07:59:35,645 DEBG 'zato-load-balancer' stdout output:
2018-10-17 07:59:35,645 - INFO - 85:MainThread - root:114 - Registering [<bound method LoadBalancerAgent._lb_agent_validate_save_source_code of <zato.agent.load_balancer.server.LoadBalancerAgent object at 0x7fbc6c334e50>>] under public name [validate_save_source_code]
2018-10-17 07:59:36,166 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:36,166 - INFO - 70:MainThread - gunicorn.main:176 - Starting gunicorn 18.0
2018-10-17 07:59:36,170 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:36,169 - INFO - 70:MainThread - gunicorn.main:176 - Listening at: http://0.0.0.0:17011 (70)
2018-10-17 07:59:36,170 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:36,170 - INFO - 70:MainThread - gunicorn.main:176 - Using worker: gevent
2018-10-17 07:59:36,175 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:36,174 - INFO - 132:MainThread - gunicorn.main:176 - Booting worker with pid: 132
2018-10-17 07:59:36,199 DEBG 'zato-server1' stderr output:
No handlers could be found for logger "zato.common.util.proc"
2018-10-17 07:59:36,277 DEBG fd 20 closed, stopped monitoring <POutputDispatcher at 140619366391104 for <Subprocess at 140619366388224 with name zato-server1 in state RUNNING> (stdout)>
2018-10-17 07:59:36,277 DEBG fd 24 closed, stopped monitoring <POutputDispatcher at 140619366391608 for <Subprocess at 140619366388224 with name zato-server1 in state RUNNING> (stderr)>
2018-10-17 07:59:36,277 INFO exited: zato-server1 (exit status 18; not expected)
2018-10-17 07:59:36,277 DEBG received SIGCLD indicating a child quit
2018-10-17 07:59:36,679 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:36,679 - INFO - 132:MainThread - zato.server.base.parallel:421 - Preferred address of `server2@quickstart-1000480` (pid: 132) is `http://172.17.0.1:17011`
2018-10-17 07:59:37,681 INFO spawned: 'postgresql' with pid 148
2018-10-17 07:59:37,687 DEBG 'postgresql' stderr output:
postgresql: unrecognized service
2018-10-17 07:59:37,687 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 140619366391752 for <Subprocess at 140619366735528 with name postgresql in state STARTING> (stderr)>
2018-10-17 07:59:37,687 DEBG fd 6 closed, stopped monitoring <POutputDispatcher at 140619366391960 for <Subprocess at 140619366735528 with name postgresql in state STARTING> (stdout)>
2018-10-17 07:59:37,687 INFO exited: postgresql (exit status 1; not expected)
2018-10-17 07:59:37,687 DEBG received SIGCLD indicating a child quit
2018-10-17 07:59:38,688 INFO gave up: postgresql entered FATAL state, too many start retries too quickly
2018-10-17 07:59:59,225 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,224 - INFO - 132:MainThread - zato.server.base.parallel:187 - Found extra services to deploy: booking.cancel, booking.cancel, booking.create, booking.create, booking.delete, booking.delete, booking.get, booking.get, booking.list, booking.list, booking.locate, booking.locate, booking.restore, booking.restore, booking.update, booking.update, guest.bookings, guest.bookings, guest.create, guest.create, guest.delete, guest.delete, guest.get, guest.get, guest.list, guest.list, guest.restore, guest.restore, guest.update, guest.update, login.create, login.create, login.delete, login.delete, login.get, login.get, login.list, login.list, login.update, login.update, login.validate, login.validate, rate.create, rate.create, rate.delete, rate.delete, rate.get, rate.get, rate.list, rate.list, rate.update, rate.update, room.create, room.create, room.delete, room.delete, room.get, room.get, room.list, room.list, room.update, room.update
2018-10-17 07:59:59,291 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,291 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 07:59:59,311 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,311 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 07:59:59,608 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,608 - INFO - 132:MainThread - zato.hot-deploy.create:156 - Uploaded package id:`252`, payload_name:`login.py`
2018-10-17 07:59:59,610 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,609 - INFO - 132:MainThread - zato.server.base.parallel:226 - Deployed extra services found: [u'login.create', u'login.create', u'login.delete', u'login.delete', u'login.get', u'login.get', u'login.list', u'login.list', u'login.update', u'login.update', u'login.validate', u'login.validate']
2018-10-17 07:59:59,639 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,639 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 07:59:59,659 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,658 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 07:59:59,901 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,900 - INFO - 132:MainThread - zato.hot-deploy.create:156 - Uploaded package id:`253`, payload_name:`room.py`
2018-10-17 07:59:59,902 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,902 - INFO - 132:MainThread - zato.server.base.parallel:226 - Deployed extra services found: [u'room.create', u'room.create', u'room.delete', u'room.delete', u'room.get', u'room.get', u'room.list', u'room.list', u'room.update', u'room.update']
2018-10-17 07:59:59,930 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,929 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 07:59:59,949 DEBG 'zato-server2' stdout output:
2018-10-17 07:59:59,949 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 08:00:00,193 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:00,192 - INFO - 132:MainThread - zato.hot-deploy.create:156 - Uploaded package id:`254`, payload_name:`rate.py`
2018-10-17 08:00:00,194 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:00,194 - INFO - 132:MainThread - zato.server.base.parallel:226 - Deployed extra services found: [u'rate.create', u'rate.create', u'rate.delete', u'rate.delete', u'rate.get', u'rate.get', u'rate.list', u'rate.list', u'rate.update', u'rate.update']
2018-10-17 08:00:00,225 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:00,225 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 08:00:00,245 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:00,245 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 08:00:00,618 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:00,617 - INFO - 132:MainThread - zato.hot-deploy.create:156 - Uploaded package id:`255`, payload_name:`booking.py`
2018-10-17 08:00:00,619 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:00,618 - INFO - 132:MainThread - zato.server.base.parallel:226 - Deployed extra services found: [u'booking.cancel', u'booking.cancel', u'booking.create', u'booking.create', u'booking.delete', u'booking.delete', u'booking.get', u'booking.get', u'booking.list', u'booking.list', u'booking.locate', u'booking.locate', u'booking.restore', u'booking.restore', u'booking.update', u'booking.update']
2018-10-17 08:00:00,649 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:00,648 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 08:00:00,668 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:00,668 - INFO - 132:MainThread - zato.hot-deploy.create:376 - Creating tar archive
2018-10-17 08:00:01,007 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:01,007 - INFO - 132:MainThread - zato.hot-deploy.create:156 - Uploaded package id:`256`, payload_name:`guest.py`
2018-10-17 08:00:01,009 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:01,008 - INFO - 132:MainThread - zato.server.base.parallel:226 - Deployed extra services found: [u'guest.bookings', u'guest.bookings', u'guest.create', u'guest.create', u'guest.delete', u'guest.delete', u'guest.get', u'guest.get', u'guest.list', u'guest.list', u'guest.restore', u'guest.restore', u'guest.update', u'guest.update']
2018-10-17 08:00:01,035 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:01,034 - INFO - 132:MainThread - zato.server.base.parallel:477 - First worker of `server2` is 132
2018-10-17 08:00:01,047 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:01,047 - INFO - 132:DummyThread-3 - zato.helpers.input-logger:885 - {u'impl_name': u'zato.server.service.internal.helpers.InputLogger', u'name': u'zato.helpers.input-logger', u'cid': '5c1f3a5dafb46e6cac03db8c', u'invocation_time': datetime.datetime(2018, 10, 17, 8, 0, 1, 46627), u'job_type': None, u'data_format': None, u'slow_threshold': 99999, u'request.payload': 'Sample payload for a startup service (first worker)', u'wsgi_environ': {u'zato.request_ctx.async_msg': Bunch(action='101802', channel='startup-service', cid='5c1f3a5dafb46e6cac03db8c', msg_type='0001', payload='Sample payload for a startup service (first worker)', service='zato.helpers.input-logger'), u'zato.request_ctx.fanout_cid': None, u'zato.request_ctx.in_reply_to': None, u'zato.request_ctx.parallel_exec_cid': None}, u'environ': {}, u'usage': 62, u'channel': 'startup-service'}
2018-10-17 08:00:01,247 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:01,247 - INFO - 132:DummyThread-4 - zato.common.ipc.subscriber.Subscriber:100 - Established sub/bind to ipc:///tmp/zato-ipc-quickstart_1000480_server2_132 (self.pid: 132)
2018-10-17 08:00:01,446 DEBG 'zato-server2' stdout output:
2018-10-17 08:00:01,446 - INFO - 132:MainThread - zato.server.base.parallel:513 - Started `server2@quickstart-1000480` (pid: 132)
2018-10-17 08:00:04,094 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:04,093 - INFO - 168:MainThread - __main__:61 - Scheduler starting (https://0.0.0.0:31530)
2018-10-17 08:00:04,639 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:04,638 - INFO - 168:DummyThread-1 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19262e50; cb_kwargs:`{u'service': u'zato.stats.summary.create-summary-by-week', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`2`; interval:`<Interval at 0x7fab19279a50; days:`0`; hours:`0`; in_seconds:`600.0`; minutes:`10`; seconds:`0`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.summary.create-summary-by-week'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 7, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:04,840 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:04,840 - INFO - 168:DummyThread-2 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19279210; cb_kwargs:`{u'service': u'zato.outgoing.sql.auto-ping', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`6`; interval:`<Interval at 0x7fab19279290; days:`0`; hours:`0`; in_seconds:`180.0`; minutes:`3`; seconds:`0`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.outgoing.sql.auto-ping'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 2, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:05,042 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:05,042 - INFO - 168:DummyThread-3 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab192794d0; cb_kwargs:`{u'service': u'zato.stats.aggregate-by-day', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`10`; interval:`<Interval at 0x7fab19008d10; days:`0`; hours:`0`; in_seconds:`3600.0`; minutes:`60`; seconds:`0`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.aggregate-by-day'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 47, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:05,245 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:05,244 - INFO - 168:DummyThread-4 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19279690; cb_kwargs:`{u'service': u'zato.stats.aggregate-by-hour', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`3`; interval:`<Interval at 0x7fab19279350; days:`0`; hours:`0`; in_seconds:`600.0`; minutes:`10`; seconds:`0`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.aggregate-by-hour'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 7, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:05,447 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:05,446 - INFO - 168:DummyThread-5 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19279710; cb_kwargs:`{u'service': u'zato.stats.aggregate-by-minute', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`8`; interval:`<Interval at 0x7fab19279410; days:`0`; hours:`0`; in_seconds:`60.0`; minutes:`0`; seconds:`60`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.aggregate-by-minute'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 0, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:05,649 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:05,648 - INFO - 168:DummyThread-6 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19279810; cb_kwargs:`{u'service': u'zato.stats.aggregate-by-month', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`7`; interval:`<Interval at 0x7fab19279590; days:`0`; hours:`0`; in_seconds:`3600.0`; minutes:`60`; seconds:`0`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.aggregate-by-month'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 47, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:05,851 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:05,850 - INFO - 168:DummyThread-7 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19279990; cb_kwargs:`{u'service': u'zato.stats.process-raw-times', u'extra': 'max_batch_size=99999'}`; cron_definition:`None`; current_run:`0`; id:`9`; interval:`<Interval at 0x7fab192796d0; days:`0`; hours:`0`; in_seconds:`90.0`; minutes:`0`; seconds:`90`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.process-raw-times'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 1, 8)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:06,053 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:06,052 - INFO - 168:DummyThread-8 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19279b10; cb_kwargs:`{u'service': u'zato.stats.summary.create-summary-by-day', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`4`; interval:`<Interval at 0x7fab19279750; days:`0`; hours:`0`; in_seconds:`600.0`; minutes:`10`; seconds:`0`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.summary.create-summary-by-day'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 7, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:06,255 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:06,255 - INFO - 168:DummyThread-9 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19279bd0; cb_kwargs:`{u'service': u'zato.stats.summary.create-summary-by-month', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`1`; interval:`<Interval at 0x7fab192798d0; days:`0`; hours:`0`; in_seconds:`3600.0`; minutes:`60`; seconds:`0`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.summary.create-summary-by-month'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 47, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:06,458 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:06,457 - INFO - 168:DummyThread-10 - zato.scheduler.backend:264 - Job starting `<Job at 0x7fab19284e90; cb_kwargs:`{u'service': u'zato.stats.summary.create-summary-by-year', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`5`; interval:`<Interval at 0x7fab19262e10; days:`0`; hours:`0`; in_seconds:`3600.0`; minutes:`60`; seconds:`0`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.summary.create-summary-by-year'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 47, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:06,659 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:06,658 - INFO - 168:DummyThread-11 - zato.scheduler.backend:466 - Scheduler started
2018-10-17 08:00:38,478 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:38,478 - INFO - 168:DummyThread-5 - zato.scheduler.backend:217 - Job entering main loop `<Job at 0x7fab19279710; cb_kwargs:`{u'service': u'zato.stats.aggregate-by-minute', u'extra': ''}`; cron_definition:`None`; current_run:`0`; id:`8`; interval:`<Interval at 0x7fab19279410; days:`0`; hours:`0`; in_seconds:`60.0`; minutes:`0`; seconds:`60`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.aggregate-by-minute'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 0, 38)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
2018-10-17 08:00:38,482 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:00:38,482 - INFO - 168:DummyThread-12 - zato.scheduler.backend:419 - Job executed `zato.stats.aggregate-by-minute`, `{u'interval_in_seconds': 60.0, u'name': u'zato.stats.aggregate-by-minute', u'current_run': 1, u'cid': '2c1d02567433789d416b2192', u'start_time': '2018-10-17T08:00:38', u'max_repeats_reached': False, u'max_repeats': None, u'cb_kwargs': {u'service': u'zato.stats.aggregate-by-minute', u'extra': ''}, u'type': u'interval_based', u'id': 8}`
2018-10-17 08:01:08,898 DEBG 'zato-scheduler' stdout output:
2018-10-17 08:01:08,897 - INFO - 168:DummyThread-7 - zato.scheduler.backend:217 - Job entering main loop `<Job at 0x7fab19279990; cb_kwargs:`{u'service': u'zato.stats.process-raw-times', u'extra': 'max_batch_size=99999'}`; cron_definition:`None`; current_run:`0`; id:`9`; interval:`<Interval at 0x7fab192796d0; days:`0`; hours:`0`; in_seconds:`90.0`; minutes:`0`; seconds:`90`>`; is_active:`True`; keep_running:`True`; max_repeats:`None`; max_repeats_reached:`False`; max_repeats_reached_at:`None`; name:`u'zato.stats.process-raw-times'`; old_name:`None`; start_time:`datetime.datetime(2018, 10, 17, 8, 1, 8)`; type:`u'interval_based'`; wait_iter_cb:`None`; wait_iter_cb_args:`()`; wait_sleep_time:`1`>`
Left one of the servers down:
jsabater@kasumi:~$ docker exec -u 0 -it 511a3594b3dc bash
root@kasumi:/opt/zato/env/qs-1# su - zato
zato@kasumi:~$ ps x -u zato
PID TTY STAT TIME COMMAND
11 pts/0 S 0:00 /bin/bash /opt/zato/zato_start_web_admin
12 pts/0 Sl 0:00 redis-server *:6379
15 pts/0 S 0:00 /bin/bash /opt/zato/zato_start_server2
18 pts/0 S 0:00 /bin/bash /opt/zato/zato_start_scheduler
19 pts/0 S 0:00 /bin/bash /opt/zato/zato_start_load_balancer
27 pts/0 S 0:01 /opt/zato/3.0/code/bin/python2.7 /opt/zato/current/bin/zato start /opt/zato/env/qs-1/server2 --fg
30 pts/0 S 0:01 /opt/zato/3.0/code/bin/python2.7 /opt/zato/current/bin/zato start /opt/zato/env/qs-1/web-admin --fg
31 pts/0 S 0:01 /opt/zato/3.0/code/bin/python2.7 /opt/zato/current/bin/zato start /opt/zato/env/qs-1/load-balancer --fg
69 pts/0 Sl 0:02 /opt/zato/3.0/code/bin/python -m zato.admin.main sync_internal=FalseZATO_ZATO_ZATOsecret_key=ZATO_ZATO_ZATOfg=True
70 pts/0 Sl 0:02 gunicorn: master [gunicorn]
85 pts/0 S 0:01 /opt/zato/3.0/code/bin/python -m zato.agent.load_balancer.main /opt/zato/env/qs-1/load-balancer/config/repo sync_internal=FalseZATO_ZATO_ZATOsecret_key=ZATO_ZATO_ZATOfg=True
130 ? Ss 0:00 haproxy -D -f /opt/zato/env/qs-1/load-balancer/config/repo/zato.config -p /opt/zato/env/qs-1/load-balancer/pidfile
132 pts/0 Sl 0:17 gunicorn: worker [gunicorn]
134 pts/1 S 0:00 -su
159 pts/0 S 0:01 /opt/zato/3.0/code/bin/python2.7 /opt/zato/current/bin/zato start /opt/zato/env/qs-1/scheduler --fg
168 pts/0 Sl 0:02 /opt/zato/3.0/code/bin/python -m zato.scheduler.main sync_internal=FalseZATO_ZATO_ZATOsecret_key=ZATO_ZATO_ZATOfg=True
183 pts/1 R+ 0:00 ps x -u zato
As we can see, it was server1 in this case that could not start.