No KVDB payload warnings


#1

Hi,

On a new Zato 2.0.8 instance, we are getting this warning regularly:

No KVDB payload for key [zato:broker:to-parallel:any:K07ZT79S9VWF8ER4JS0N8PDK6Q1S.tmp] (already expired?)

It does not sound lethal. However, it would be nice to get rid of these warnings. What could be the cause of these warnings?

Regards, Jan


#2

Hi @jjmurre,

do you mean regularly in a literal sense that there is a pattern of N seconds to it? I’m asking because this could be related to some internal scheduler jobs. Are you perhaps able to correlate it with entries in logs/scheduler.log? Does it appear more frequently after, say, carrying out some tasks in web-admin?

More details would be available on DEBUG logging level. That would be the ‘zato’ logger in config/repo/logging.conf.

With debug logging enabled you could catch the apparently missing Redis key as it’s being stored in Redis and see what is inside, what kind of internal message it was that was published and then expired.

Is Redis under heavy load in this environment?

Thanks.


#3

Hi @dsuch,

It is more that is happens in bursts. So this warning does not happen for a period (say 2 hours). Then it happens a couple of times (around 10 times), then it is gone again.

The zato server has a scheduled job (calling a soap service) that get lots of timeouts (timeout in outgoing soap channel is 1200 sec. and still we get timeouts). Maybe that is related.

I enabled DEBUG logging and I see a lot of these messages (but not the warning):

2017-10-05 16:00:18,402 - DEBUG - 26407:DummyThread-26 - zato.broker.client:22 - No payload in msg:[channel: /zato/to-parallel/any
data: zato:broker:to-parallel:any:K05MYDQ9ZMCEHXEHM23C634VSJGE
pattern: null
type: message
]

We also experience a problem with the scheduler. After a while (1-2 days) the Zato server stops executing scheduled jobs. Also the ‘last_keep_alive’ message disappears from the logs. We are trying to find out if the timeouts with the SOAP service are causing this, but cannot find a clear connection until now.

Is there a way to restart scheduling for a running Zato server?

Regards, Jan


#4

Jan, is all of it new with 2.0.8 after the upgrade?


#5

Yes, on 2.0.7 we did not seem to have problems with the scheduler.

Regards, Jan

PS. We switched to 2.0.8, because on Redhat Enterprise, Sentry logging would not work on Zato 2.0.7
And we also see some timeouts with Sentry (our Sentry server has a high load), so that could also be the cause of our problems.


#6

Right, I see.

There were indeed changes in 2.0.8 related to how background Redis connections work, these are what servers use internally to exchange messages, e.g. about messages published by the scheduler.

The whole of the changes are in fact these two commits below at the end of this message.

Their purpose was to prevent what you are describing - in some environments with very slow Redis connections the Redis client would drop connections and, since it was going on in its own greenlet, the whole thing would happen silently. Now the two commits re-establish the connection when it’s broken.

(As a side note, in Zato 3.0 the scheduler is in its own component started with zato start /path/to/scheduler precisely to make it clearer what entity is responsible for the jobs).

I’d like to ask for one thing, there is a command called zato info which returns information about a component, running or not, including what TCP connections this component is part of, like below.

There is a worker process with a PID of 5571 and we can find in worker_5571_connections that it has a TCP connection in state ESTABLISHED to 127.0.0.1:6379 which is Redis.

Can you confirm it on your end (on each server) that you still do or do not have a Redis connection the next time when you observe that the scheduler is not running?

$ zato info /home/dsuch/env/qs-cache1/server1
+--------------------------------+--------------------------------------------------------------------------------------------+
|              Key               |                                           Value                                            |
+================================+============================================================================================+
| component_details              | {"created_user_host": "dsuch@angmar", "version": "Zato 3.0.0rc1+rev.5c6b15ed",             |
|                                | "component": "SERVER", "created_ts": "2017-09-29T10:57:21.390132"}                         |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_full_path            | /home/dsuch/env/qs-cache1/server1                                                          |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_host                 | angmar/angmar                                                                              |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_running              | True                                                                                       |
+--------------------------------+--------------------------------------------------------------------------------------------+
| current_time                   | 2017-10-05T16:47:33.477040                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| current_time_utc               | 2017-10-05T14:47:33.477047                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_connections        | {'LISTEN': [{u'to': None, u'from': u'0.0.0.0:17010', u'formatted': u'0.0.0.0:17010'}]}     |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_create_time        | 2017-10-05T16:40:30.490000                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_create_time_utc    | 2017-10-05T14:40:30.490000+00:00                                                           |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_name               | gunicorn:                                                                                  |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_pid                | 5563                                                                                       |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_username           | dsuch                                                                                      |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_workers_no         | 1                                                                                          |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_workers_pids       | [5571]                                                                                     |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_5571_connections        | {'ESTABLISHED': [{u'to': u'127.0.0.1:3306', u'from': u'127.0.0.1:39916', u'formatted':     |
|                                | u'127.0.0.1:39916       ->        127.0.0.1:3306'}, {u'to': u'127.0.0.1:6379', u'from':    |
|                                | u'127.0.0.1:60336', u'formatted': u'127.0.0.1:60336       ->        127.0.0.1:6379'},      |
|                                | {u'to': u'127.0.0.1:6379', u'from': u'127.0.0.1:60338', u'formatted': u'127.0.0.1:60338    |
|                                | ->        127.0.0.1:6379'}, {u'to': u'127.0.0.1:6379', u'from': u'127.0.0.1:60334',        |
|                                | u'formatted': u'127.0.0.1:60334       ->        127.0.0.1:6379'}, {u'to':                  |
|                                | u'127.0.0.1:6379', u'from': u'127.0.0.1:60320', u'formatted': u'127.0.0.1:60320       ->   |
|                                | 127.0.0.1:6379'}, {u'to': u'127.0.0.1:49517', u'from': u'127.0.0.1:60830', u'formatted':   |
|                                | u'127.0.0.1:60830       ->       127.0.0.1:49517'}, {u'to': u'127.0.0.1:49517', u'from':   |
|                                | u'127.0.0.1:60832', u'formatted': u'127.0.0.1:60832       ->       127.0.0.1:49517'}],     |
|                                | 'LISTEN': [{u'to': None, u'from': u'0.0.0.0:48901', u'formatted': u'0.0.0.0:48901'},       |
|                                | {u'to': None, u'from': u'0.0.0.0:17010', u'formatted': u'0.0.0.0:17010'}, {u'to': None,    |
|                                | u'from': u'0.0.0.0:50100', u'formatted': u'0.0.0.0:50100'}, {u'to': None, u'from':         |
|                                | u'0.0.0.0:50101', u'formatted': u'0.0.0.0:50101'}]}                                        |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_5571_create_time        | 2017-10-05T16:40:32.030000                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_5571_create_time_utc    | 2017-10-05T14:40:32.030000+00:00                                                           |
+--------------------------------+--------------------------------------------------------------------------------------------+

#7

I have two server without scheduling, however, there seems to be a connection to Redis:

 +--------------------------------+--------------------------------------------------------------------------------------------+
|              Key               |                                           Value                                            |
+================================+============================================================================================+
| component_details              | {"created_user_host": "app-servicebus-tst2@SCL-VAC-OTA", "version": "Zato 2.0.8.rev-       |
|                                | 74bfd513", "component": "SERVER", "created_ts": "2017-09-27T18:32:00.935198"}              |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_full_path            | /opt/APPS/servicebus/tst2/zato/server1                                                     |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_host                 | SCL-VAC-OTA/SCL-VAC-OTA.local                                                              |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_running              | True                                                                                       |
+--------------------------------+--------------------------------------------------------------------------------------------+
| current_time                   | 2017-10-05T18:33:35.491932                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| current_time_utc               | 2017-10-05T16:33:35.491951                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_connections        | {'ESTABLISHED': [{u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:42895', u'formatted':     |
|                                | u'127.0.0.1:42895       ->        127.0.0.1:5432'}], 'LISTEN': [{u'to': None, u'from':     |
|                                | u'127.0.0.1:8808', u'formatted': u'127.0.0.1:8808'}]}                                      |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_create_time        | 2017-10-02T17:51:50.290000                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_create_time_utc    | 2017-10-02T15:51:50.290000+00:00                                                           |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_name               | gunicorn:                                                                                  |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_pid                | 58877                                                                                      |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_username           | app-servicebus-tst2                                                                        |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_workers_no         | 2                                                                                          |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_workers_pids       | [58921, 64095]                                                                             |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_58921_connections       | {'ESTABLISHED': [{u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:44209', u'formatted':     |
|                                | u'127.0.0.1:44209       ->        127.0.0.1:5432'}, {u'to': u'127.0.0.1:5432', u'from':    |
|                                | u'127.0.0.1:42895', u'formatted': u'127.0.0.1:42895       ->        127.0.0.1:5432'},      |
|                                | {u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:56667', u'formatted': u'127.0.0.1:56667    |
|                                | ->        127.0.0.1:5432'}, {u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:44061',        |
|                                | u'formatted': u'127.0.0.1:44061       ->        127.0.0.1:5432'}, {u'to':                  |
|                                | u'127.0.0.1:6383', u'from': u'127.0.0.1:50269', u'formatted': u'127.0.0.1:50269       ->   |
|                                | 127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:49105', u'formatted':    |
|                                | u'127.0.0.1:49105       ->        127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from':    |
|                                | u'127.0.0.1:51607', u'formatted': u'127.0.0.1:51607       ->        127.0.0.1:6383'},      |
|                                | {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:35731', u'formatted': u'127.0.0.1:35731    |
|                                | ->        127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:37073',        |
|                                | u'formatted': u'127.0.0.1:37073       ->        127.0.0.1:6383'}, {u'to':                  |
|                                | u'127.0.0.1:6383', u'from': u'127.0.0.1:51604', u'formatted': u'127.0.0.1:51604       ->   |
|                                | 127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:54394', u'formatted':    |
|                                | u'127.0.0.1:54394       ->        127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from':    |
|                                | u'127.0.0.1:49317', u'formatted': u'127.0.0.1:49317       ->        127.0.0.1:6383'},      |
|                                | {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:53110', u'formatted': u'127.0.0.1:53110    |
|                                | ->        127.0.0.1:6383'}], 'LISTEN': [{u'to': None, u'from': u'127.0.0.1:8808',          |
|                                | u'formatted': u'127.0.0.1:8808'}]}                                                         |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_58921_create_time       | 2017-10-02T17:51:52.810000                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_58921_create_time_utc   | 2017-10-02T15:51:52.810000+00:00                                                           |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_64095_connections       | {'ESTABLISHED': [{u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:47693', u'formatted':     |
|                                | u'127.0.0.1:47693       ->        127.0.0.1:5432'}, {u'to': u'127.0.0.1:5432', u'from':    |
|                                | u'127.0.0.1:42895', u'formatted': u'127.0.0.1:42895       ->        127.0.0.1:5432'},      |
|                                | {u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:47564', u'formatted': u'127.0.0.1:47564    |
|                                | ->        127.0.0.1:5432'}, {u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:50745',        |
|                                | u'formatted': u'127.0.0.1:50745       ->        127.0.0.1:5432'}, {u'to':                  |
|                                | u'127.0.0.1:6383', u'from': u'127.0.0.1:40500', u'formatted': u'127.0.0.1:40500       ->   |
|                                | 127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:39818', u'formatted':    |
|                                | u'127.0.0.1:39818       ->        127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from':    |
|                                | u'127.0.0.1:50489', u'formatted': u'127.0.0.1:50489       ->        127.0.0.1:6383'},      |
|                                | {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:43518', u'formatted': u'127.0.0.1:43518    |
|                                | ->        127.0.0.1:6383'}], 'LISTEN': [{u'to': None, u'from': u'127.0.0.1:8808',          |
|                                | u'formatted': u'127.0.0.1:8808'}]}                                                         |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_64095_create_time       | 2017-10-04T00:52:09.880000                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_64095_create_time_utc   | 2017-10-03T22:52:09.880000+00:00                                                           |
+--------------------------------+--------------------------------------------------------------------------------------------+

Redis is running on port 6383 on this server.

Last lines from singleton.log for server1 and server2:

2017-10-04 00:46:47,934 - INFO - 58921:DummyThread-3935 - zato_singleton:22 - Cluster-wide singleton keep-alive OK, server id:[1], name:[server1]

2017-10-04 00:42:47,933 - INFO - 58043:DummyThread-2618 - zato_singleton:22 - Cluster-wide singleton keep-alive OK, server id:[1], name:[server1] 

Regards, Jan


#8

Thanks - but this is output for one server only, is that correct?

Is that server1 mentioned in singleton.log?


#9

Yes, this is the output for server2:

+--------------------------------+--------------------------------------------------------------------------------------------+
|              Key               |                                           Value                                            |
+================================+============================================================================================+
| component_details              | {"created_user_host": "app-servicebus-tst2@SCL-VAC-OTA", "version": "Zato 2.0.8.rev-       |
|                                | 74bfd513", "component": "SERVER", "created_ts": "2017-09-27T18:32:00.962612"}              |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_full_path            | /opt/APPS/servicebus/tst2/zato/server2                                                     |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_host                 | SCL-VAC-OTA/SCL-VAC-OTA.local                                                              |
+--------------------------------+--------------------------------------------------------------------------------------------+
| component_running              | True                                                                                       |
+--------------------------------+--------------------------------------------------------------------------------------------+
| current_time                   | 2017-10-05T20:21:33.084574                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| current_time_utc               | 2017-10-05T18:21:33.084593                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_connections        | {'ESTABLISHED': [{u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:42896', u'formatted':     |
|                                | u'127.0.0.1:42896       ->        127.0.0.1:5432'}], 'LISTEN': [{u'to': None, u'from':     |
|                                | u'127.0.0.1:8809', u'formatted': u'127.0.0.1:8809'}]}                                      |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_create_time        | 2017-10-02T17:51:50.290000                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_create_time_utc    | 2017-10-02T15:51:50.290000+00:00                                                           |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_name               | gunicorn:                                                                                  |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_pid                | 58878                                                                                      |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_username           | app-servicebus-tst2                                                                        |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_workers_no         | 2                                                                                          |
+--------------------------------+--------------------------------------------------------------------------------------------+
| master_proc_workers_pids       | [41140, 58043]                                                                             |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_41140_connections       | {'ESTABLISHED': [{u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:58314', u'formatted':     |
|                                | u'127.0.0.1:58314       ->        127.0.0.1:5432'}, {u'to': u'127.0.0.1:5432', u'from':    |
|                                | u'127.0.0.1:58440', u'formatted': u'127.0.0.1:58440       ->        127.0.0.1:5432'},      |
|                                | {u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:50702', u'formatted': u'127.0.0.1:50702    |
|                                | ->        127.0.0.1:5432'}, {u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:42896',        |
|                                | u'formatted': u'127.0.0.1:42896       ->        127.0.0.1:5432'}, {u'to':                  |
|                                | u'127.0.0.1:6383', u'from': u'127.0.0.1:54535', u'formatted': u'127.0.0.1:54535       ->   |
|                                | 127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:32947', u'formatted':    |
|                                | u'127.0.0.1:32947       ->        127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from':    |
|                                | u'127.0.0.1:51296', u'formatted': u'127.0.0.1:51296       ->        127.0.0.1:6383'},      |
|                                | {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:43454', u'formatted': u'127.0.0.1:43454    |
|                                | ->        127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:50444',        |
|                                | u'formatted': u'127.0.0.1:50444       ->        127.0.0.1:6383'}, {u'to':                  |
|                                | u'127.0.0.1:6383', u'from': u'127.0.0.1:44015', u'formatted': u'127.0.0.1:44015       ->   |
|                                | 127.0.0.1:6383'}], 'LISTEN': [{u'to': None, u'from': u'127.0.0.1:8809', u'formatted':      |
|                                | u'127.0.0.1:8809'}]}                                                                       |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_41140_create_time       | 2017-10-03T16:47:36.240000                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_41140_create_time_utc   | 2017-10-03T14:47:36.240000+00:00                                                           |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_58043_connections       | {'ESTABLISHED': [{u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:51143', u'formatted':     |
|                                | u'127.0.0.1:51143       ->        127.0.0.1:5432'}, {u'to': u'127.0.0.1:5432', u'from':    |
|                                | u'127.0.0.1:56754', u'formatted': u'127.0.0.1:56754       ->        127.0.0.1:5432'},      |
|                                | {u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:42896', u'formatted': u'127.0.0.1:42896    |
|                                | ->        127.0.0.1:5432'}, {u'to': u'127.0.0.1:5432', u'from': u'127.0.0.1:49229',        |
|                                | u'formatted': u'127.0.0.1:49229       ->        127.0.0.1:5432'}, {u'to':                  |
|                                | u'127.0.0.1:6383', u'from': u'127.0.0.1:50218', u'formatted': u'127.0.0.1:50218       ->   |
|                                | 127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:46429', u'formatted':    |
|                                | u'127.0.0.1:46429       ->        127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from':    |
|                                | u'127.0.0.1:44217', u'formatted': u'127.0.0.1:44217       ->        127.0.0.1:6383'},      |
|                                | {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:37107', u'formatted': u'127.0.0.1:37107    |
|                                | ->        127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:38285',        |
|                                | u'formatted': u'127.0.0.1:38285       ->        127.0.0.1:6383'}, {u'to':                  |
|                                | u'127.0.0.1:6383', u'from': u'127.0.0.1:33428', u'formatted': u'127.0.0.1:33428       ->   |
|                                | 127.0.0.1:6383'}, {u'to': u'127.0.0.1:6383', u'from': u'127.0.0.1:35969', u'formatted':    |
|                                | u'127.0.0.1:35969       ->        127.0.0.1:6383'}], 'LISTEN': [{u'to': None, u'from':     |
|                                | u'127.0.0.1:8809', u'formatted': u'127.0.0.1:8809'}]}                                      |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_58043_create_time       | 2017-10-02T21:04:44.030000                                                                 |
+--------------------------------+--------------------------------------------------------------------------------------------+
| worker_58043_create_time_utc   | 2017-10-02T19:04:44.030000+00:00                                                           |
+--------------------------------+--------------------------------------------------------------------------------------------+