Zato web-admin not accessible


#1

Hello,

Has anyone run into Zato web-admin suddenly becoming inaccessible?
We have the following setup:

  • CentOS 7
  • Zato 2.0.8.rev-050c6697
  • PostgreSQL 9.4

Everything starts up fine and works for a day or so, after which the web-admin URL on port 8183 becomes inaccessible.

The process seems to be running though:

$ netstat -anp |grep 8183
tcp 11 0 0.0.0.0:8183 0.0.0.0:* LISTEN 23241/python

$ ps -ef |grep 23241
zato 23241 1 0 Nov22 ? 00:00:28 /opt/zato/2.0.8/code/bin/python /opt/zato/2.0.8/code/bin/py -m zato.admin.main

We see the following connections in netstat for port 8183:

  • 10 in CLOSE_WAIT
  • 2 in ESTABLISHED
  • 3 in SYN_RECV
  • 1 in LISTEN

The only fix we have found is to restart web-admin, after which everything returns to normal. However, this has been happening daily, and nothing in web-admin.log helps us determine the root cause.

Is there a way we can investigate what's happening or going wrong?

Cheers


#2

Hi @nikhil,

web-admin is very rarely restarted. It typically runs for months or even years, even if servers are restarted from time to time, and I have never seen such a situation before.

How exactly do you confirm that web-admin is not accessible? When this situation happens, can you run the command below and check its output, unless you are doing that already?

$ telnet 127.0.0.1 8183
GET /zato HTTP/1.1
Host: 127.0.0.1

The point is to invoke web-admin over local TCP, thus confirming whether it replies when accessed directly.
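
If telnet is not at hand, the same check can be sketched in a few lines of standalone Python 3. This is only an illustration, not part of Zato: the function name `probe` and the 5-second default timeout are my own choices, and the host, port and path are simply the ones from the telnet example above. The useful part is that it distinguishes "cannot connect" from "connected, but nothing came back".

```python
import socket

def probe(host="127.0.0.1", port=8183, path="/zato", timeout=5.0):
    """Send one GET over a raw TCP socket and report what happened.

    Returns the HTTP status line on success, otherwise a short
    description of where the attempt got stuck.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as e:
        # Nothing is listening, or the connect itself timed out
        return "connect failed: %s" % e
    try:
        request = "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n" % (path, host)
        sock.sendall(request.encode("ascii"))
        data = sock.recv(1024)
    except socket.timeout:
        # TCP handshake succeeded but the process never answered
        return "connected, but no response within %.1f s" % timeout
    except OSError as e:
        return "I/O error after connect: %s" % e
    finally:
        sock.close()
    if not data:
        return "connection closed without a response"
    # The first line of the reply is the status line, e.g. "HTTP/1.1 301 ..."
    return data.split(b"\r\n", 1)[0].decode("latin-1")

if __name__ == "__main__":
    print(probe())
```

A "connected, but no response" result would point at the process being alive but blocked, whereas "connect failed" would point at the listener being gone altogether.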


#3

Hello @dsuch,

This is still happening. We’ve tried accessing it locally and it just times out.

When web-admin is in the hung state:

$ wget http://localhost:8183/zato
--2017-11-27 03:20:10-- http://localhost:8183/zato
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:8183... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:8183... connected.
HTTP request sent, awaiting response...
^C

After restarting web-admin:

$ wget http://localhost:8183/zato
--2017-11-27 03:25:03-- http://localhost:8183/zato
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:8183... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:8183... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /zato/ [following]
--2017-11-27 03:25:03-- http://localhost:8183/zato/
Connecting to localhost (localhost)|127.0.0.1|:8183... connected.
HTTP request sent, awaiting response... 302 Found
Location: /accounts/login/?next=/zato/ [following]
--2017-11-27 03:25:03-- http://localhost:8183/accounts/login/?next=/zato/
Connecting to localhost (localhost)|127.0.0.1|:8183... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘zato’

[ <=>                                                                                                ] 4,849       --.-K/s   in 0s      

2017-11-27 03:25:03 (622 MB/s) - ‘zato’ saved [4849]

Cheers


#4

Hi @nikhil,

It looks like the process itself is available but is blocked on something. Can you check what all these connections it has open are about?

Except for sessions from browsers and a connection to ODB, there should not be any long-running TCP connections.
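
To see who is on the other end of each connection, one option is to group the netstat lines for the port by state and remote peer. Below is a rough Python 3 sketch of that idea. The function name `peers_by_state` and the sample text are made up for illustration (the sample is not output from your host), and it assumes the usual Linux `netstat -ant` column layout.

```python
from collections import defaultdict

def peers_by_state(netstat_output, port):
    """Group remote addresses by TCP state for sockets bound to a local port."""
    suffix = ":%d" % port
    grouped = defaultdict(list)
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix):
            grouped[state].append(remote)
    return dict(grouped)

# Illustrative input only; addresses and ports are invented
SAMPLE = """\
tcp       11      0 0.0.0.0:8183            0.0.0.0:*               LISTEN
tcp        1      0 127.0.0.1:8183          127.0.0.1:51514         CLOSE_WAIT
tcp        1      0 127.0.0.1:8183          127.0.0.1:51520         CLOSE_WAIT
tcp        0      0 127.0.0.1:8183          127.0.0.1:51533         ESTABLISHED
tcp        0      0 127.0.0.1:51600         127.0.0.1:5432          ESTABLISHED
"""

if __name__ == "__main__":
    for state, peers in peers_by_state(SAMPLE, 8183).items():
        print(state, len(peers), peers)
```

CLOSE_WAIT means the remote side has already closed while the local process has not yet closed its end, so a growing list of peers in that state would fit a process that is stuck. Note that filtering on local port 8183 only shows inbound connections; the outgoing connection to ODB uses an ephemeral local port, so to see it you would need to filter by the web-admin PID instead, as in your `netstat -anp` output.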

There is a tool that I use to check output from processes running in the background; it’s called ispy.

It’s not ideal in all situations because a process will not always log to stdout. If it doesn’t show anything, can you run strace against the web-admin PID? There is an example of how to do it in the screenshots for ispy.

The easiest way to install ispy is to use the pex installer and then run chmod u+x ispy.