(Migrated) Long-running Zato service

(This message has been automatically imported from the retired mailing list)

Hi all,
I have a Zato service that is not expected to finish quickly because I use
it only to run scheduled jobs that process information. The service never
runs to completion and the following error shows up in the log:

2014-06-05 14:38:19,233 - CRITICAL - 30041:Dummy-1 - gunicorn.error:175 - WORKER TIMEOUT (pid:30051)

I want to ask if there is a way in Zato to disable timeouts completely for
a particular service, or for some pointers to where in the code I could
make modifications to accomplish this.

Cheers

My case is the following:

  • The scheduler invokes service1
  • service1 performs a lot of operations
  • Summed up, the operations take longer than main.gunicorn_timeout
  • gunicorn restarts my worker

service1 is never called over HTTP or by anything other than the scheduler
job. I can use the invoke_async method to let another service do the same
task as service1. If that resolves the problem, it’s fine with me: I only
need to add another thin layer over my job execution.
I just thought that job services could take as much time as they need. Or
shouldn’t they?
Can you confirm that invoke_async doesn’t suffer from timeouts?
Many thanks

On Fri, Jun 6, 2014 at 5:31 PM, Dariusz Suchojad dsuch@zato.io wrote:

On 06/06/2014 08:11 PM, Axel Mendoza Pupo wrote:

2014-06-05 14:38:19,233 - CRITICAL - 30041:Dummy-1 - gunicorn.error:175 - WORKER TIMEOUT (pid:30051)

I want to ask if there is a way in Zato to disable timeouts completely for
a particular service, or for some pointers to where in the code I could
make modifications to accomplish this.

Hi Axel,

the timeout you came across is set by gunicorn - it uses the timeout to
make sure there are no rogue processes/requests consuming all the worker
processes available.

From Zato’s end it can be changed in server.conf

https://zato.io/docs/admin/guide/install-config/config-server.html#main-gunicorn-timeout
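
As a rough illustration, the option named in the link above sits in the [main] section of server.conf. The value below is arbitrary (gunicorn timeouts are expressed in seconds) and is not a recommendation. The setting applies to the whole server, not to a single service:

```ini
[main]
gunicorn_timeout=3600
```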

There is no way to change it on a per-service basis and adding such a
feature would mean changing a lot of gunicorn’s internals, I believe.

That said, can you simply split your functionality into two services?

I understand your situation is now:

  • HTTP request to service1
  • service1 performs a lot of operations
  • Summed up, the operations take longer than main.gunicorn_timeout
  • gunicorn restarts your worker

Can you change it to:

  • HTTP request to service1
  • service1 uses .invoke_async to invoke service2 in background
  • At that point service1 returns, so gunicorn considers the request
    completed normally
  • service2 performs the operations service1 used to
  • But now service2 wasn’t invoked through gunicorn, so service2 is free
    to take as much time as needed

https://zato.io/docs/progguide/service-dev.html#invoke-async
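
The split suggested above could be sketched roughly as follows. The class names and service names are hypothetical, and passing self.request.raw_request along assumes that is where the job's data arrives. The fallback Service class is not part of Zato. It is only a stand-in so the sketch can be run outside a server.

```python
try:
    # Inside a Zato server this import succeeds and the real base class is used.
    from zato.server.service import Service
except ImportError:
    # Stand-in (NOT part of Zato) so the sketch can be run outside a server.
    import logging
    from types import SimpleNamespace

    class Service(object):
        invoked = []  # records (service name, payload) instead of dispatching
        logger = logging.getLogger('sketch')
        request = SimpleNamespace(raw_request='')

        def invoke_async(self, name, payload=''):
            Service.invoked.append((name, payload))


class ReportTrigger(Service):
    """service1: the service the job points at; it returns right away."""
    name = 'report.trigger'  # hypothetical service name

    def handle(self):
        # Hand the job's data over to the worker and return immediately.
        self.invoke_async(ReportWorker.name, self.request.raw_request)


class ReportWorker(Service):
    """service2: does the long-running work in the background."""
    name = 'report.worker'  # hypothetical service name

    def handle(self):
        self.logger.info('Got %d characters of job data',
                         len(self.request.raw_request))
        # ... the long-running work that service1 used to do goes here ...
```

With this layout the scheduler job (or HTTP channel) would point at report.trigger only, and report.worker would never be invoked directly.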

Would that work for you?

I’m talking about SQL query operations. In a test file that a client
uploaded there are 25500 lines, and for every line I need to execute the
same query to obtain the results. Once all the queries have run, I create
a zipped file, upload it to an FTP location and send a notification by
mail. There can be millions of lines, which is why I use the scheduler to
do the work offline at a configured time.
Another thing is that I’m storing the job config in the extra field. I
noticed that this may be a problem when listing jobs in web admin (which I
don’t use in production), because the definitions are all written to the
logs, which then grow large. It’s not a problem as such, but I don’t know
right now in which other parts of Zato entire job definitions are dumped
to the logs, and my extra data is expected to be a huge number of lines to
process.
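
For reference, the workload described at the start of this message (the same query run once per input line, then one zipped result file) might look roughly like the sketch below. It uses sqlite3 from the standard library purely as a stand-in for the real database. The table, column and file names are made up, and the FTP upload and mail notification are left out.

```python
import io
import sqlite3
import zipfile


def run_queries(conn, lines):
    """Run the same query once per non-empty input line, yielding (key, value)."""
    cursor = conn.cursor()
    for line in lines:
        key = line.strip()
        if not key:
            continue
        # Hypothetical query; the real one depends on the client's schema.
        cursor.execute('SELECT value FROM items WHERE key = ?', (key,))
        row = cursor.fetchone()
        yield key, (row[0] if row else None)


def build_archive(results):
    """Write all results as CSV lines into an in-memory zip; return its bytes."""
    body = '\n'.join(
        '%s,%s' % (key, '' if value is None else value)
        for key, value in results
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as archive:
        archive.writestr('results.csv', body)
    return buf.getvalue()
```

The bytes returned by build_archive(run_queries(conn, lines)) are what would then be uploaded and announced by mail.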

On Fri, Jun 6, 2014 at 5:54 PM, Dariusz Suchojad dsuch@zato.io wrote:

On 06/07/2014 12:50 AM, Axel Mendoza Pupo wrote:

  • The scheduler invokes service1
  • service1 performs a lot of operations

OK - but what are those operations? What does service1 actually do? What
does it invoke/use/connect to? Also, when you say ‘a lot’, what does it
mean exactly? How many operations per second are there?


Dariusz Suchojad

https://zato.io
ESB, SOA, REST, APIs and cloud integrations in Python
