Path precedence for http rest channels in Zato 3


#1

Hi,

I am having problems with precedence of channels, I have the following path definitions:

-
    name: Z3 eval
    service: z3eval.z3-eval
    url_path: /eval/{eval_id}

-
    name: Z3 eval, sub 1
    service: z3eval.z3-eval-sub1
    url_path: /eval/{eval_id}/sub1

When I send this http request to Zato: /eval/12/sub1

I get this error:

ZatoException: <ZatoException at 0x7fbe3c53db90 cid:`None`, msg:`Conversion error, param:`eval_id`, 
param_name:`eval_id`, repr:`u'12/sub1'`, type:`<type 'unicode'>`, e:`Traceback (most 
recent call last):
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/reqresp/sio.py", line 371, in convert_sio
    value = int(value)
ValueError: invalid literal for int() with base 10: '12/sub1'

This used to work in Zato 2.0.8.

This error occurs when I load channel definitions with enmasse. Sometimes it helps if the channel definitions are deleted and manually re-created through web-admin.

When the path /eval/{eval_id} is changed to /eval/{eval_id}/ it works.

Regards, Jan


#2

Are you 100% sure this used to work in 2.0.8 with channels of the same names? I understand that it would be Z2 rather than Z3 but their names are not irrelevant.

Zato has an internal cache of pre-compiled expressions that are used to match requests:

  • Each element between two slashes / is considered to be the name of an element
  • Channels are sorted in the cache by their names
  • First channel to match the incoming request will take precedence

This all implies:

  • If you have /eval/{eval_id}, because eval_id has only one slash, this means:

    • Starts with /eval/
    • Is followed by any string, including the slash character, whose value will be placed in eval_id
    • As it happens, eval_id is considered an integer ID by default (you can look it up in simple-io.conf) and '12/sub1' rightfully cannot be converted to one
  • Whereas, when you have /eval/{eval_id}/ it means:

    • Starts with /eval/
    • Is followed by a string up to the next slash
    • Thus, /eval/12/sub1 gives 12
  • Now, if you have two channels that potentially may match the same request, the channel whose name is closer to the beginning of the cache will take precedence. In your case, this is:

>>> min('Z3 eval', 'Z3 eval, sub 1')
'Z3 eval'
>>> 

… which is why you are getting the exception.
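
The behaviour described above can be reproduced with a small sketch. This is not Zato's actual matching code: `to_regex` and `first_match` are hypothetical helpers, and the assumptions are that a trailing `{name}` placeholder swallows the rest of the path, that any other placeholder stops at the next slash, and that channels are tried in plain string order of their names.

```python
import re

def to_regex(url_path):
    # Assumption: a placeholder at the very end of the path matches everything
    # that is left, slashes included ...
    pattern = re.sub(r'\{(\w+)\}$', r'(?P<\1>.+)', url_path)
    # ... while a placeholder followed by more path stops at the next slash.
    pattern = re.sub(r'\{(\w+)\}', r'(?P<\1>[^/]+)', pattern)
    return re.compile(pattern)

def first_match(path, channels):
    # Channels are tried in the sort order of their names; the first hit wins.
    for name in sorted(channels):
        found = to_regex(channels[name]).match(path)
        if found:
            return name, found.groupdict()
    return None

channels = {
    'Z3 eval, sub 1': '/eval/{eval_id}/sub1',
    'Z3 eval': '/eval/{eval_id}',
}

name, params = first_match('/eval/12/sub1', channels)
print(name, params)  # Z3 eval {'eval_id': '12/sub1'}

# With a trailing slash in the pattern, the placeholder stops at the slash.
print(to_regex('/eval/{eval_id}/').match('/eval/12/sub1').group('eval_id'))  # 12
```

Calling int() on '12/sub1' then fails in the same way as in the traceback from the first post.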

The ways to deal with it would be:

  • Confirming that you did not have the same kind of situation in your 2.0.8 channels

  • Making sure to name the channels accordingly if there is a chance that more than one channel will match the same request path
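
As a rough illustration of the second point, and assuming the cache orders channels by plain string comparison of their names (as the min() call above suggests), a quick check can show which of two overlapping channels would be tried first. The `tried_first` helper and the replacement names are made up for the example.

```python
def tried_first(names):
    # Assumption: channels are tried in plain string order of their names.
    return sorted(names)[0]

# Original names: the generic channel sorts first and shadows the specific one.
print(tried_first(['Z3 eval', 'Z3 eval, sub 1']))         # Z3 eval

# Hypothetical renaming that makes the more specific channel sort first.
print(tried_first(['Z3 eval 2 base', 'Z3 eval 1 sub1']))  # Z3 eval 1 sub1
```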


#3

Tx for the clear explanation! Especially the remark about the importance of the name of the channel is very valuable!

The channels have the same naming in zato 2.0.8:

Until now, this just works. But, I have to check with the users of the api if the problematic channels (refresh and validations) are really being used. I will experiment with the naming of the channels, but now that I know that the naming is important, I can make this work.


#4

Thanks for the update - yes, the naming matters, otherwise we would need to require channel paths to always be unique and non-overlapping, even if path patterns were used.

This can be enforced on a small scale, but with dozens or hundreds of channels there will always be some conflicts sooner or later if path patterns are used, so naming is the way to resolve them and to hint to Zato which channel has priority.


#5

I have checked in our Zato 2.0.8 instance. There this problem does not exist.

It is quite unfortunate that in the routing the ‘/’ char is not excluded from identifiers. Most web frameworks, such as Django, Flask and Pyramid, do exclude ‘/’ from identifiers (usually using regexes for the routing).

For RESTful interfaces, APIs like this are very common:

A: /api/resource/identifier
B: /api/resource/identifier/subresource1
C: /api/resource/identifier/subresource2

In the current situation, we now need to choose names for the Zato channels carefully to avoid problems. B and C need to be named in such a way that they are matched earlier than A.
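
To illustrate the point about excluding ‘/’ from identifiers, here is a minimal sketch, not tied to any particular framework, in which each {name} placeholder is compiled to [^/]+ and the whole path must match. The `compile_route` helper and the concrete request paths are assumptions made for the example. Under these rules A, B and C can no longer shadow each other, whatever order they are tried in.

```python
import re

def compile_route(url_path):
    # Every placeholder matches one path segment only: '/' is excluded.
    pattern = re.sub(r'\{(\w+)\}', r'(?P<\1>[^/]+)', url_path)
    # \Z anchors the match at the end of the path.
    return re.compile(pattern + r'\Z')

routes = {
    'A': compile_route('/api/resource/{identifier}'),
    'B': compile_route('/api/resource/{identifier}/subresource1'),
    'C': compile_route('/api/resource/{identifier}/subresource2'),
}

def matching(path):
    # Return the names of all routes that match, not just the first one.
    return [name for name, regex in routes.items() if regex.match(path)]

print(matching('/api/resource/12'))               # ['A']
print(matching('/api/resource/12/subresource1'))  # ['B']
print(matching('/api/resource/12/subresource2'))  # ['C']
```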

Would it be possible to make this behaviour (how identifiers are matched) configurable in some way?

Regards, Jan


#6

Yes, I am interested in resolving it - what you are saying makes sense to me.

Can you give me an example of Django urls.py where the correct behaviour can be observed? I would like to examine the regexes used.

Thanks.


#7

Hi @dsuch

Django and Flask both use converters in the route definition:

So: /api/resource/<converter:identifier>

Where converter can be str, int, slug, uuid or path.
For Flask converters are similar: string, int, float, path and uuid.

Actually, I was not totally correct about Django. The current Django version does use the ‘str’ converter as default, and that converter accepts ‘/’. This is not the case for Flask.

About regexes, Pyramid uses the following regex as default: {foo:[^/]+}, but does not have the concept of named converters.

Maybe Zato could also introduce this concept of named converters. There is some duplication there, though, because the SimpleIO inner class also defines the input parameters and their types. So, ideally, the routing should take its ‘converter’ type from the definitions in SimpleIO, if available. E.g. if SimpleIO defines an ‘object_id’, the implicit converter used in the REST channel should be ‘\d+’.
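
A rough sketch of what named converters could look like, purely as an illustration of the idea and not an existing Zato feature. The <converter:name> syntax follows the example above; the `CONVERTERS` table, its regexes and the `compile_route` and `match` helpers are assumptions. Pyramid's [^/]+ is used as the default when no converter is given.

```python
import re

# Hypothetical converter table: a regex plus a function that builds the value.
CONVERTERS = {
    'str': (r'[^/]+', str),   # one path segment, used as the default here
    'int': (r'\d+', int),
    'path': (r'.+', str),     # may span several segments
}

PLACEHOLDER = re.compile(r'<(?:(\w+):)?(\w+)>')

def compile_route(url_path):
    casts = {}
    def replace(found):
        converter, name = found.group(1) or 'str', found.group(2)
        regex, casts[name] = CONVERTERS[converter]
        return '(?P<%s>%s)' % (name, regex)
    return re.compile(PLACEHOLDER.sub(replace, url_path) + r'\Z'), casts

def match(route, path):
    regex, casts = route
    found = regex.match(path)
    if not found:
        return None
    # Convert each captured string with the function its converter declared.
    return {name: casts[name](value) for name, value in found.groupdict().items()}

route = compile_route('/api/resource/<int:identifier>')
print(match(route, '/api/resource/12'))       # {'identifier': 12}
print(match(route, '/api/resource/12/sub1'))  # None
```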

Hope this helps,

Regards, Jan


#8

I see, it looks like there are two angles to this question:

  • Special-casing / in URL path matching
  • Providing data types in URL path patterns

The former can be done, though it would require a flag to enable or disable it, because I have seen REST channels with patterns such as /{path} where all the actual matching and dispatching took place in a special service.

The latter also can be done but I am not 100% clear about your remark regarding SimpleIO - do you mean that path parameters without an explicit data type should behave akin to SIO parameters and honour the same conventions, e.g. is_* -> bool, *_id -> int etc.?
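
The convention mentioned above could be sketched along these lines. The `regex_for` and `compile_path` helpers and the exact regexes are assumptions for illustration only; the real prefixes and suffixes are whatever simple-io.conf configures.

```python
import re

def regex_for(param_name):
    # Assumed conventions, following the examples given above:
    # is_* -> bool, *_id -> int, anything else -> one path segment.
    if param_name.startswith('is_'):
        return r'true|false|0|1'
    if param_name.endswith('_id'):
        return r'\d+'
    return r'[^/]+'

def compile_path(url_path):
    def replace(found):
        name = found.group(1)
        return '(?P<%s>%s)' % (name, regex_for(name))
    return re.compile(re.sub(r'\{(\w+)\}', replace, url_path) + r'\Z')

regex = compile_path('/eval/{eval_id}')
print(bool(regex.match('/eval/12')))       # True
print(bool(regex.match('/eval/12/sub1')))  # False
```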

Please open tickets for both features; I do not see why not to do it. Thanks.


#9

@dsuch Tx. for your willingness to implement something like this. I will add tickets to the github repo.

About SIO, what I meant is that during the route matching, associated service classes could be introspected for the parameters defined in the SimpleIO attributes to determine the type of parameter. But, I guess the impact on performance would be too big. I like your suggestion about using the conventions (_id etc.).

Regards, Jan


#10

Thanks for the ticket, @jjmurre.

Once such data type converters are implemented, they will not incur runtime overhead as far as introspection goes. As with other places in Zato, everything will be compiled/computed on the fly when a channel or service is deployed - there will be logic to match path elements with their underlying SimpleIO definitions (if any) and correct data types will be constructed in terms of regular expressions. Each time a channel or service changes, they will be rebuilt and re-cached.
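
The deploy-time approach described above might look roughly like this. The `ChannelCache` class and its method names are hypothetical, chosen only to show the idea that patterns are compiled once when a channel is created or edited, so that matching a request involves no introspection. The sketch leaves out the SimpleIO lookup and treats every placeholder as a single path segment.

```python
import re

class ChannelCache:
    def __init__(self):
        self._compiled = {}  # channel name -> compiled regex

    def on_channel_changed(self, name, url_path):
        # Called on create/edit: do the expensive work once, up front.
        pattern = re.sub(r'\{(\w+)\}', r'(?P<\1>[^/]+)', url_path)
        self._compiled[name] = re.compile(pattern + r'\Z')

    def on_channel_deleted(self, name):
        self._compiled.pop(name, None)

    def match(self, path):
        # Request time: only pre-compiled regexes are consulted.
        for name in sorted(self._compiled):
            found = self._compiled[name].match(path)
            if found:
                return name, found.groupdict()
        return None

cache = ChannelCache()
cache.on_channel_changed('Z3 eval', '/eval/{eval_id}')
cache.on_channel_changed('Z3 eval, sub 1', '/eval/{eval_id}/sub1')
print(cache.match('/eval/12/sub1'))  # ('Z3 eval, sub 1', {'eval_id': '12'})
```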

This kind of detail takes time to implement, which is why I cannot do it now, but perhaps there will be time for it when SimpleIO as such is refactored and reimplemented (Cython, C++).

But certainly, making sure that / is treated specially is doable sooner. I will let you know in the ticket.