(Migrated) Connection packs (idea)

(This message has been automatically imported from the retired mailing list)

Hi there,

following up on a recent IRC discussion regarding adding connectors to
external systems to Zato, using each system's own API, I'd like to
discuss an idea I've been brewing for a while.

Basically, it is about creating JSON documents to describe a given
domain system's API and uploading them to Zato so that Zato can
automatically generate glue services responsible for mapping user
input to the external system's format. Such services would then invoke
the system and convert the response back for the caller.

User code could then invoke these services using business objects only
(in Python terms, simply anything that is a dict or dict-like),
without thinking about any particularities of the underlying transport
or data format. This is already possible in Zato to a great extent but
let's push this idea even further.

The idea of describing interfaces surely rings a bell - it's almost like
WSDLs in other parts of the server world - but here I am deliberately
thinking of scaling it down to a much simpler approach, one that would
deal with APIs only: no actions, includes, port types or endpoints -> these
Zato already has and they are known in its lingo under different names
(channels, outgoing connections, data_format etc.)

It also wouldn't do any validation itself (though that would still be
possible, as explained later).

No validation is, I think, the only core thing /really/ making it
different from WSDLs (apart from not being the heavy XML people
generally look for ways to avoid, of course). Basically, this idea is
about mapping business objects (dicts) to an external format, not about
validating anything. This would allow one to work with dicts or Bunches
(a very useful dict subclass), as is customary in Python.

This could also deal with both JSON and more REST-ish (parameters in the
URL) external systems. (XML/SOAP requests would be only slightly
harder.)

For instance, have a look at this Twitter API call:

https://dev.twitter.com/docs/api/1.1/post/friendships/update

An example address to invoke is:

https://api.twitter.com/1/statuses/show.json?id=112652479837110273&include_entities=true

The idea now is to create a JSON file along the lines of:

{"conn_pack": {"Twitter": {
    "name": "friendships_update",
    "api_form": "REST",
    "method": "POST"
}}}

out of which some sort of glue service would be generated, so that
users could write services such as:

from zato.server.service import Service

class MyService(Service):
    def handle(self):

        api_name = 'friendships_update'
        params = {'user_id': 1, 'device': True}

        resp = self.outpack.get('Twitter').invoke(api_name, params)

        # resp is now a dict or a dict-like object

Naturally, not everything will be as flat as Twitter, so here is another
example of using an API with more nested structures:

{"conn_pack": {"MySystem": {
    "name": "update",
    "api_form": "JSON",
    "method": "POST"
}}}

# Bunch - https://pypi.python.org/pypi/bunch
from bunch import Bunch

# Zato
from zato.server.service import Service

class MyService(Service):
    def handle(self):

        customer = Bunch()
        customer.id = 123
        customer.name = 'Alice Grain'
        customer.segment = 'AKZ'

        product1 = Bunch()
        product1.type_id = 1
        product1.created = '2008-12-25'

        product2 = Bunch()
        product2.type_id = 2
        product2.created = '2013-05-14'

        customer.products = [product1, product2]

        self.outpack.get('MySystem').invoke('update', customer)

Some common attributes could be moved a level higher - so if each, or
almost each, of the API calls is of the same type (JSON, REST or SOAP),
it won't make sense to specify it every time for each call; an
illustration follows below.
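
For instance, a pack-level defaults section could look like this - the
"defaults" and "apis" keys are hypothetical names, only meant to
illustrate the idea:

{"conn_pack": {"Twitter": {
    "defaults": {"api_form": "REST", "method": "POST"},
    "apis": [
        {"name": "friendships_update"},
        {"name": "friendships_show"}
    ]
}}}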

I won't show it here but using XML/SOAP would look almost exactly the
same - the only difference would be that you'd need to use
lxml.objectify instead of dict-like things. This would be similar to
what is already possible with Zato, as here:
https://zato.io/docs/progguide/xml.html#accessing-request-elements
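
As a quick illustration of what that would feel like, here is a minimal,
self-contained sketch of reading a response with lxml.objectify (the XML
payload is made up):

from lxml import objectify

xml = """<response>
  <customer>
    <id>123</id>
    <name>Alice Grain</name>
  </customer>
</response>"""

root = objectify.fromstring(xml)

# Elements are accessed like attributes, much as with dicts/Bunches above
customer_id = root.customer.id      # 123, usable as an int
customer_name = root.customer.name  # 'Alice Grain'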

Another thing is hooks. As in other parts of Zato
(https://zato.io/docs/progguide/service-hooks.html), people will surely
need things beyond what we come up with here, so there must be a way to
customize things.

Hence, there should be at least 2 hooks for users to optionally
implement somewhere, letting them additionally control how requests and
responses are treated (a sketch follows below):

  • before_request - gets the user input before it's passed along to the
    external system and does whatever is needed, for instance, uses an XSD
    to validate the input before it leaves Zato

  • after_response - gets the response from an external system and, again,
    does what is needed; this could be used to plug validation in manually

Hooks are also where Zato could with time use XSD, RelaxNG or something
else automatically (but let’s not do it now).
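
To make it concrete, a minimal sketch of what such hooks might look
like - the class name and the way Zato would discover it are
hypothetical, only the two hook names come from the list above:

class TwitterHooks(object):
    """Optional hooks for the 'Twitter' connpack (hypothetical)."""

    def before_request(self, api_name, params):
        # Do anything needed before the input leaves Zato,
        # e.g. validate it against an XSD by hand
        return params

    def after_response(self, api_name, response):
        # Do anything needed once the external system replies,
        # e.g. plug in manual validation of the response
        return response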

The whole thing should also take API versioning into account. I haven’t
thought about it yet but I’m just signalling a need.

There are 2 things I think are still worth explaining.

Why am I insisting on no XSD for now? Because the bigger the systems to
integrate, the less useful it is.

XSD sounds and works quite OK at first, but when you deal with thousands
of business objects (classes) all grouped into multi-level hierarchies,
almost inevitably everything in an XSD will be optional.

This is simply how it looks.

The only usefulness of an XSD (at least as far as validation goes) is
that it checks the order of elements provided. But frankly, as long as
you use Python instead of some custom-built XML parser, it's really of
no concern.

In lower-level languages, XSD can also be used to generate all sorts of
useful things that let you deal with business classes in your code
instead of dealing with XML. This is cool and very handy indeed. But
this is Python - we can use business objects out of the box, just like
that, as in the examples above.

And I’m not saying there will never be XSD/RelaxNG/anything - I’m just
saying it’s not a priority. If anyone truly needs it -> there are hooks
for that.

The other thing is that I'd /really/ like it to be something that
generates Python classes, because there must be a way for users to
override anything that is automatically generated. These classes must
also not be overly smart. Not too much magic, no metaclasses probably.
It must not look like SWIG; this must be something that is very easy to
customize manually, because we won't think of everything and we won't
cover every need possible, hence it is 100% sure that people will
sometimes edit it themselves.

I don't know exactly how to achieve it all but I think 95% of the stuff
needed to complete it is already in Zato; it just needs to be nicely
connected now to provide new value in the form of this feature.

With time we can go further. Let’s add a ‘zato connpack’ CLI command to
https://zato.io/docs/admin/cli/index.html, such as

  • zato connpack get zimbra
  • zato connpack upgrade twitter
  • zato connpack install salesforce 3.1
  • zato connpack search fabecook -> (no ‘fabecook’ found, did you mean
    ’facebook’?)

etc. etc.

We can next create a central repository of user-provided connpacks to
let everyone collaborate on useful packs they themselves need. I'm
entirely sure that very quickly people would be uploading packs for
systems we didn't even know existed. An ecosystem of APIs - that would
be something to look at.

Using Salesforce as an example, the only thing needed from Zato now is a
means to describe Salesforce's API within, say, 1 week of work.

If it’s made possible to express APIs of Salesforce’s size, people will
be able to add connpacks for smaller systems in one evening. Which will
be awesome :slight_smile:

What do you think?

On 05/31/2013 11:39 AM, Zato community’s mailing list wrote:

And this is not only about Connection Packs. General
Service Packs can have the same distribution pattern.

I'll reply to the other things in a moment but this one is a bit of a
different story and actually, it's already possible to do what you
describe [1].

[1] is a list of service sources to install services from. A source can
be one of:

  • name of a Python class (which somehow needs to be put on PYTHONPATH,
    using zato_extra_paths for instance)
path to a concrete Python module on the filesystem
  • path to a directory of modules, all of which will be deployed on a server

You can keep all your services in a repo of your choice and add the path
to a checkout in service-sources.txt (and you can also override this
name if you don't like it [2]).

[1]
https://zato.io/docs/admin/guide/installing-services.html#service-sources-txt

[2]
https://zato.io/docs/admin/guide/install-config/config-server.html#admin-guide-config-server-main-service-sources

So if you’re worried that you are /required/ to hot-deploy everything
then no, there’s no such requirement. You can do it but you can skip it
entirely too.

Hi,

it is great that you've summarized the ideas we discussed on IRC as
Connection Packs.

What is not yet exposed here is authentication. When working with
external systems' APIs it is essential to be able to use different
authentication credentials (not only the credentials of a single Zato
Bot) to be able to impersonate various users in these external systems.
For instance, with the SalesForce API and single-account auth (as the
current OutConns do), all actions performed are recorded as if performed
by the bearer of that license, which is not always desirable.

Another point is that your Twitter example didn't list the URL in the
JSON below (I've just quoted it). Please update it for extra clarity:

{"conn_pack": {"Twitter": {
    "name": "friendships_update",
    "api_form": "REST",
    "method": "POST"
}}}

Regarding distribution of Connection Packs… What are the reasons for the
CLI connpack command? Is there a reason for anything beyond Python
packages for distribution? For a simpler approach, can we just drop a
tar/zip/egg into the pickup-dir? I'd like all the JSON parsing and
service generation magic to happen during that module/package import.
This would even allow a single-file .py Connection Pack (with the JSON
you mentioned inside that .py file, as sketched below).
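
A sketch of what such a single-file pack could look like - everything
here is hypothetical, including the idea that Zato would pick up a
CONN_PACK constant at import time:

# twitter_pack.py - a hypothetical single-file Connection Pack

CONN_PACK = """
{"conn_pack": {"Twitter": {
    "name": "friendships_update",
    "api_form": "REST",
    "method": "POST"
}}}
"""

# On import, Zato would parse CONN_PACK and generate the glue services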

I'd favor building my Zato clusters with Buildout and would have no
problem including the connection pack in the configuration, rerunning
Buildout and initiating a rolling restart of the Zato servers. This
feels “solid”. The pickup-dir is another option (very useful during
quickstart, or development). And this is not only about Connection
Packs. General Service Packs can have the same distribution pattern.

Publishing a Connection Pack on PyPI is also extra advertising. There
are at least 200 popular services offering APIs at the moment. If we
have 200 packages on PyPI, that'd be a lot of publicity :wink:

The “Install” action of a connpack can potentially be implemented as
just another service, called salesforce.install, in the pack; even
without extra UI one can search for it in the Web UI. If installation
requires some parameters, they can be specified during invocation. The
install service of a pack could show sample parameters in its output if
invoked without parameters. The possible admin workflow when installing
the pack would be:

  1. Drop the SalesForce pack in the pickup-dir
  2. Invoke salesforce.install
    • Search for salesforce.install in the Web UI and invoke it
    • or with the CLI: “zato service invoke salesforce.api” (see the
      example after this list)
  3. Get sample parameters in the output
  4. Copy and paste them into the parameters, editing as needed
  5. Invoke install again and have all Redis/OutConn/Auth structures
    configured.
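
For reference, the CLI variant of step 2 might look like the line
below - the server path and the --payload option are assumptions on my
part, the exact invocation depends on the installation and Zato version:

$ zato service invoke /opt/zato/server1 salesforce.api --payload '{}'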

Sorry for slightly offtopic post.

Thoughts?

m.

What is not yet exposed here is authentication. When working with
external systems' APIs it is essential to be able to use different
authentication credentials (not only the credentials of a single Zato
Bot) to be able to impersonate various users in these external systems.
For instance, with the SalesForce API and single-account auth (as the
current OutConns do), all actions performed are recorded as if performed
by the bearer of that license, which is not always desirable.

Yea, auth is another thing to consider, good point.

About this particular use case. Zato already has transport-level
security and that would stay in connpacks. I understand you’re rather
thinking of business-level security?

So for instance, this imaginary request to do something with some parts:

{
'part_id': 123,
'action': 'cancel_production',
'api_client': 'HZBAZA'
}

would still use, say, HTTP Basic Auth for authentication but
additionally would provide this particular external system's
authorization key (here - 'api_client').

I’m just not familiar with Salesforce at all, I was only using it as an
example of a non-trivial API, so I’m not sure if this is the feature needed?

Another point is that your Twitter example didn't list the URL in the
JSON below (I've just quoted it). Please update it for extra clarity:

{"conn_pack": {"Twitter": {
    "name": "friendships_update",
    "api_form": "REST",
    "method": "POST"
}}}

Right, this should be added too, however it needs a careful approach.
I wouldn't really like to turn the whole thing into WSDL-in-JSON -
nested JSON is only slightly more readable than nested XML. So I guess
it could be done in a separate file. In that case, a connpack would be a
directory of JSON documents - one for connection details, one for the
API services provided by a system - for instance as below.
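
A hypothetical layout (the file names are made up):

twitter-connpack/
    connection.json    # address, security, data format
    api.json           # the API calls the system provides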

Regarding distribution of Connection Packs… What are the reasons for
the CLI connpack command?

Sure, that would only be a wrapper command. You could still easily just
drop a connpack package somewhere (pickup dir).

Is there a reason for anything beyond Python packages for distribution?
For a simpler approach, can we just drop a tar/zip/egg into the
pickup-dir? I'd like all the JSON parsing and service generation magic
to happen during that module/package import. This would even allow a
single-file .py Connection Pack (with the JSON you mentioned inside that
.py file).

Publishing a Connection Pack on PyPI is also extra advertising. There
are at least 200 popular services offering APIs at the moment. If we
have 200 packages on PyPI, that'd be a lot of publicity :wink:

Right, a single-file connpack (even if it's a compressed directory) is
also my idea, but I'm still thinking of keeping it in JSON, not in
Python, so the key point would be “you don't need to code, you just need
to declare it”. Such things are of course a well-known can of worms if
done hurriedly and incorrectly, but we're building a new system with
Zato and we can do it properly, at least by leaving a lot of freedom to
customize the auto-generated code.

The “Install” action of a connpack can potentially be implemented as
just another service, called salesforce.install, in the pack; even
without extra UI one can search for it in the Web UI. If installation
requires some parameters, they can be specified during invocation. The
install service of a pack could show sample parameters in its output if
invoked without parameters. The possible admin workflow when installing
the pack would be:

  1. Drop the SalesForce pack in the pickup-dir
  2. Invoke salesforce.install
    • Search for salesforce.install in the Web UI and invoke it
    • or with the CLI: “zato service invoke salesforce.api”
  3. Get sample parameters in the output
  4. Copy and paste them into the parameters, editing as needed
  5. Invoke install again and have all Redis/OutConn/Auth structures
    configured.

Sorry for slightly offtopic post.

Heh, this is perfectly on-topic :slight_smile:

It's only that I thought that merely dropping the package into the
pickup dir would do all the magic. So if someone prepares a package, you
just download it off their site (or a central repo) and copy it into the
pickup dir. That would be all you would need to do.

That should be really powerful -> you don’t need to code, you don’t need
to configure anything either.

But another idea, a related one I think, is - why not do it all from the
web admin once we have a central connpack repo? It would be really nice
to be able to browse and install the packages right from the net
directly in the browser!

Hi,

Here are some comments:

About this particular use case. Zato already has transport-level
security and that would stay in connpacks. I understand you're rather
thinking of business-level security?

So for instance, this imaginary request to do something with some parts:

{
'part_id': 123,
'action': 'cancel_production',
'api_client': 'HZBAZA'
}

would still use, say, HTTP Basic Auth for authentication but
additionally would provide this particular external system's
authorization key (here - 'api_client').

Auth is quite a specific thing for each of the APIs. You mentioned
2-level auth; I envision this being possible, but my particular use case
would just need different HTTP Basic Auth credentials for different
requests. Some APIs require even more diverse auth approaches (like a
different header, or a specific Authorization: header value). I wouldn't
fix that “in stone”.
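
For illustration, one possible shape for this - the 'security' argument
is hypothetical and only shows the idea of supplying different HTTP
Basic Auth credentials per request:

# Inside a service's handle() method; 'security' is a hypothetical
# argument that does not exist yet
resp = self.outpack.get('Twitter').invoke(
    'friendships_update',
    {'user_id': 1, 'device': True},
    security={'username': 'alice', 'password': 'secret'})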

Regarding the “nested JSON” - don't be scared of it. And if one needs a
cleaner representation, YAML is one option.
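
For instance, the Twitter example above expressed as YAML:

conn_pack:
  Twitter:
    name: friendships_update
    api_form: REST
    method: POST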

The JSON-only APIs can be accompanied by “template” Python files that no
one has to touch. It could even become a PasteScript scaffold/template,
or a virtualenvwrapper project template, to kickstart API connector
development.

These central repos are a bit difficult to do properly and require a lot
of coding. But if there were some central Zato serving as the
repository, letting all Web UIs talk to it the way the UI talks to the
Zato clusters it manages - why not? The connector would be downloaded to
the administrator's browser and uploaded to the cluster. I'm not sure
how big such payloads can be, but with HTML5 it should be possible even
for tar/zip/gz/egg files. PyPI as the repository, with a “Zato Pack”
classifier, could be enough.