KeyError when querying hybrid properties using SQLAlchemy


#1

Hey there!

I was testing my service room.list and found this error on the logs:

2018-11-15 12:08:43,649 - WARNING - 136:DummyThread-65 - zato.server.service:502 - Traceback (most recent call last):
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/__init__.py", line 477, in update_handle
    self._invoke(service, channel)
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/__init__.py", line 399, in _invoke
    service.handle()
  File "/opt/zato/env/qs-1/server2/work/hot-deploy/current/room.py", line 266, in handle
    query = query.add_columns(Cols[c])
  File "/opt/zato/3.0/code/local/lib/python2.7/site-packages/sqlalchemy/util/_collections.py", line 194, in __getitem__
    return self._data[key]
KeyError: u'accommodates'


2018-11-15 12:08:43,651 DEBG 'zato-server2' stdout output:
2018-11-15 12:08:43,650 - ERROR - 136:DummyThread-65 - zato.server.connection.http_soap.channel:324 - Caught an exception, cid:`b302d6ac64bdbe4708a2cdb2`, status_code:`500`, _format_exc:`Traceback (most recent call last):
  File "/opt/zato/3.0/code/zato-server/src/zato/server/connection/http_soap/channel.py", line 268, in dispatch
    payload, worker_store, self.simple_io_config, post_data, path_info, soap_action)
  File "/opt/zato/3.0/code/zato-server/src/zato/server/connection/http_soap/channel.py", line 502, in handle
    params_priority=channel_item.params_pri)
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/__init__.py", line 477, in update_handle
    self._invoke(service, channel)
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/__init__.py", line 399, in _invoke
    service.handle()
  File "/opt/zato/env/qs-1/server2/work/hot-deploy/current/room.py", line 266, in handle
    query = query.add_columns(Cols[c])
  File "/opt/zato/3.0/code/local/lib/python2.7/site-packages/sqlalchemy/util/_collections.py", line 194, in __getitem__
    return self._data[key]
KeyError: u'accommodates'

So the first thing that came into my mind was this thread about hybrid properties and queries.

By checking the modifications made on the code, I realised that, by the time of the abovementioned discussion, I was querying like this:

session.query(Room).filter(and_(Room.id == id_, Room.deleted.is_(None))).one_or_none()

Whereas now, in the room.list service, I am querying by adding all columns manually (due to the fields projection feature, in which one is allowed to select which fields to be returned by using the fields query string parameter:

fields_allowed = ('id', 'floor_no', 'room_no', 'name', 'sgl_beds',
                      'dbl_beds', 'supplement', 'code', 'accommodates',
                      'number', 'deleted')
query = session.query(func.count().over().label('count'))
if not columns:
    columns = self.fields_allowed
for c in columns:
    query = query.add_columns(Room.__table__.columns)

output_optional in the SimpleIO class is as follows:

output_optional = ('id', 'floor_no', 'room_no', 'sgl_beds', 'dbl_beds',
                   'supplement', 'code', 'name', 'accommodates', 'number', 'count')

So the first thing that came into my mind was that I needed to add expressions to the Room model class. Unfortunately, that didn’t turn out very well (partially, and probably, due to my lack of hability or capacity):

from sqlalchemy import func
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.dialects import postgresql

    @hybrid_property
    def accommodates(self):
        return self.sgl_beds + self.dbl_beds * 2

    @accommodates.expression
    def accommodates(cls):
        return (cls.sgl_beds + cls.dbl_beds * 2)

    @hybrid_property
    def number(self):
        return '%d%02d' % (self.floor_no, self.room_no)

    @number.expression
    def number(cls):
        return (func.cast(cls.floor_no, String) +
                func.lpad(func.cast(cls.room_no, postgresql.TEXT), 2, '0'))

The error was still the same:

2018-11-16 19:15:25,165 - WARNING - 137:DummyThread-24 - zato.server.service:502 - Traceback (most recent call last):
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/__init__.py", line 477, in update_handle
    self._invoke(service, channel)
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/__init__.py", line 399, in _invoke
    service.handle()
  File "/opt/zato/env/qs-1/server1/work/hot-deploy/current/room.py", line 278, in handle
    query = query.add_columns(Cols[c])
  File "/opt/zato/3.0/code/local/lib/python2.7/site-packages/sqlalchemy/util/_collections.py", line 194, in __getitem__
    return self._data[key]
KeyError: u'number'


2018-11-16 19:15:25,167 DEBG 'zato-server1' stdout output:
2018-11-16 19:15:25,166 - ERROR - 137:DummyThread-24 - zato.server.connection.http_soap.channel:324 - Caught an exception, cid:`b5fe826016cc20e4e370c65d`, status_code:`500`, _format_exc:`Traceback (most recent call last):
  File "/opt/zato/3.0/code/zato-server/src/zato/server/connection/http_soap/channel.py", line 268, in dispatch
    payload, worker_store, self.simple_io_config, post_data, path_info, soap_action)
  File "/opt/zato/3.0/code/zato-server/src/zato/server/connection/http_soap/channel.py", line 502, in handle
    params_priority=channel_item.params_pri)
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/__init__.py", line 477, in update_handle
    self._invoke(service, channel)
  File "/opt/zato/3.0/code/zato-server/src/zato/server/service/__init__.py", line 399, in _invoke
    service.handle()
  File "/opt/zato/env/qs-1/server1/work/hot-deploy/current/room.py", line 278, in handle
    query = query.add_columns(Cols[c])
  File "/opt/zato/3.0/code/local/lib/python2.7/site-packages/sqlalchemy/util/_collections.py", line 194, in __getitem__
    return self._data[key]
KeyError: u'number'

So the next thing I did was to remove number and accomodates from the tuple fields_allowed:

fields_allowed = ('id', 'floor_no', 'room_no', 'name', 'sgl_beds',
                  'dbl_beds', 'supplement', 'code', 'deleted')

And executing the call (using curl) to http://127.0.0.1:11223/genesisng/rooms/list was now not failing, but the hybrid properties number and accommodates were not there:

$ curl -v -g "http://127.0.0.1:11223/genesisng/rooms/list"; echo ""
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 11223 (#0)
> GET /genesisng/rooms/list HTTP/1.1
> Host: 127.0.0.1:11223
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: Zato
< Date: Fri, 16 Nov 2018 19:19:41 GMT
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: application/json
< X-Zato-CID: 769adfe37f8c8da88bef6de2
< Cache-Control: no-cache
< 
{"response": [{"count": 6, "code": "pink", "sgl_beds": 2, "room_no": 1, "floor_no": 1, "supplement": 20.0, "dbl_beds": 0, "id": 1, "name": "Normal bedroom with two single beds"}, {"count": 6, "code": "black", "sgl_beds": 2, "room_no": 2, "floor_no": 1, "supplement": 40.0, "dbl_beds": 1, "id": 2, "name": "Large bedroom with two single and one double beds"}, {"count": 6, "code": "white", "sgl_beds": 3, "room_no": 3, "floor_no": 1, "supplement": 50.0, "dbl_beds": 1, "id": 3, "name": "Very large bedroom with three single and one double beds"}, {"count": 6, "code": "purple", "sgl_beds": 4, "room_no": 4, "floor_no": 1, "supplement": 40.0, "dbl_beds": 0, "id": 4, "name": "Very large bedroom with four single beds"}, {"count": 6, "code": "blue", "sgl_beds": 3, "room_no": 5, "floor_no": 1, "supplement": 30.0, "dbl_beds": 0, "id": 5, "name": "Large bedroom with three single beds"}, {"count": 6, "code": "brown", "sgl_beds": 0, "room_no": 6, "floor_no": 1, "supplement": 20.0, "dbl_beds": 1, "id": 6, "name": "Normal bedroo* Closing connection 0
m with one double bed"}]}

So the next thing I did was to modify the code so that it looked like the room.get service, that is:

  • Use session.query(Room).
  • Add no other attributes (no count, no number and no accommodates).

query = session.query(Room)

And it worked fine (as expected, as room.get does work). Number and accommodates are back in the list of results.

Then I did one more test and added just count, like this:

query = session.query(Room, func.count().over().label('count'))

And I got no error, but only count is present in the result:

$ curl -v -g "http://127.0.0.1:11223/genesisng/rooms/list"; echo ""
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 11223 (#0)
> GET /genesisng/rooms/list HTTP/1.1
> Host: 127.0.0.1:11223
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: Zato
< Date: Fri, 16 Nov 2018 19:26:27 GMT
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: application/json
< X-Zato-CID: 699dbf7b6132fb07119c676c
< Cache-Control: no-cache
< 
* Closing connection 0
{"response": [{"count": 6}, {"count": 6}, {"count": 6}, {"count": 6}, {"count": 6}, {"count": 6}]}

I checked the queries being executed and they are all the same each and everytime results are returned (despite the fact that some attributes are missing from the returned JSON). When there is an error on the logs, no SQL query appears on the database logs:

2018-11-16 20:26:27.696 CET [22732] genesisng@genesisng LOG:  execute pg8000_statement_137_12: SELECT room.id AS room_id, room.floor_no AS room_floor_no, room.room_no AS room_room_no, room.name AS room_name, room.sgl_beds AS room_sgl_beds, roo
m.dbl_beds AS room_dbl_beds, room.supplement AS room_supplement, room.code AS room_code, room.deleted AS room_deleted, count(*) OVER () AS count  
        FROM room ORDER BY room.id ASC  
         LIMIT $1 OFFSET $2 
2018-11-16 20:26:27.696 CET [22732] genesisng@genesisng DETAIL:  parameters: $1 = '20', $2 = '0'

So, the conclussion is that hybrid properties are only returned if the queried fields are exactly the ones in the model class.

At the moment I do not need to filter by number or accommodates, so I can/could take them out of the fields_allowed tuple and hope there is a fix as there was one for the hybrid properties to be returned as part of the JSON before.

Generally speaking, though, it would be a nice feature to have and, from the SQLAlchemy documentation, I take it that I’d require the abovementioned hybrid property expressions on the model class to work.

Worst case scenario, I would have to forget about hybrid properties and just have them as table columns being filled in by triggers or some other mechanism SQLAlchemy may provide.

So, ideally, from less to more:

  1. Fix (again) to allow hybrid properties to be returned as part of the JSON when other attributes, such as count, is being added to the query.
  2. Allow the possiblity to execute queries that use hybrid attributes, i.e. by having hybrid property expressions on the model class

Thanks in advance.

Addendum

Here you are the service room.list as it’s supposed to be running:

class List(Service):
    """Service class to get a list of all rooms in the system."""
    """Channel /genesisng/rooms/list."""

    model = Room
    criteria_allowed = ('id', 'number')
    direction_allowed = ('asc', 'desc')
    filters_allowed = ('id', 'floor_no', 'sgl_beds', 'dbl_beds', 'code')
    comparisons_allowed = ('lt', 'lte', 'eq', 'ne', 'gte', 'gt')
    operators_allowed = ('and', 'or')
    fields_allowed = ('id', 'floor_no', 'room_no', 'name', 'sgl_beds',
                      'dbl_beds', 'supplement', 'code', 'deleted',
                      'number', 'accommodates')
    search_allowed = ()

    class SimpleIO:
        input_optional = (List('page'), List('size'), List('sort'),
                          List('filters'), List('fields'), List('operator'),
                          List('search'))
        output_optional = ('id', 'floor_no', 'room_no', 'sgl_beds', 'dbl_beds',
                           'supplement', 'code', 'name', 'accommodates',
                           'number', 'count')
        skip_empty_keys = True
        output_repeated = True

    def handle(self):
        conn = self.user_config.genesisng.database.connection
        default_page_size = int(
            self.user_config.genesisng.pagination.default_page_size)
        max_page_size = int(
            self.user_config.genesisng.pagination.max_page_size)
        Cols = self.model.__table__.columns

        # TODO: Have these default values in user config?
        default_criteria = 'id'
        default_direction = 'asc'
        default_operator = 'and'

        # Page number
        try:
            page = int(self.request.input.page[0])
        except (ValueError, KeyError, IndexError):
            page = 1

        # Page size
        try:
            size = int(self.request.input.size[0])
        except (ValueError, KeyError, IndexError):
            size = default_page_size

        # Order by
        try:
            criteria, direction = self.request.input.sort[0].lower().split('|')
        except (ValueError, KeyError, IndexError, AttributeError):
            criteria = default_criteria
            direction = default_direction

        # Filters
        try:
            filters = self.request.input.filters
            operator = self.request.input.operator[0]
        except (ValueError, KeyError, IndexError):
            filters = []
            operator = default_operator

        # Fields projection
        try:
            fields = self.request.input.fields
        except (ValueError, KeyError):
            fields = []

        # Search
        try:
            search = self.request.input.search[0]
        except (ValueError, KeyError, IndexError):
            search = None

        # Check and adjust parameter values

        # Handle pagination
        page = 1 if page < 1 else page
        size = default_page_size if size < 1 else size
        size = default_page_size if size > max_page_size else size

        # Handle sorting
        if criteria not in self.criteria_allowed:
            criteria = default_criteria
        if direction not in self.direction_allowed:
            direction = default_direction

        # Handle filtering
        conditions = []
        for filter_ in filters:
            field, comparison, value = filter_.split('|')
            if field in self.filters_allowed and \
               comparison in self.comparisons_allowed:
                conditions.append((field, comparison, value))
        if operator not in self.operators_allowed:
            operator = default_operator

        # Handle fields projection
        columns = []
        for f in fields:
            if f in self.fields_allowed:
                columns.append(f)

        # Handle search
        if not self.search_allowed:
            search = None

        # Compose query
        with closing(self.outgoing.sql.get(conn).session()) as session:
            query = session.query(func.count().over().label('count'))

            # Add columns
            if not columns:
                columns = self.fields_allowed

            for c in columns:
                query = query.add_columns(Cols[c])

            # Prepare filters
            if conditions:
                clauses = []
                for c in conditions:
                    f, o, v = c
                    if o == 'lt':
                        clauses.append(Cols[f] < v)
                    elif o == 'lte':
                        clauses.append(Cols[f] <= v)
                    elif o == 'eq':
                        clauses.append(Cols[f] == v)
                    elif o == 'ne':
                        clauses.append(Cols[f] != v)
                    elif o == 'gte':
                        clauses.append(Cols[f] >= v)
                    elif o == 'gt':
                        clauses.append(Cols[f] > v)
                if operator == 'or':
                    query = query.filter(or_(*clauses))
                else:
                    query = query.filter(and_(*clauses))

            # Search
            if search:
                clauses = []
                for s in self.search_allowed:
                    clauses.append(Cols[s].ilike(search))
                query = query.filter(or_(*clauses))

            # Order by
            if direction == 'asc':
                query = query.order_by(Cols[criteria].asc())
            else:
                query = query.order_by(Cols[criteria].desc())

            # Calculate limit and offset
            limit = size
            offset = size * (page - 1)
            query = query.offset(offset)
            query = query.limit(limit)

            # Execute query
            result = query.all()

            # Return result
            if result:
                self.response.payload[:] = result
                self.response.status_code = OK
                self.response.headers['Cache-Control'] = 'no-cache'
            else:
                self.response.status_code = NO_CONTENT
                self.response.headers['Cache-Control'] = 'no-cache'

And here is the model class, with the two added attempts at hybrid propertiy expressions:

# coding: utf8
from __future__ import absolute_import, division
from __future__ import print_function, unicode_literals
from .base import Base
from sqlalchemy import Column, Integer, Float, String, DateTime, func
from sqlalchemy import UniqueConstraint, CheckConstraint
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.dialects import postgresql
import hashids


def generate_code(context):
    # TODO: Add a salt value?
    # hashids = Hashids(salt='this is my salt 1')
    # hashid = hashids.encode(value)
    return hashids.encode(context.current_parameters.get('floor_no'),
                          context.current_parameters.get('room_no'),
                          context.current_parameters.get('sgl_beds'),
                          context.current_parameters.get('dbl_beds'))


class Room(Base):
    __tablename__ = 'room'
    __rels__ = []
    __table_args__ = (
        # Combination of floor number and room number must be unique.
        UniqueConstraint('floor_no', 'room_no', name='room_floor_no_room_no'),
        # The sum of beds must be a positive integer.
        CheckConstraint('sgl_beds + dbl_beds > 0'),
    )

    # SQLAlchemy automatically creates the table column using the SERIAL type
    # which triggers the creation of a sequence automatically.
    id = Column(Integer, primary_key=True)
    floor_no = Column(Integer, nullable=False)
    room_no = Column(Integer, nullable=False)
    name = Column(String(100), index=True)
    sgl_beds = Column(Integer, nullable=False, default=0, index=True)
    dbl_beds = Column(Integer, nullable=False, default=0, index=True)
    supplement = Column(Float, nullable=False, default=0)
    code = Column(
        String(32),
        nullable=False,
        default=generate_code,
        unique=True,
        comment='Unique code used to link to images')
    deleted = Column(DateTime, default=None)

    def __repr__(self):
        return "<Room(id='%s', number='%s', accommodates='%s')>" % (
            self.id, self.number, self.accommodates)

    @hybrid_property
    def accommodates(self):
        return self.sgl_beds + self.dbl_beds * 2

    @accommodates.expression
    def accommodates(cls):
        return (cls.sgl_beds + cls.dbl_beds * 2)

    @hybrid_property
    def number(self):
        return '%d%02d' % (self.floor_no, self.room_no)

    @number.expression
    def number(cls):
        return (func.cast(cls.floor_no, String) +
                func.lpad(func.cast(cls.room_no, postgresql.TEXT), 2, '0'))

#2

I am not really familiar with hybrid properties in SQLAlchemy so it is only a conjecture when I say that you defined Cols to be a list/dict of columns in that model but SA does include the hybrid property “number” in that list/dict, possibly because, indeed, it is not a realm column.

By that, I mean that this is an exception when looking up a key in self.model.__columns__, which is in the SQLAlchemy’s realm, so you should be able to reproduce it standalone outside of Zato too, and then deal with it accordingly (but I do not know how).


#3

Hi, @dsuch.

I think I understand what you mean, but I have tried the case where Cols (a.k.a. Room.table.columns) is not used, and it’s still happening. Let’s go to a more simple case, the room.get service:

class Get(Service):
    """Service class to get a room by id."""
    """Channel /genesisng/rooms/{id}/get."""

    class SimpleIO:
        input_required = (Integer('id'))
        output_optional = ('id', 'floor_no', 'room_no', 'sgl_beds', 'dbl_beds',
                           'supplement', 'code', 'name', 'accommodates',
                           'number')
        skip_empty_keys = True

    def handle(self):
        conn = self.user_config.genesisng.database.connection
        id_ = self.request.input.id

        with closing(self.outgoing.sql.get(conn).session()) as session:
            result = session.query(Room).\
                filter(and_(Room.id == id_, Room.deleted.is_(None))).\
                one_or_none()

            if result:
                self.response.status_code = OK
                self.response.payload = result
            else:
                self.response.status_code = NOT_FOUND

This service is working fine right now, and both hybrid properties number and accommodates are being returned as part of the output JSON.

But if I change the code doing the query to this:

        with closing(self.outgoing.sql.get(conn).session()) as session:
            result = session.query(Room, func.count().over().label('count')).\
                filter(and_(Room.id == id_, Room.deleted.is_(None))).\
                one_or_none()

Then the response is coming back empty:

$ curl -v -g "http://127.0.0.1:11223/genesisng/rooms/1/get"; echo ""
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 11223 (#0)
> GET /genesisng/rooms/1/get HTTP/1.1
> Host: 127.0.0.1:11223
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: Zato
< Date: Fri, 16 Nov 2018 23:00:38 GMT
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: application/json
< X-Zato-CID: 532d7925656f4fc7dadd2b13
< 
* Closing connection 0
{"response": {}}

No error on the logs, just no keys/values in the output JSON. And if I add count to the output_optional tuple, then the output only contains the count key/value pair.

This is the modified service:

class Get(Service):
    """Service class to get a room by id."""
    """Channel /genesisng/rooms/{id}/get."""

    class SimpleIO:
        input_required = (Integer('id'))
        output_optional = ('id', 'floor_no', 'room_no', 'sgl_beds', 'dbl_beds',
                           'supplement', 'code', 'name', 'accommodates',
                           'number', 'count')
        skip_empty_keys = True

    def handle(self):
        conn = self.user_config.genesisng.database.connection
        id_ = self.request.input.id

        with closing(self.outgoing.sql.get(conn).session()) as session:
            result = session.query(Room, func.count().over().label('count')).\
                filter(and_(Room.id == id_, Room.deleted.is_(None))).\
                one_or_none()

            if result:
                self.response.status_code = OK
                self.response.payload = result
            else:
                self.response.status_code = NOT_FOUND

And this is the new response:

$ curl -v -g "http://127.0.0.1:11223/genesisng/rooms/1/get"; echo ""
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 11223 (#0)
> GET /genesisng/rooms/1/get HTTP/1.1
> Host: 127.0.0.1:11223
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: Zato
< Date: Fri, 16 Nov 2018 23:04:17 GMT
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: application/json
< X-Zato-CID: 5ac38c7d2368a88958083cb8
< 
* Closing connection 0
{"response": {"count": 1}}

The query being executed on PostgreSQL is this:

2018-11-17 00:04:17.397 CET [22719] genesisng@genesisng LOG:  execute pg8000_statement_136_14: SELECT room.id AS room_id, room.floor_no AS room_floor_no, room.room_no AS room_room_no, room.name AS room_name, room.sgl_beds AS room_sgl_beds, roo
m.dbl_beds AS room_dbl_beds, room.supplement AS room_supplement, room.code AS room_code, room.deleted AS room_deleted, count(*) OVER () AS count  
        FROM room  
        WHERE room.id = $1 AND room.deleted IS NULL 
2018-11-17 00:04:17.397 CET [22719] genesisng@genesisng DETAIL:  parameters: $1 = '1'

Thanks.


#4

This looks to be a different situation than the one with Cols[key] before,

When you assign result to self.response.payload Zato, ultimately function set_payload_attrs is called, which is defined in this Python module, around line 320.

This is the function that extracts names for keys to be created in responses - please spend some time with it to diagnose why when you provide it a Room vs Room, count(), the latter does not result in the extraction of the said names; if you find anything, I will be happy to include it in one of updates.


#5

Hi, @dsuch. I’ll investigate what happens inside set_payload_attrs and get back to you as soon as possible. Thanks :slight_smile:


#6

Hi, @dsuch.

I’ve finally been able to find some time to try to find what’s happening and here is where I am right now.

SQLAlchemy returns a variety of types depending on what you are querying

  • Case 1:
session.query(Room).\
    filter(and_(Room.id == id_, Room.deleted.is_(None))).\
    one_or_none()

Result type is: <class ‘genesisng.schema.room.Room’>
Output is: <Room(id=‘1’, name=‘Normal bedroom with two single beds’, number=‘101’, accommodates=‘2’)>

  • Case 2:
result = session.query(Room, func.count().over().label('count')).\
   filter(and_(Room.id == id_, Room.deleted.is_(None))).\
   one_or_none()

Result type is: <class 'sqlalchemy.util._collections.result'>
Output is: (<Room(id='1', name='Normal bedroom with two single beds', number='101', accommodates='2')>, 1L)
Result keys are: ['Room', u'count']

  • Case 3:
query = session.query(func.count().over().label('count'))
query = query.add_columns(*Room.__table__.columns)
query = query.offset(0)
query = query.limit(2)
result = query.all()

Output is: [(6L, 1, 1, 1, u'Normal bedroom with two single beds', 2, 0, 20.0, u'pink', None), (6L, 2, 1, 2, u'Large bedroom with two single and one double beds', 2, 1, 40.0, u'black', None)]
Result type is: <type 'list'>
Result elements type is: <class 'sqlalchemy.util._collections.result'>

If you transform result elements into rooms by using dictalchemy:

d = e._asdict()
r = Room()
r.fromdict(d, allow_pk=True)

Then you get, as expected, rooms:

(6L, 1, 1, 1, u'Normal bedroom with two single beds', 2, 0, 20.0, u'pink', None)
(6L, 2, 1, 2, u'Large bedroom with two single and one double beds', 2, 1, 40.0, u'black', None)

I have been able to find out that sqlalchemy.util._collections.result is actually a KeyedTuple (defined at sqlalchemy.util).

Next thing I am trying to do is to see what is happening when setting the payload, as you hinted.

I’ve found the setter and the getter of the payload attribute at line 561:

payload = property(_get_payload, _set_payload)

But I have not been able to find where and how the value of value is created/defined before passing it to _set_payload(self, value) in line 543.

Could you please help me track the code?

Incidentally, the only way to get hybrid properties to be returned when defined at output_required or output_optional is to work with model classes, so the whole fields projection I planned to support in the prototype may be something to discard. But that is another subject.


#7

So I’ve been able to spend more time with this matter and I’ve come to the conclusion that it’s not worth it to modify the way it currently works with SQLAlchemy result sets when being assigned to payloads. Let’s elaborate.

On the one hand, in short, a query to SQLAlchemy using the ORM Query API may return the following result sets:

  1. A model class (e.g. a single genesisng.schema.room.Room):
    session.query(Room).\
    filter(and_(Room.id == id_, Room.deleted.is_(None))).\
    one_or_none()
  1. A model class (e.g. a single genesisng.schema.room.Room) and something else (e.g. an integer count, which, admittedly does not make sense here, but think of the number of whatever you may require):
    session.query(Room, func.count().over().label('count')).\
    filter(and_(Room.id == id_, Room.deleted.is_(None))).\
    one_or_none()
  1. A number of model classes due to joins (e.g. a list of tuples with genesisng.schema.room.Room, genesisng.schema.booking.Booking and genesisng.schema.guest.Guest) and maybe something else (e.g. an integer count):
    session.query(Booking, Room, Guest, func.count().over().label('count')).\
    filter(Booking.id_room == Room.id).\
    filter(Booking.id_guest == Guest.id).\
    all()
  1. A number of columns of different types due to a select on a model class or due to joins (e.g. a list of tuples with all/some columns of the model class genesisng.schema.room.Room)
    session.query(func.count().over().label('count')).\
    add_columns(*Room.__table__.columns).\
    all()

Or a combination/variation of the above.

A few notes:

  1. Only model classes can be inspected (i.e. from sqlalchemy import inspect) as they have the _sa_class_manager attribute, not sqlalchemy.util._collections.result.
  2. When one queries for anything more than a single model class, one gets a sqlalchemy.util._collections.result (or a list of them), which is a fancy name for a sqlalchemy.util.KeyedTuple (which is a SQLAlchemy custom type), created on the fly by SQLAlchemy.
  3. If you navigate into the sqlalchemy.util._collections.result, then you find the expected model classes and other types (an integer in case of the count, different types if retrieving columns).
  4. When you query for columns you don’t get a model class in the result, but you can build it by using dictalchemy, but you need to know what’s inside the result set (how to navigate it, thus difficult to automate, if not impossible).
  5. You can access hybrid properties when working with model classes (as expected), but not when working with columns. Supposedly, you should be able to define expressions on the hybrid properties so that you could query for those columns, but I have not been able to make it work (see opening post in this thread for the expressions on accommodates and number).

On the other hand, SimpleIO, in its version 3, allows you to define a set of attributes that are mandatory to be sent as part of the output (output_required) and another set which are optional (output_optional). These sets of attributes, currently, do not allow complex, multilevel representations (i.e. ListOfDicts or Dict types, which have been declared but not defined/implemented).

So, when we have a SimpleIO service that queries a database through SQLAlchemy, if we are getting one entity (e.g. genesisng.schema.room.Room), then we can assign the result from SQLAlchemy directly to the payload (self.response.payload) and Zato will convert the model class into a dictionary (precisely, line 327), including the hybrid properties of the model class, should you request them in the output_required or output_optional parameters.

When we have anything else (see the cases above), there is no point in altering the current code of the set_payload_attrs() function unless you have a way to match the output attributes to what’s inside the result set (i.e. tell the Zato code what to look for). For instance, should you query for:

    session.query(Booking, Room, Guest, func.count().over().label('count')).\
    filter(Booking.id_room == Room.id).\
    filter(Booking.id_guest == Guest.id).\
    all()

You would be getting this as the result set:

[
    <class 'sqlalchemy.util._collections.result'>
        with keys ['Booking', 'Room', 'Guest', u'count'],
    <class 'sqlalchemy.util._collections.result'>
        with keys ['Booking', 'Room', 'Guest', u'count'],
    ...
]

Thus we would require to be able to declare output_optional in a way similar to this (short, meta-code version):

output_optional = (Booking, Room, Guest, count)
output_repeated = True

Or to this (longer, not-so-meta-code version):

output_optional = (Dict(booking, id, id_guest, id_room, DateTime(reserved), ...),
                   Dict(room, id, floor_no, room_no, ..., accommodates, number),
                   Dict(guest, id, name, surname, gender, Date(birthdate), ...),
                   count)
output_repeated = True

Now should we want custom made, standard structures in the output of all our services, such as:

{
    "response": {
        "data": {},
        "meta": {},
        "error": {},
    }
}

We would have to define it in our output_optional attribute:

output_optional = (ListOfDicts(data, ...), Dict(meta), Dict(error))

An example for a service that requests all bookings from a given guest (by id = 1) could look like this:

output_optional = (ListOfDicts(data, ListOfDicts(booking, Booking),
                                     ListOfDicts(room, Room),
                                     Dict(guest, Guest)),
                   Dict(meta), Dict(error))

Or the long version with all the (desired) attributes of each model class. And the responses could look like this:

{
    "response": {
        "data": {
            "client": {
                "id": 1,
                ...
            },
            "rooms": [
                {"id": 1, "floor_no": 2, "room_no": 14, ...},
                {"id": 7, "floor_no": 3, "room_no": 10, ...},
           ],
           "bookings: [
               { "id": 1, "id_guest": 1, "id_room": 1, "reserved": "2018-12-25", ...},
               { "id": 15, "id_guest": 1, "id_room": 7, "reserved": "2018-12-26", ...},
               ...
           ]
        },
        "meta": {
            "page_number": 1,
            "page_count": 2,
            "page_size": 5,
            ...
        },
        "error": {
            "code": ...
            "message": ...
        }
    }
}

So, as a conclusion, I believe that SimpleIO version 3, as it is now, does a good but limited job and expanding the features of SimpleIO is not about hacking the set_payload_attrs() function but thinking of version 4 of SimpleIO (as I believe is already happening).

What do you think, @dsuch?


#8

Hi @jsabater,

yes, I think in the end leaving it to the new SimpleIO to handle it will be most convenient and productive.

This is something that I am doing anyway so getting into set_payload_attrs now, which is on the lower level of things, would require too much time.

Regards.


#9

Just FYI - the new SimpleIO, still in progress, can now nest Dict and DictList instances to form arbitrary hierarchies of input or output elements.

You can look up ‘test_parse_nested_dict_customer_no_defaults’ in the unittests here to observe how it works.

A related idea that has occurred to me is that it would perhaps be a good time to introduce Django-like models to SimpleIO but it depends on how much more time it is possible to allocate to the rewrite; there are other needs, such as Python 3 support, too.


#10

As per my reasoning in the post above, I believe that to be the solution to a higher level form of SimpleIO. Hard to say if it’s ever going to be high enough, though. That being said, I understand that time and resources are limited and I agree that migrating to Python 3 can have a multiplying effect :slight_smile:


#11

I have more than a half of the rewrite done but data models will not be part of it - I can already estimate that there will not be time before the next release in June.

The most important part of the new SimpleIO is that it is designed and implemented in a plugin-oriented manner, including most of what is built in. Over time, it will be easy to add user-defined data formats or elements.

Originally, I meant for it to be backported to 3.0 but this is too big a change. It is completely backward-compatible but still, it should not be included as one of regular updates to support/3.0.


#12

Bit by bit :wink:

[reporter mode on] What else is being cooked? :slight_smile:


#13

The SimpleIO refactoring, Python 3 support and closer integration with Docker (Compose, Kubernetes) are in for sure. New connection types - SFTP, LDAP (Active Directory) and MS SQL Server.

With SFTP in place, some kind of a file transfer solution with s GUI will be potentially possible seeing as we already have FTP and FTPS.

Other than that - more support for deployments to/with AWS and HashiCorp Terraform.

New tutorials, including a reworked API/SOA one and publish/subscribe topics will be included too.

As with each new release, Zato will continue to be faster and leaner - for instance, the new SimpleIO framework is written in Cython.


Plugin mechanism for outgoing connectors
#14

Lovely to have better integration with container technologies, and new connection types. I agree this has priority over entities in SimpleIO :cherries:


#15

I don’t want to derail even more this thread to the future platform updates, but I don`t even need to express how happy I am to hear Python 3, SFTP, FTP and FTPS in the same post.

Considering my current scenarios I would love if the platform streamlines the different types of remote access and without too much code change to replace an FTP for an SFTP, for example, with an abstract layer between the direct library access. This could also open up new possibilities for new libraries to be easily integrated by us, instead of relying on externalizing them without using the Zato resources.

A Zato file managed workflow would be a nice addition but since I already have this implemented in my services, I personally don’t have any rush in replacing such systems. So if this is an use case not so relevant immediately to others, you could postpone it easily instead of creating something for the initial release.

Also hoping for an update in the ‘fs’ library during the migration to the Python 3 version, since I have some suspicion of errors on my remote calls being caused by this lib which has several years of updates compared to the version used in Zato. Hopefully a more recent version will fare better (although the API is vastly changed and maybe it will represent a lot of additional work for you).

Thanks for the excellent news and if I could help in any way on providing more information for such features and relevant use cases, just get in touch!