I am attempting to use SQLAlchemy more fully, rather than just falling back to pure SQL at the first sign of distress.  In this case, I have a table in a Postgres database (9.5) which stores a set of integers as a group by associating individual items atom_id with a group identifier group_id.
Given a list of atom_ids, I'd like to be able to figure out which group_id, if any, that set of atom_ids belong to.  Solving this with just the group_id and atom_id columns was straightforward.
Now I'm trying to generalize such that a 'group' is made up of not just a list of atom_ids, but other context as well.  In the example below, the list is ordered by including a sequence column, but conceptually other columns could be used instead, such as a weight column which gives each atom_id a [0,1] floating point value representing that atom's 'share' of the group.
Below is most of a unit test demonstrating my issue.
First, some setup:
def test_multi_column_grouping(self):
    class MultiColumnGroups(base.Base):
        __tablename__ = 'multi_groups'
        group_id = Column(Integer)
        atom_id = Column(Integer)
        sequence = Column(Integer)  # arbitrary 'other' column.  In this case, an integer, but it could be a float (e.g. weighting factor)
    base.Base.metadata.create_all(self.engine)
    # Insert 6 rows representing 2 different 'groups' of values
    vals = [
        # Group 1
        {'group_id': 1, 'atom_id': 1, 'sequence': 1},
        {'group_id': 1, 'atom_id': 2, 'sequence': 2},
        {'group_id': 1, 'atom_id': 3, 'sequence': 3},
        # Group 2
        {'group_id': 2, 'atom_id': 1, 'sequence': 3},
        {'group_id': 2, 'atom_id': 2, 'sequence': 2},
        {'group_id': 2, 'atom_id': 3, 'sequence': 1},
    ]
    self.session.bulk_save_objects(
        [MultiColumnGroups(**x) for x in vals])
    self.session.flush()
    self.assertEqual(6, len(self.session.query(MultiColumnGroups).all()))
Now, I want to query the above table to find which group a specific set of inputs belongs to. I'm using a list of (named) tuples to represent the query parameters.
    from collections import namedtuple
    Entity = namedtuple('Entity', ['atom_id', 'sequence'])
    values_to_match = [
        # (atom_id, sequence)
        Entity(1, 3),
        Entity(2, 2),
        Entity(3, 1),
        ]
    # The above list _should_ match with `group_id == 2`
Raw SQL solution. I'd prefer not to fall back on this, as a part of this exercise is to learn more SQLAlchemy.
    r = self.session.execute('''
        select group_id
        from multi_groups
        group by group_id
        having array_agg((atom_id, sequence)) = :query_tuples
        ''', {'query_tuples': values_to_match}).fetchone()
    print(r)  # > (2,)
    self.assertEqual(2, r[0])
Here is the above raw-SQL solution converted fairly directly into a 
broken SQLAlchemy query.  Running this produces a psycopg2 error: (psycopg2.ProgrammingError) operator does not exist: record[] = integer[].  I believe that I need to cast the array_agg into an int[]? That would work so long as the grouping columns are all integers (which, if need be, is an acceptable limitation), but ideally this would work with mixed-type input tuples / table columns.
    from sqlalchemy import tuple_
    from sqlalchemy.dialects.postgresql import array_agg
    existing_group = self.session.query(MultiColumnGroups).\
        with_entities(MultiColumnGroups.group_id).\
        group_by(MultiColumnGroups.group_id).\
        having(array_agg(tuple_(MultiColumnGroups.atom_id, MultiColumnGroups.sequence)) == values_to_match).\
        one_or_none()
    self.assertIsNotNone(existing_group)
    print('|{}|'.format(existing_group))
Is the above session.query() close?  Have I blinded myself here, and am missing something super obvious that would solve this problem in some other way?


