Query module: reduce equivalent encapsulated conditions #430

@sadielbartholomew

Whilst investigating #427 I noticed that trivially equal and/or equivalent conditions within a compound condition are listed and considered separately. E.g., using a clean, as-is lama-to-dask branch (hence including this in the Dask/post-Dask scope of work), I see:

>>> import cf
>>> # Generating a trivially-redundant query condition:
>>> q = cf.lt(9)
>>> c = q & q & q
>>> c
<CF Query: [[(lt 9) & (lt 9)] & (lt 9)]>
>>> # Same goes with different objects but equal queries:
>>> p = cf.lt(9)
>>> q.equals(p)
True
>>> c = q & p
>>> c
<CF Query: [(lt 9) & (lt 9)]>
>>> # Now with unit considerations to generate some equivalency:
>>> r = cf.lt(9000)
>>> q.set_condition_units('km')
>>> r.set_condition_units('m')
>>> q
<CF Query: (lt 9.0 km)>
>>> r
<CF Query: (lt 9000 m)>
>>> q.equals(r)
False
>>> q.equivalent(r)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sadie/cf-python/cf/query.py", line 822, in equivalent
    _DEPRECATION_ERROR_FUNCTION(self, "equivalent")
  File "/home/sadie/cf-python/cf/functions.py", line 3205, in _DEPRECATION_ERROR_FUNCTION
    raise DeprecationError(
cf.functions.DeprecationError: Function <CF Query: (lt 9.0 km)> has been deprecated at version 3.0.0 and is no longer available and will be removed at version 4.0.0. equivalent
>>> # Not sure why equivalent has been deprecated here? What can I use instead?
>>> s = q & r
>>> s
<CF Query: [(lt 9.0 km) & (lt 9000 m)]>

(Essentially the same applies on the master branch, but set_condition_units is a method new to the lama-to-dask branch, so the unit-related part of the snippet won't work there.) @davidhassell note also the question in the snippet's final comment!

Whilst it is unlikely that a user would directly write equivalent conditions into a query, they could arise, without being obvious, through the accumulation of a combination of complex conditions, so this may not be too uncommon. Regardless, I think we should, ideally, simplify such conditions as they are built up. E.g., taking the above example, c = q & q & q would become c = q via c = (q & q) & q -> c = q & q -> c = q, with more complex expressions being reduced piece-by-piece in the same way.

Note I am not suggesting we pre-evaluate the queries themselves, only that we check the logical equivalency of the encapsulated conditions as the query is built.
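
To make the idea concrete, here is a minimal sketch of that kind of build-time reduction. It deliberately does not use cf at all: Cond, Compound, combine, key and the _TO_METRES table are all hypothetical stand-ins invented for illustration, and the unit handling is a toy lookup rather than cf's units machinery. It only shows that comparing a canonical form of each operand when & or | is applied is enough to collapse the cases above, without evaluating any condition against data.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical conversion factors to a common base unit (metres).
# A stand-in for real unit handling, not cf's units machinery.
_TO_METRES = {"m": 1.0, "km": 1000.0}


@dataclass(frozen=True)
class Cond:
    """Toy atomic condition such as (lt 9 km). Not cf.Query."""

    operator: str
    value: float
    units: Optional[str] = None

    def key(self):
        """Canonical form for comparing conditions without evaluating them."""
        if self.units is None:
            return (self.operator, float(self.value), None)
        # Express the value in the base unit so 9 km and 9000 m compare equal.
        return (self.operator, self.value * _TO_METRES[self.units], "m")

    def __and__(self, other):
        return combine("&", self, other)

    def __or__(self, other):
        return combine("|", self, other)


@dataclass(frozen=True)
class Compound:
    """Toy compound condition: two conditions joined by '&' or '|'."""

    op: str
    left: "Node"
    right: "Node"

    def key(self):
        return (self.op, self.left.key(), self.right.key())

    def __and__(self, other):
        return combine("&", self, other)

    def __or__(self, other):
        return combine("|", self, other)


Node = Union[Cond, Compound]


def combine(op: str, left: Node, right: Node) -> Node:
    """Join two conditions, reducing x & x and x | x to x as they are built."""
    if left.key() == right.key():
        return left
    return Compound(op, left, right)


q = Cond("lt", 9)
c = q & q & q  # (q & q) reduces to q, then q & q reduces to q again
print(c is q)
```

This sketch only handles the idempotent case (both operands equivalent). It would not notice, for example, that (a & b) and (b & a) are the same condition, so a real implementation would have to decide how far such canonicalisation should go.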

Is this a good idea, conceptually? I imagine it is possible without much, if any, boilerplate code, but if it does require a lot then, even if conceptually desirable, maybe it should not be implemented.

(This is not an urgent or ground-breaking enhancement, just something that could be improved.)

Labels: dask, enhancement, question