add readonly endpoints for dagruns#9153
Conversation
There was a problem hiding this comment.
| # This endpoint allows specifying ~ as the dag_id to retrieve DAG Runs for all DAGs. | |
| if dag_id == '~': | |
| dag_run = query.all() | |
| return dagrun_collection_schema.dump(DAGRunCollection( | |
| dag_runs=dag_run, | |
| total_entries=len(dag_run)) | |
| ) | |
| query = query.filter(DagRun.dag_id == dag_id) | |
| if dag_id != '~': | |
| query = query.filter(DagRun.dag_id == dag_id) |
We still want to have filters and paginations. ~ is a wildcard. It allows any value, so we should give up filtering in this field.
There was a problem hiding this comment.
| if start_date_gte and start_date_lte: | |
| query = query.filter(DagRun.start_date <= start_date_lte, | |
| DagRun.start_date >= start_date_gte) | |
| elif start_date_gte and not start_date_lte: | |
| query = query.filter(DagRun.start_date >= start_date_gte) | |
| elif start_date_lte and not start_date_gte: | |
| query = query.filter(DagRun.start_date <= start_date_lte) | |
| if start_date_gte: | |
| query = query.filter(DagRun.start_date >= start_date_gte) | |
| if start_date_lte: | |
| query = query.filter(DagRun.start_date <= start_date_lte) |
Is this causing any problems?
There was a problem hiding this comment.
Why do we have a string here? The specification define object here.
There was a problem hiding this comment.
| self.assertEqual(response.json.get('total_entries'), 2) | |
| self.assertEqual(response.json['total_entries'], 2) |
There was a problem hiding this comment.
I would prefer to use a fixed date in tests. This allows us to use text strings in tests. Now it is very difficult for me to check if the data format is correct.
There was a problem hiding this comment.
What do you think about webargs library? Would it be helpful to you?
There was a problem hiding this comment.
I found that this can solve it for us without any validation code from our side:
app.add_api("openapi.yaml", strict_validation=True)
https://connexion.readthedocs.io/en/latest/request.html#parameter-validation
There was a problem hiding this comment.
I will be using webargs. Connexion no longer validate date-format because of license of one of its library.
spec-first/connexion#476
There was a problem hiding this comment.
I'm glad you did research. We can change it in a separate change. Now it works, so it's great. In next changes, we will be able to make a refactor and remove repetitive code fragments.
There was a problem hiding this comment.
It should. be defined as a custom fields. This allows you to eliminate some hacks from the code.
class DagStateField(fields.String):
def __init__(self, **metadata):
super().__init__(**metadata)
self.validators = (
[validate.OneOf(State.dag_states)] + list(self.validators)
)There was a problem hiding this comment.
| session.add_all(dagruns) | |
| session.add_all(dagruns) | |
| dagruns[0].dag_id.= "TEST_DAG_ID_2" |
Can you check here if two different DAGs can be fetched?
There was a problem hiding this comment.
~ means that you can fetch any DAG ID. Your assertion should check if you have two DAGs e.g. DAG A and DAG B, you can fetch them. For now, you only have one DAG, which does not allow you to check the behavior of this view.
There was a problem hiding this comment.
What do you think to assert the start/execution dates here? Identifiers poorly describe these objectshere.
There was a problem hiding this comment.
| ["TEST_START_EXEC_DAY_10"], | |
| ["TEST_START_EXEC_DAY_10", "TEST_START_EXEC_DAY_11"], |
We have an inclusive filter, so it should probably be included as well. However, this may change in the future.
Related issue:
#9237
There was a problem hiding this comment.
This seems like a problem with <= operator in sqlalchemy. It returns only the less than values.
I have also tried separating the operator and using the or_ operator but it's still not inclusive
I am wondering if there is another way I could do it?
There was a problem hiding this comment.
I have a day off today, so I have limited options to check it out, but it can help you.
https://github.com/apache/airflow/blob/master/TESTING.rst#tracking-sql-statements
If something can be done at the SQL level, then SQL can do it too.
There was a problem hiding this comment.
Thanks for the link, it helped.
What was happening was that on the view, dates that have this format '2020-06-11T18:00:00+00:00' are changed to '2020-06-11T18:00:00 00:00'. That is, the + are replaced with empty space. So when it is parsed, only the date part were parsed. I then replaced +00:00 with Z and the problem was solved.
There was a problem hiding this comment.
This is the command I used to investigate:
pytest --trace-sql=num,sql,parameters --capture=no tests/api_connexion/endpoints/test_dag_run_endpoint.py -k test_date_filters_gte_and_lte_1_api_v1_dags_TEST_DAG_ID_dagRuns_start_date_lte_2020_06_11T18_00_00_00_00
I set pdb debuger at this point in dag_run_endpoint:
if start_date_lte: import pdb; pdb.set_trace() query = query.filter(DagRun.start_date <= timezone.parse(start_date_lte))
Check image below:
|
I would like to review this change once more and look at the filter by dates, but today I see that there is a merge conflict. if you have moments could you do a rebase? |
mik-laj
left a comment
There was a problem hiding this comment.
One test is sad. We need to fetch sorted items from the database.
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>

Closes: #8129
Make sure to mark the boxes below before creating PR: [x]
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.