Change what we do with manually triggered dag_runs to being pruned too. #13480
Summary of Changes
Hello @apilaskowski, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request updates the Airflow database cleanup script to incorporate manually triggered DAG runs into the pruning process. The change ensures that all types of DAG runs, including those initiated externally, are subject to the same age-based data retention policies, addressing a customer need to manage the overall data footprint more comprehensively.
Highlights
- DagRun Pruning Logic Enhancement: Modified the `airflow_db_cleanup.py` script to extend the pruning logic to include manually triggered DAG runs. Previously, these runs were explicitly excluded from age-based cleanup.
- Universal Age-based Filtering: Ensured that the age-based pruning filter (`age_check_column <= max_date`) is applied universally to all database models, including `DagRun` entries, by moving it outside of conditional blocks specific to `DagRun`.
- Code Simplification: Simplified the `build_query` function by removing redundant conditional logic and consolidating the age-based filter, making the pruning mechanism more consistent and easier to understand.
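The behavioural difference described in the highlights can be sketched in plain Python. This is only an illustration under assumptions, not the code from `airflow_db_cleanup.py`: the `Run` class and the `select_before` / `select_after` helpers are hypothetical names, and the real script builds a database query rather than filtering a list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Run:
    """Hypothetical stand-in for a dag_run row."""
    dag_id: str
    execution_date: datetime
    external_trigger: bool  # True for manually triggered runs


def select_before(runs, max_date):
    """Behaviour as described before the change: manual runs are never selected."""
    return [
        r for r in runs
        if not r.external_trigger and r.execution_date <= max_date
    ]


def select_after(runs, max_date):
    """Behaviour as described after the change: the age filter applies to every run."""
    return [r for r in runs if r.execution_date <= max_date]
```

With one old scheduled run, one old manual run, and one recent manual run, `select_before` would pick only the old scheduled run, while `select_after` would pick both old runs and leave the recent one alone.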
Code Review
This pull request updates the Airflow database cleanup DAG to prune manually triggered `dag_run` records, which were previously exempt from cleanup. The implementation correctly adjusts the query logic to include these runs while preserving the newest scheduled run for each DAG to prevent unintended re-triggers. The associated refactoring also simplifies the code by consolidating the date-based filtering. The changes appear correct and align with the stated goal.
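The rule the review describes (prune by age, but keep the newest scheduled run of each DAG) could look roughly like the sketch below. It is a plain-Python illustration under assumptions: `runs_to_prune` is a hypothetical helper, and runs are modelled as `(dag_id, execution_date, external_trigger)` tuples instead of database rows.

```python
from datetime import datetime


def runs_to_prune(runs, max_date):
    """Return the runs old enough to prune, keeping the newest scheduled run per DAG.

    `runs` is a list of (dag_id, execution_date, external_trigger) tuples;
    this shape is an assumption made for the example.
    """
    # Find the newest scheduled (not manually triggered) run for each DAG.
    newest_scheduled = {}
    for dag_id, execution_date, external_trigger in runs:
        if external_trigger:
            continue
        current = newest_scheduled.get(dag_id)
        if current is None or execution_date > current:
            newest_scheduled[dag_id] = execution_date

    selected = []
    for dag_id, execution_date, external_trigger in runs:
        is_newest_scheduled = (
            not external_trigger
            and newest_scheduled.get(dag_id) == execution_date
        )
        # The age filter applies to every run; only the newest scheduled run is spared.
        if execution_date <= max_date and not is_newest_scheduled:
            selected.append((dag_id, execution_date, external_trigger))
    return selected
```

In this sketch a manually triggered run is never protected, even if it is the most recent run of its DAG, which matches the description that only the newest scheduled run is preserved.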
@rafalbiegacz please review
Change looks good to me.
Description
We decided to change how dag_runs that are triggered externally (not by the schedule) are treated.
Previously they were skipped completely, but some customers trigger DAGs manually and want that information pruned too.
Now those dag_runs are pruned the same way as scheduled dag_runs.
Checklist
- `nox -s py-3.9` (see Test Environment Setup)
- `nox -s lint` (see Test Environment Setup)