4 Many DAGs
While create_dag is great for creating a single DAG, part of what makes gusty so convenient is that creating any number of new DAGs can be as easy as just making a new folder in a directory. Once you get to the point where you’re just creating new folders full of Task Definition Files and metadata, you and your team get to think about Airflow less and can instead focus on defining the core components of your workflows.
To help facilitate this growth, gusty also provides a create_dags function, for generating multiple DAGs. With create_dags, instead of passing a path to a single DAG folder, you’ll pass in a directory path where many DAG folders reside.
In the example below, we’ll make a “home” for all of our gusty DAGs inside a directory named gusty_dags. Inside the gusty_dags directory are two DAGs, hello_dag and goodbye_dag.
$AIRFLOW_HOME/dags/
│
├── gusty_dags/
│ │
│ ├── goodbye_dag/
│ │ ├── METADATA.yml
│ │ └── goodbye.yml
│ │
│ └── hello_dag/
│ ├── METADATA.yml
│ └── hello.yml
│
│
└── gusty_dags.py
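To make the folder-per-DAG idea concrete, here is a minimal sketch that rebuilds the layout above in a temporary directory and lists the DAG folders it contains. It does not call gusty; list_dag_folders is a hypothetical helper written only for illustration, and the sketch assumes that each immediate subfolder of gusty_dags corresponds to one DAG, as in the tree above.

```python
import tempfile
from pathlib import Path


def list_dag_folders(dags_root):
    """Return the sorted names of the immediate subfolders of dags_root.

    Hypothetical helper for illustration; not part of gusty.
    """
    return sorted(p.name for p in Path(dags_root).iterdir() if p.is_dir())


# Build a throwaway copy of the layout above so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "gusty_dags"
    layout = [("hello_dag", "hello.yml"), ("goodbye_dag", "goodbye.yml")]
    for dag_name, task_file in layout:
        dag_dir = root / dag_name
        dag_dir.mkdir(parents=True)
        (dag_dir / "METADATA.yml").touch()  # DAG-level metadata
        (dag_dir / task_file).touch()       # one Task Definition File
    dag_folders = list_dag_folders(root)

print(dag_folders)  # ['goodbye_dag', 'hello_dag']
```

Adding a third DAG under this layout would mean adding a third folder next to the other two.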
4.1 Using create_dags
Now, we’ll use the create_dags function in gusty_dags.py to generate multiple DAGs in a single file! Here’s what our gusty_dags.py file looks like:
import os
from gusty import create_dags
from gusty.utils import days_ago

# gusty_dags_dir returns something like: "/usr/local/airflow/dags/gusty_dags"
gusty_dags_dir = os.path.join(
    os.environ["AIRFLOW_HOME"],
    "dags",
    "gusty_dags")

create_dags(
    gusty_dags_dir,
    globals(),
    schedule="0 0 * * *",
    catchup=False,
    default_args={
        "owner": "you",
        "email": "you@you.com",
        "start_date": days_ago(1)
    },
    wait_for_defaults={
        "mode": "reschedule"
    },
    extra_tags=["gusty_dags"],
    latest_only=False)
The above will create both hello_dag and goodbye_dag DAGs, which reside inside of the gusty_dags_dir defined in gusty_dags.py.
The second argument, globals(), assigns the DAGs to the global environment, so Airflow can find the DAGs.
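The role of globals() can be sketched without Airflow. Airflow picks up DAG objects that are bound to module-level names in a DAG file, so a function that builds several DAGs needs a namespace to bind them into. The example below is only an illustration of that pattern, not gusty's implementation: FakeDAG and register_dags are hypothetical stand-ins, and a plain dictionary plays the part of the module's global namespace.

```python
class FakeDAG:
    """Hypothetical stand-in for an Airflow DAG; it only carries a dag_id."""

    def __init__(self, dag_id):
        self.dag_id = dag_id


def register_dags(dag_ids, namespace):
    """Bind one FakeDAG per id into the given namespace dictionary."""
    for dag_id in dag_ids:
        namespace[dag_id] = FakeDAG(dag_id)


# A plain dict stands in for the dictionary that globals() returns.
namespace = {}
register_dags(["hello_dag", "goodbye_dag"], namespace)

# Anything scanning the namespace can now find both objects by name.
found = sorted(
    name for name, obj in namespace.items() if isinstance(obj, FakeDAG)
)
print(found)  # ['goodbye_dag', 'hello_dag']
```

Passing globals() instead of the plain dictionary would bind hello_dag and goodbye_dag as module-level variables of the calling file, which is where Airflow looks.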
schedule, catchup, and default_args are arguments accepted by the Airflow DAG object.
wait_for_defaults, extra_tags, and latest_only are all gusty-specific create_dag arguments. wait_for_defaults and latest_only were discussed in earlier sections. extra_tags is a list of additional tags appended to any existing tags specified in either create_dag or a METADATA.yml file.
4.2 The Power of create_dags
The value of create_dags is that multiple DAGs can be created with common schedules, default arguments, tags, and more, while each DAG can still carry DAG-specific information, such as documentation (e.g. description and doc_md) and tags, inside its own METADATA.yml.
In gusty, METADATA.yml takes precedence over any create_dag argument, so you can override anything set in create_dags with the DAG-specific METADATA.yml.
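The precedence rule can be pictured as a dictionary merge in which DAG-specific values win. The sketch below is an assumption-laden illustration rather than gusty's actual code: resolve_dag_settings is a hypothetical helper, the metadata dictionary stands in for an already-parsed METADATA.yml, and the description, schedule, and tag values are made up for the example.

```python
def resolve_dag_settings(shared_kwargs, metadata, extra_tags=()):
    """Merge shared settings with one DAG's metadata (hypothetical helper).

    Keys in metadata override keys in shared_kwargs, and extra_tags are
    appended to whatever tags the merged settings already contain.
    """
    settings = {**shared_kwargs, **metadata}  # later dict wins on conflicts
    settings["tags"] = list(settings.get("tags", [])) + list(extra_tags)
    return settings


# Settings shared by every DAG, as passed to create_dags.
shared = {"schedule": "0 0 * * *", "catchup": False}

# Stand-in for hello_dag's parsed METADATA.yml (values are illustrative).
hello_metadata = {
    "description": "Says hello",
    "schedule": "0 12 * * *",
    "tags": ["greetings"],
}

resolved = resolve_dag_settings(shared, hello_metadata, extra_tags=["gusty_dags"])
print(resolved["schedule"])  # 0 12 * * *
print(resolved["catchup"])   # False
print(resolved["tags"])      # ['greetings', 'gusty_dags']
```

Here hello_dag keeps the shared catchup setting, replaces the shared schedule with its own, and ends up with both its own tag and the extra tag.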
Now you have the building blocks to use file-oriented orchestration in Airflow with gusty!