5 Using Constructors
5.1 What are Constructors?
Constructors are functions you can invoke in your YAML. These functions are invoked every time your Task Definition File is loaded during gusty’s DAG creation process.
Constructors are available to us thanks to the PyYAML package.
To better understand constructors, let’s orient ourselves around a simple Python function called double_it:
def double_it(x):
    return x + x
If we were to run double_it(2), we’d get back 4.
To invoke double_it from YAML, we begin our value entry with an exclamation point (!), as illustrated below:
some_argument: !double_it 2
When this YAML is loaded, the argument some_argument in our YAML will be assigned the value 4.
You can also use keyword arguments (i.e. double_it(x=2)) with constructors:
some_argument: !double_it
  x: 2
The above will still result in some_argument taking on the value of 4.
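To make the mechanics concrete, here is a minimal sketch of how a function like double_it could be wired up as a PyYAML constructor by hand. This is not gusty's own implementation: the adapter double_it_constructor and the loader subclass DemoLoader are hypothetical names chosen for illustration. It assumes PyYAML is installed and handles both the positional and the keyword form shown above.

```python
import yaml


def double_it(x):
    return x + x


def double_it_constructor(loader, node):
    """Hypothetical adapter that lets PyYAML call double_it for the !double_it tag."""
    if isinstance(node, yaml.MappingNode):
        # Keyword form: the indented mapping becomes keyword arguments.
        kwargs = loader.construct_mapping(node, deep=True)
        return double_it(**kwargs)
    # Positional form: a tagged scalar arrives as a string,
    # so parse it as YAML to recover its natural type (here, an int).
    raw = loader.construct_scalar(node)
    return double_it(yaml.safe_load(raw))


class DemoLoader(yaml.SafeLoader):
    """Loader subclass so the registration does not affect SafeLoader itself."""


DemoLoader.add_constructor("!double_it", double_it_constructor)

positional = yaml.load("some_argument: !double_it 2", Loader=DemoLoader)
keyword = yaml.load("some_argument: !double_it\n  x: 2", Loader=DemoLoader)
print(positional)
print(keyword)
```

Both documents load to a dictionary in which some_argument holds 4. With gusty you do not need to write this adapter yourself; the sketch only shows what happens when a tagged value is loaded.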
5.2 Using Constructors with gusty
gusty makes it easy for you to leverage YAML constructors. The simplest way to leverage your functions as YAML constructors within gusty is to use the Airflow DAG object’s built-in user_defined_macros argument. When you pass a dictionary of functions/macros to user_defined_macros, gusty will make all of those functions/macros available to you as YAML constructors.
Your call to create_dag might look something like this:
create_dag(
    ...,
    user_defined_macros={
        "double_it": double_it
    }
)
Then, in a Task Definition File, you could leverage double_it both as a YAML constructor and, just as in any other Airflow task, as a Jinja macro. Here’s a BashOperator example:
operator: airflow.operators.bash.BashOperator
retries: !double_it 4
bash_command: echo {{ double_it("hello") }}
The above would result in a task with 8 retries and a bash command that (when executed) would echo hellohello.
An important note on the timing of function evaluation: double_it is used twice above, once as a YAML constructor in the retries argument and once as a Jinja macro in the bash_command argument. The YAML constructor will be evaluated every time the DAG is generated, which is once every few minutes by default (in Airflow). The Jinja macro will only be evaluated when the task is executed.
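The difference in timing can be observed with plain PyYAML, outside of gusty and Airflow. The sketch below is an illustration under assumptions: TimingLoader, _double_it_tag, and the calls list are hypothetical names, and the constructor only supports the positional integer form. It loads the Task Definition File from above and records every call to double_it.

```python
import yaml

calls = []  # records every argument double_it receives


def double_it(x):
    calls.append(x)
    return x + x


class TimingLoader(yaml.SafeLoader):
    """Hypothetical loader used only for this illustration."""


def _double_it_tag(loader, node):
    # Positional form only; the tagged value arrives as a string, so convert it.
    return double_it(int(loader.construct_scalar(node)))


TimingLoader.add_constructor("!double_it", _double_it_tag)

task_yaml = """
operator: airflow.operators.bash.BashOperator
retries: !double_it 4
bash_command: echo {{ double_it("hello") }}
"""

spec = yaml.load(task_yaml, Loader=TimingLoader)
print(spec["retries"])       # computed while the YAML was loaded
print(spec["bash_command"])  # still the unrendered Jinja template
print(calls)                 # shows which calls have happened so far
```

After loading, retries is already a number, while bash_command is still the literal template text and double_it has not yet been called with "hello". That second call only happens when Airflow renders the template at task execution.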
5.3 Built-in Constructors
gusty
gusty contains a few built-in constructors, primarily to make creating a DAG using METADATA.yml easy. The three built-in constructors are datetime, timedelta, and days_ago, which simply provides a datetime object for as many days ago as you specify.
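As a rough mental model for days_ago, the sketch below builds a comparable helper from the standard library. It is a hypothetical stand-in, not gusty's code: the name days_ago_sketch is invented here, and the choice to snap to midnight UTC is an assumption rather than something the text above states.

```python
from datetime import datetime, timedelta, timezone


def days_ago_sketch(n):
    """Hypothetical stand-in: midnight UTC, n days before today (assumed behavior)."""
    today = datetime.now(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return today - timedelta(days=n)


start = days_ago_sketch(1)
print(start)
```

A value like this is typically what a DAG's start date needs, which is why such a helper is convenient to have available as a constructor in METADATA.yml.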
ABSQL
The YAML loading functionality for gusty is maintained in a separate, lightweight project called ABSQL.
The ABSQL package ships with a handful of default functions, which are also available to you as both YAML constructors and macros within gusty DAGs.