5  Using Constructors

5.1 What are Constructors?

Constructors are functions you can invoke in your YAML. These functions are invoked every time your Task Definition File is loaded during gusty’s DAG creation process.

Constructors are available to us thanks to the PyYAML package.

To better understand constructors, let’s orient ourselves around a simple Python function, called double_it:

def double_it(x):
  return x + x

If we were to run double_it(2), we’d get back 4.

To invoke double_it from YAML, we begin our value entry with an exclamation point (!), as illustrated below:

some_argument: !double_it 2

When this YAML is loaded, the argument some_argument in our YAML will be assigned the value 4.

You can also use keyword arguments (i.e. double_it(x=2)) with constructors:

some_argument: !double_it
  x: 2

The above will still result in some_argument taking on the value of 4.
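
For readers curious how a tag such as !double_it can be wired up, here is a minimal sketch using PyYAML's add_constructor. The SketchLoader class and the construct_double_it function are hypothetical names chosen for illustration; this is not necessarily how gusty or ABSQL register constructors internally.

```python
import yaml


def double_it(x):
    return x + x


class SketchLoader(yaml.SafeLoader):
    """Hypothetical loader subclass, so the shared SafeLoader stays untouched."""


def construct_double_it(loader, node):
    # Mapping form (`!double_it` followed by `x: 2`): pass keys as keyword arguments.
    if isinstance(node, yaml.MappingNode):
        return double_it(**loader.construct_mapping(node, deep=True))
    # Scalar form (`!double_it 2`): the custom tag bypasses the usual type
    # detection, so re-resolve the raw text to turn "2" into the integer 2.
    tag = loader.resolve(yaml.ScalarNode, node.value, (True, False))
    value = loader.construct_object(yaml.ScalarNode(tag, node.value))
    return double_it(value)


SketchLoader.add_constructor("!double_it", construct_double_it)

positional = yaml.load("some_argument: !double_it 2", Loader=SketchLoader)
keyword = yaml.load("some_argument: !double_it\n  x: 2", Loader=SketchLoader)
print(positional, keyword)
```

In this sketch both forms load to a dictionary in which some_argument is 4, matching the behavior described above.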

5.2 Using Constructors with gusty

gusty makes it easy for you to leverage YAML constructors. The simplest way to use your functions as YAML constructors within gusty is to use the Airflow DAG object’s built-in user_defined_macros argument. When you pass a dictionary of functions/macros to user_defined_macros, gusty will make all of those functions/macros available to you as YAML constructors.

Your call to create_dag might look something like this:

create_dag(
  ...,
  user_defined_macros={
    "double_it": double_it
  }
)
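
To give a feel for how a dictionary of functions could become a set of YAML tags, the sketch below registers each callable under a !name tag with PyYAML. The names MacroLoader, make_constructor, and register_macros, as well as the extra shout entry, are hypothetical and exist only for this illustration; gusty's own mechanism may differ.

```python
import yaml


def double_it(x):
    return x + x


class MacroLoader(yaml.SafeLoader):
    """Hypothetical loader subclass used only in this sketch."""


def make_constructor(func):
    # A factory gives each constructor its own `func`, avoiding the
    # late-binding pitfall of defining closures directly inside a loop.
    def construct(loader, node):
        if isinstance(node, yaml.MappingNode):
            return func(**loader.construct_mapping(node, deep=True))
        if isinstance(node, yaml.SequenceNode):
            return func(*loader.construct_sequence(node, deep=True))
        # Scalar: re-resolve the raw text so "4" becomes the integer 4.
        tag = loader.resolve(yaml.ScalarNode, node.value, (True, False))
        return func(loader.construct_object(yaml.ScalarNode(tag, node.value)))

    return construct


def register_macros(loader_cls, macros):
    """Expose every callable in `macros` as a `!name` YAML tag."""
    for name, func in macros.items():
        if callable(func):
            loader_cls.add_constructor(f"!{name}", make_constructor(func))


# "shout" is an illustrative extra macro, not something gusty provides.
register_macros(MacroLoader, {"double_it": double_it, "shout": str.upper})
loaded = yaml.load("retries: !double_it 4\ngreeting: !shout hello", Loader=MacroLoader)
print(loaded)
```

Registering on a loader subclass rather than on yaml.SafeLoader itself keeps the custom tags from leaking into unrelated YAML loading elsewhere in a program.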

Then, in a Task Definition File, you can use double_it both as a YAML constructor and, just as in any other Airflow task, as a Jinja macro. Here’s a BashOperator example:

operator: airflow.operators.bash.BashOperator
retries: !double_it 4
bash_command: echo {{ double_it("hello") }}

The above would result in a task with 8 retries and a bash command that (when executed) would echo hellohello.

An important note on the timing of function evaluation: double_it is used twice above, once as a YAML constructor in the retries argument and once as a Jinja macro in the bash_command argument. The YAML constructor is evaluated every time the DAG is generated, that is, whenever Airflow re-parses the DAG file, which it does periodically at an interval set by your Airflow configuration. The Jinja macro is evaluated only when the task is executed.
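
The difference in timing can be observed with plain PyYAML, without Airflow. In the sketch below, TimingLoader, construct_double_it, and the calls list are hypothetical names used for illustration; loading the same text twice stands in for two rounds of DAG generation.

```python
import yaml

calls = []  # records every evaluation of double_it


def double_it(x):
    calls.append(x)
    return x + x


class TimingLoader(yaml.SafeLoader):
    """Hypothetical loader subclass used only in this sketch."""


def construct_double_it(loader, node):
    # Kept deliberately simple: only the integer scalar form is handled.
    return double_it(int(loader.construct_scalar(node)))


TimingLoader.add_constructor("!double_it", construct_double_it)

TASK_YAML = """
operator: airflow.operators.bash.BashOperator
retries: !double_it 4
bash_command: echo {{ double_it("hello") }}
"""

# Simulate two DAG generations by loading the same file contents twice.
first = yaml.load(TASK_YAML, Loader=TimingLoader)
second = yaml.load(TASK_YAML, Loader=TimingLoader)

print(len(calls))             # the constructor ran once per load
print(first["bash_command"])  # still an unrendered Jinja template string
```

Because the constructor runs on every load, an expensive or side-effecting function is usually better placed behind Jinja, where it runs only at task execution.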

5.3 Built-in Constructors

gusty

gusty contains a few built-in constructors, primarily to make creating a DAG using METADATA.yml easy. The three built-in constructors are datetime, timedelta, and days_ago; the last simply provides a datetime object for as many days ago as you specify.
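
As a rough illustration of what a days_ago-style function computes, here is a standard-library sketch. The function days_ago_sketch is a hypothetical stand-in, not gusty's implementation, which may differ in details such as timezone handling or truncation to midnight.

```python
from datetime import datetime, timedelta, timezone


def days_ago_sketch(n):
    """Hypothetical stand-in: a timezone-aware datetime `n` days before now.

    Assumption: gusty's real days_ago may behave differently, for example
    by truncating to midnight or by using a different timezone.
    """
    return datetime.now(timezone.utc) - timedelta(days=n)


start_date = days_ago_sketch(1)
print(start_date.isoformat())
```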

ABSQL

The YAML loading functionality for gusty is maintained in a separate, lightweight project called ABSQL.

The ABSQL package ships with a handful of default functions, which are also available to you as both YAML constructors and macros within gusty DAGs.