How to remove default example dags in airflow
Python Problem Overview
I am a new user of Airbnb's open source workflow/data pipeline software Airflow. There are dozens of default example DAGs after the web UI is started. I tried many ways to remove these DAGs, but I've failed to do so.
- load_examples = False is set in airflow.cfg.
- Folder lib/python2.7/site-packages/airflow/example_dags is removed.
The states of those example DAGs changed to gray after I removed the dags folder, but the items still occupy the web UI screen. A new DAG folder is specified in airflow.cfg as dags_folder = /mnt/dag/1. I checked this DAG folder and nothing is there. It's really weird to me why it is so difficult to remove these examples.
Python Solutions
Solution 1 - Python
When you start up Airflow, make sure you set:
load_examples = False
inside your airflow.cfg.
If you have already started Airflow without this set to False, you can set it to False and run airflow resetdb in the CLI (warning: this will destroy all current DAG information!).
Alternatively, you can go into the airflow_db and manually delete those entries from the dag table.
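To confirm what the file currently says before restarting anything, a minimal Python sketch along these lines may help. It assumes airflow.cfg is a standard INI file with a [core] section; the helper name load_examples_enabled is hypothetical and not part of Airflow. Keep in mind that an AIRFLOW__CORE__LOAD_EXAMPLES environment variable, if set, takes precedence over the file.

```python
import configparser


def load_examples_enabled(cfg_text):
    """Return True if [core] load_examples is truthy in the given airflow.cfg text.

    Hypothetical helper for illustration; not part of Airflow itself.
    """
    # interpolation=None because airflow.cfg values may contain '%' characters
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # Airflow's default is True, so fall back to True when the key is absent
    return parser.getboolean("core", "load_examples", fallback=True)


# Sample text standing in for the contents of a real airflow.cfg
sample = """
[core]
dags_folder = /mnt/dag/1
load_examples = False
"""
print(load_examples_enabled(sample))
```

To check a real installation, pass the contents of your own airflow.cfg (for example, read with open(path).read()) instead of the sample string.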
Solution 2 - Python
For Airflow 2.0, in docker-compose.yaml you can set AIRFLOW__CORE__LOAD_EXAMPLES: 'false' to not load them, instead of editing the .cfg file.
Solution 3 - Python
Like others have said, you can set load_examples = False within airflow.cfg. However, this requires that the cfg file already exists.
You can init the Airflow DB without having to configure the cfg file by using environment variables:
export AIRFLOW__CORE__LOAD_EXAMPLES=False
airflow initdb
See docs for more information.
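The variable name above follows Airflow's AIRFLOW__{SECTION}__{KEY} naming pattern for overriding airflow.cfg options. As a rough sketch, the mapping can be expressed in Python as below; the helper airflow_env_name is a hypothetical name used only for illustration.

```python
import os


def airflow_env_name(section, key):
    """Build the environment variable name that overrides [section] key in airflow.cfg.

    Hypothetical helper; it only formats the name and does not talk to Airflow.
    """
    return "AIRFLOW__{}__{}".format(section.upper(), key.upper())


name = airflow_env_name("core", "load_examples")
# Setting it here only affects this Python process and any child processes it starts
os.environ[name] = "False"
print(name)
```

Setting the variable from Python like this is only useful if Airflow is launched as a child of the same process; for a shell session, the export line shown in the answer is the simpler route.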
Solution 4 - Python
Before you start Airflow, make sure you set the load_examples variable to False in the airflow.cfg file. By default it is set to True.
load_examples = False
If you have already started Airflow, you have to manually delete the example DAGs from the Airflow UI. Click the delete icon on the right side of each DAG to delete it.
Instead of manually deleting the example DAGs, you can reset your database with the airflow resetdb command, but that will delete your connections, variables, and other important information. Do not use airflow resetdb in production.
Solution 5 - Python
Easy way:
- Put load_examples = False into the airflow.cfg file.
- Then close and restart the webserver and scheduler.
Solution 6 - Python
When starting up Airflow, make sure to set load_examples = False in the airflow.cfg file. Then close and restart the webserver and scheduler.
Solution 7 - Python
TL;DR: check that you have only DAG files in your dags_folder -- Airflow will traverse this directory recursively and try to load all .py files.
I lost some time debugging similar behaviour in Airflow: even though load_examples = False was set, Airflow was still loading tons of unnecessary stuff, including example_dags. The problem was that I had a virtualenv directory venv/ in dags_folder, and -- I was not expecting that -- Airflow searches for DAGs recursively in the dags dir. So it was loading example_dags from the apache-airflow installed in that virtualenv.
UPD: there's a .airflowignore file to ignore directories from dags_folder.
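To spot this situation, a small Python sketch like the following can list directories under dags_folder that look like virtualenvs or installed packages. It assumes a virtualenv can be recognised by a pyvenv.cfg file at its root (which python -m venv creates) or by a site-packages directory; the function name find_suspect_dirs is hypothetical.

```python
import os


def find_suspect_dirs(dags_folder):
    """Return directories under dags_folder that look like virtualenvs or installed packages.

    Hypothetical helper: a directory is flagged if it contains pyvenv.cfg
    or is itself named site-packages.
    """
    suspects = []
    for root, dirs, files in os.walk(dags_folder):
        if "pyvenv.cfg" in files or os.path.basename(root) == "site-packages":
            suspects.append(root)
            dirs[:] = []  # no need to descend into a directory already flagged
    return sorted(suspects)


# Example usage (replace the path with your own dags_folder):
# for path in find_suspect_dirs("/path/to/dags"):
#     print(path)
```

Any directory it reports is a candidate for moving out of dags_folder or listing in .airflowignore.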
Solution 8 - Python
If LDAP-based authentication is turned on, then after running airflow resetdb and restarting Airflow your login window may not appear and may give an error due to a cached login ID (but no password). If this happens, clear your cache and try again. If that still doesn't work, turn off authentication, then stop and start Airflow. Then turn authentication back on and stop and start Airflow again; you will be able to see the login window and log in with your LDAP authentication. -Suresh