PyPI installation
Installation of Apache Airflow in a Python Virtual Environment
This tutorial walks through the steps to install Apache Airflow in a Python virtual environment. You can follow the written steps or watch the video, which covers the installation in more detail.
Video To Follow
Steps To Install
Update Ubuntu/Debian
sudo apt update -y && sudo apt upgrade -y

Create an app directory for Airflow and its dependencies
sudo mkdir /app
sudo chown user /app
cd /app

Create the Python environment
mkdir airflow
cd airflow
python3 -m venv airflow_env
. airflow_env/bin/activate
pip install --upgrade setuptools pip

Install Apache Airflow
pip install apache-airflow apache-airflow-providers-google
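The Airflow documentation also recommends installing with a constraints file so pip resolves the dependency versions the release was tested with. A rough sketch, assuming Airflow 2.9.2 on Python 3.10 (both versions here are assumptions, adjust them to your setup):

# versions below are assumptions; pick the ones you actually want
AIRFLOW_VERSION=2.9.2
PYTHON_VERSION=3.10
pip install "apache-airflow==${AIRFLOW_VERSION}" apache-airflow-providers-google \
    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"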
Set up the environment variable

export AIRFLOW_HOME=/app/airflow
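Note that this export only lasts for the current shell session. If you want AIRFLOW_HOME available in new sessions as well (not just inside the run scripts created later), you could append it to your shell profile, assuming bash:

# optional: persist AIRFLOW_HOME for future shells
echo 'export AIRFLOW_HOME=/app/airflow' >> ~/.bashrc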
Create the Airflow configuration file

airflow config list --defaults > "${AIRFLOW_HOME}/airflow.cfg"

Adjust the configuration file if needed, for example to point the Airflow metadata database at something other than SQLite. If you use another database, you can also change the executor to LocalExecutor in the config file; the entry is commented out by default, so uncomment it and set LocalExecutor, since the default SequentialExecutor does not support parallelism because of SQLite. For PostgreSQL, set

sql_alchemy_conn = postgresql://postgres:<password>@<host>:54321/airflow_meta?sslmode=require
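For reference, the relevant airflow.cfg entries would end up looking roughly like this (a minimal sketch; in recent Airflow 2.x releases the executor lives under [core] and the connection string under [database], and the password, host, and port here are placeholders):

[core]
# uncomment and change from the default SequentialExecutor
executor = LocalExecutor

[database]
sql_alchemy_conn = postgresql://postgres:<password>@<host>:54321/airflow_meta?sslmode=require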
Run the following to set up Airflow (you will be prompted to set a password for the admin user)

airflow db migrate
airflow users create \
    --username admin \
    --firstname Shantanu \
    --lastname Khond \
    --role Admin \
    --email <email>
airflow webserver --port 8080

Your Airflow webserver should now be running and accessible on port 8080. Let's create services to run both the webserver and the scheduler (for executors, I will write another page).
Let's create the Airflow webserver and scheduler run scripts
Create Airflow webserver run script using
nano run_airflow.sh

Add the following code to it

#!/bin/bash
. airflow_env/bin/activate
export AIRFLOW_HOME=/app/airflow/
airflow webserver

Create the Airflow scheduler run script using
nano run_scheduler.sh

Add the following code to it

#!/bin/bash
. airflow_env/bin/activate
export AIRFLOW_HOME=/app/airflow/
airflow scheduler
Add execute permission
chmod +x run_airflow.sh
chmod +x run_scheduler.sh
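Before wiring these into systemd, you can optionally sanity-check each script by hand, running it in its own terminal from /app/airflow and stopping it with Ctrl+C once it starts cleanly:

cd /app/airflow
./run_airflow.sh      # should start the webserver in the foreground
./run_scheduler.sh    # should start the scheduler in the foreground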
Finally, create the services

First create the airflow webserver service using the following command
sudo nano /etc/systemd/system/airflow-webserver.service

And add the following code to it

[Unit]
Description = Apache Airflow Webserver Daemon
After = network.target

[Service]
PIDFile = /app/airflow/airflow-webserver.PIDFile
Environment=AIRFLOW_HOME=/app/airflow
Environment=PYTHONPATH=/app/airflow
WorkingDirectory = /app/airflow
ExecStart = sh /app/airflow/run_airflow.sh
ExecStop = /bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target

Next create the airflow scheduler service using the following command
sudo nano /etc/systemd/system/airflow-scheduler.service

Add the following code to it

[Unit]
Description = Apache Airflow Scheduler Daemon
After = network.target

[Service]
PIDFile = /app/airflow/airflow-scheduler.PIDFile
Environment=AIRFLOW_HOME=/app/airflow
Environment=PYTHONPATH=/app/airflow
WorkingDirectory = /app/airflow
ExecStart = sh /app/airflow/run_scheduler.sh
ExecStop = /bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target
Finally, let's enable and start the services
sudo systemctl daemon-reload
sudo systemctl enable airflow-webserver.service
sudo systemctl enable airflow-scheduler.service
sudo systemctl start airflow-webserver.service
sudo systemctl start airflow-scheduler.service
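To confirm everything came up, check the service status and, since recent Airflow 2.x webservers expose a /health endpoint reporting metadata database and scheduler status, hit it with curl:

sudo systemctl status airflow-webserver.service
sudo systemctl status airflow-scheduler.service
curl http://localhost:8080/health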