pip install virtualenv
skip if ansible is already installed
pip install ansible
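To check whether ansible is already available in the current environment:
ansible --version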
The commands below create and source into a virtualenv (which has ansible installed):
./pre-deploy.sh [-f] [-e dev]
source venv/bin/activate
- -f will force delete the old venv and create a new one; otherwise the script raises an error if the venv already exists.
- -e is used to point at a custom PyPI server in different environments. If there's nothing blocking the download of ansible from the official PyPI server, there's no need to add this argument.
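For example, to force-recreate the venv against the dev PyPI configuration (the flag values here are only an illustration):
# recreate the venv and use the dev pypi configuration
./pre-deploy.sh -f -e dev
source venv/bin/activate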
- Copy <service>/<service>-vars/<service>-dev.yml to <service>/<service>-vars/<service>-<env>.yml, where <env> is the name of the environment you deploy the service to
- Within <service>/<service>-vars/<service>-<env>.yml (see the sketch after this list):
  - change the value of base_path; check Service Structure for the recommended base_path location
  - change the value of env to your <env> name
  - change other variables if needed
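A minimal sketch of this copy-and-edit step, assuming the service is airflow and a hypothetical target env named stg:
# copy the dev vars file as a template for the new env
cp airflow/airflow-vars/airflow-dev.yml airflow/airflow-vars/airflow-stg.yml
# then edit airflow/airflow-vars/airflow-stg.yml and set at least:
#   base_path: <recommended location, see Service Structure>
#   env: stg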
The script will download repos from GitHub, so make sure the following command works:
github (dev)
ssh -T git@github.com
If deploying airflow with the --install-from-source flag, make sure that npm is installed in your environment.
- The deployment script uses getopts, which follows gnu-getopt. To make it work on macOS, install gnu-getopt.
brew install gnu-getopt
echo 'export PATH="/usr/local/opt/gnu-getopt/bin:$PATH"' >> ~/.bash_profile
- The tar type on macOS is bsd-tar. To let ansible unarchive postgresql and redis, install gnu-tar.
If you only deploy airflow, this step is not needed.
brew install gnu-tar
echo 'export PATH="/usr/local/opt/gnu-tar/libexec/gnubin:$PATH"' >> ~/.bash_profile
Notice:
<env> is your env name
Skip to Set Database - postgresql and Set Variable - Postgresql if postgresql is already installed.
Build postgresql from source if you don't have root privileges.
Before deployment, download the postgresql source code. The location of the downloaded file is described in the service structure.
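For example, a download sketch (the version here is an assumption; match it to <pg_version> in your vars file, and place the tarball under <base_path> as shown in the service structure):
# assumption: postgresql 11.4
PG_VERSION=11.4
mkdir -p <base_path>/postgresql
curl -L -o <base_path>/postgresql/postgresql-${PG_VERSION}.tar.gz https://ftp.postgresql.org/pub/source/v${PG_VERSION}/postgresql-${PG_VERSION}.tar.gz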
./deploy.sh -s postgresql -e <env>
postgresql is started automatically after running
./deploy.sh ...; the commands in Set Database - postgresql are also executed
- start postgresql manually
./bin/pg_ctl -D ./data/ -l logfile start
- stop postgresql manually
./bin/pg_ctl -D ./data/ stop
If testing with the dev environment, it's easier to install from a binary; you just need to create the database and user and grant permissions manually.
create user and database for airflow
postgres=# create database <db>;
postgres=# create user <user> with encrypted password '<pwd>';
postgres=# grant all privileges on database <db> to <user>;
Check that backend_db in airflow/airflow-vars/airflow-<env>.yml is set to the correct location, e.g., postgresql://<user>:<pwd>@0.0.0.0:5432/<db>
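To verify that the connection string and credentials work, a quick check (same placeholders as above):
psql "postgresql://<user>:<pwd>@0.0.0.0:5432/<db>" -c '\conninfo'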
Skip to Set Variable - redis if redis is already installed.
Before deployment, download the redis release tarball. The location of the downloaded file is described in the service structure.
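For example, a download sketch (the version here is an assumption; match it to <redis_version> in your vars file, and place the tarball under <base_path> as shown in the service structure):
# assumption: redis 5.0.5
REDIS_VERSION=5.0.5
mkdir -p <base_path>/redis
curl -L -o <base_path>/redis/redis-${REDIS_VERSION}.tar.gz http://download.redis.io/releases/redis-${REDIS_VERSION}.tar.gz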
./deploy.sh -s redis -e <env>
redis server is automatically started in the last step of deployment.
- start redis server manually
src/redis-server > ../redis.log 2>&1 &
- stop redis server manually
src/redis-cli shutdown
Check if broker_url in airflow/airflow-vars/airflow-<env>.yml is set to correct location. e.g., redis://0.0.0.0:6379/0
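To verify that the broker is reachable at that address:
src/redis-cli -h 0.0.0.0 -p 6379 ping   # should reply PONG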
Before running the command below, make sure that
- the backend database (postgresql) is running
- ansible is installed (globally or within a virtualenv)
./deploy.sh -s airflow -e <env> [--keep-db] [--keep-venv] [--install-from-source]
- --keep-db modifies sql_alchemy_conn in airflow.cfg to change the backend db from sqlite (default) to postgresql without resetting the db; data such as connections, variables, and pools will not be deleted. Do not use this argument if the backend database is still empty or airflow.cfg does not exist. (See the example invocation after this list.)
- --keep-venv speeds up the deployment process by not removing the existing venv. However, if there are new versions of libraries, don't use this argument since it may not upgrade those libraries.
- --install-from-source clones the repository from airflow_repo and builds it instead of trying to download from the PyPI server. It forces a reinstall of apache-airflow when used with --keep-venv.
- Airflow deployment includes writing a config file to var/airflow-deployment-conf.sh, which the scripts read to control the services.
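For example, redeploying into an existing environment while keeping the backend data and the venv (the env name here is only an illustration):
./deploy.sh -s airflow -e stg --keep-db --keep-venv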
Use -h to show the usage message.
- Control airflow from scripts; the process runs in the background
# start | stop | restart | status
scripts/airflow-webserver.sh [start|stop|restart|status]
scripts/airflow-scheduler.sh [start|stop|restart|status]
scripts/airflow-worker.sh [start|stop|restart|status]
scripts/airflow-flower.sh [start|stop|restart|status]
- [Debug] Start services manually; the process runs in the foreground
Note: the <airflow_venv> path is different from the venv created by pre-deploy.sh
# find the airflow venv created by `deploy.sh`
# check `airflow_venv` in airflow/airflow-vars/airflow-<env>.yml
source <airflow_venv>/bin/activate
# [IMPORTANT!] export AIRFLOW_HOME
# check `airflow_home` in airflow/airflow-vars/airflow-<env>.yml
export AIRFLOW_HOME=<airflow_home>
# start airflow webserver
airflow webserver
# start airflow scheduler
airflow scheduler
# start airflow worker(celery)
airflow worker
# start airflow flower
airflow flower
- airflow/postgresql/redis service structure
<base_path>
└───airflow-app <- path can be set by `airflow_home` variable
│ │ airflow.cfg
│ │ unittests.cfg
│ │
│ └───dags
│ └───airflow-maintainence-dags
│ └───adw
│ └───tags
│ └───...
│ │
│ └───plugins
│ └───event_plugins
│ └───...
│ │
└───logs <- path can be set by `airflow_log` variable
│ └───dags
│ └───webserver
│ └───scheduler
│ └───worker
│ └───flower
│
└───venv <-- path can be set by `airflow_venv` variable
└───postgresql
│ │ postgresql-<pg_version>.tar.gz <-- prepare this file first
│ └───postgresql-<pg_version>
│ └───pgsql
└───redis
│ │ redis-<redis_version>.tar.gz <-- prepare this file first
│ └───redis-<redis_version>
│
└───airflow-operation <-- put this repo here
│ └───var
│ │ │ airflow-deployment-conf.sh <-- record the variables after airflow deployment
└───airflow-sourcecode (opt)
└───airflow-plugins
- airflow log rotation
airflow scheduler -D does not work with LocalExecutor|CeleryExecutor on osx 10.14.5
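A possible workaround sketch (an assumption, not part of the provided scripts) is to background the scheduler without the -D daemon flag:
# assumption: run the scheduler in the background instead of daemonizing with -D
nohup airflow scheduler > scheduler.out 2>&1 &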
