Extensions to the Build Process
We have provided an example docker setup to quickly build the paw. For more info on simply building and using it, see Getting Started. This document will instead focus on possible extensions to the build process that may be wanted later.
Production Environment
While our version of the deployment script is a quick way to get a development version up and running, you will likely want to have a production version at some point in the future. In order to create a proper production version, you should be aware of the following concepts:
Production ElasticSearch
In order to set up ElasticSearch quickly, we set several environment variables that make the instance insecure and inefficient. For a production version, you are best off following the official ElasticSearch docs.
Production SortingHat
The GrimoireLab stack uses a system called sortinghat for data enrichment purposes (mostly focused on identity management). At the time of writing, sortinghat has just been converted to a new service-based architecture which is largely undocumented. Some tips for creating a production version:
SortingHat uses Django under the hood. Many Django settings can be set using environment variables if you prepend the key with SORTINGHAT.
SortingHat has an internal HTTP server that runs on port 8000, which we enable by setting DEV_FLAG=1. In a production environment, you would probably want to spin up a separate docker image that runs nginx/apache/some other webserver instead. If you do, remove the DEV_FLAG environment variable and point the proxy webserver to http://sortinghat:9314. You can then remove the exposed port 8000, as it is no longer used.
Changes in Dependencies
The Python dependencies for the multi-mordred container are installed using the deployment/docker/requirements.txt file. If you want to add a new Python dependency (for example, for a custom script), you can simply add it there. If you want to use a fork of a GrimoireLab tool (for example, a fork of Perceval because new retrievers were added), you can change the line that installs that tool. Inside requirements.txt you have access to the ${GITHUB_TOKEN} and ${GITHUB_USER} environment variables in order to install from a private repository.
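As a rough illustration of what such a line could look like, the sketch below prints a requirements.txt entry for a private fork. It assumes the dependencies are installed with pip, which expands ${VAR} references in requirements files, and it uses the `name @ url` direct-reference form. The helper name fork_requirement, the organisation your-org and the branch your-branch are hypothetical placeholders, and the line currently in requirements.txt may be written differently.

```shell
# Hypothetical helper: print a requirements.txt line for a private fork.
#   $1 = package name, $2 = "org/repo" on GitHub, $3 = branch or tag
fork_requirement() {
    # The single-quoted format string keeps ${GITHUB_USER} and ${GITHUB_TOKEN}
    # literal, so the credentials are resolved at install time and are never
    # written into the file itself.
    printf '%s @ git+https://${GITHUB_USER}:${GITHUB_TOKEN}@github.com/%s.git@%s\n' \
        "$1" "$2" "$3"
}
```

Calling `fork_requirement perceval your-org/grimoirelab-perceval your-branch` prints a line that could replace the existing Perceval entry in deployment/docker/requirements.txt.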
Updating MultiMordred
MultiMordred is different in that it is not a tool or library, but is instead the script that the docker image runs. During building, the script is retrieved from GitHub and downloaded into the container, which then runs it. If you want to update the multimordred script, you will have to rebuild the multimordred container. Generally, this can be achieved using docker-compose build --no-cache. However, sometimes Docker decides to cache the files it retrieves from GitHub, even if you use the --no-cache flag. You can be sure that it will grab the latest version by first running docker system prune, which removes stopped containers, unused networks, and dangling images on your system and also flushes the Docker build cache.
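The rebuild sequence described above can be sketched as a small shell function. The function name rebuild_multimordred is a hypothetical choice, and the sketch assumes it is run from the directory that contains the compose file. Note that docker system prune asks for confirmation and affects every Docker project on the machine, not just this one.

```shell
# Hypothetical helper: force a clean rebuild of the images.
rebuild_multimordred() {
    # Clear stopped containers, unused networks, dangling images and the
    # build cache, so stale downloads cannot be reused. Stop if this fails
    # or the confirmation prompt is declined with an error.
    docker system prune || return 1

    # Rebuild the images in the compose file without using cached layers.
    docker-compose build --no-cache
}
```

After sourcing the snippet, run `rebuild_multimordred` and then start the stack again as usual.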