Using Custom scripts
One of the key features that the PAW introduces, and that is absent from Grimoirelab, is the ability to run custom code as part of your data-enrichment pipeline. This section discusses how to use this feature. For detailed technical information, see Multi-Mordred and Manually re-making the Custom Script Extension.
Running the Code
To keep this version of Mordred backwards-compatible with the original, all of the following settings are optional. Any feature whose setting is absent is presumed to be disabled.
Configuring the custom script task
Add the following section to your config.cfg:
[custom_script]
scriptsource = test.py
scriptargs = [suc,cess]
scriptsource denotes which script is to be run. Scripts must be placed in the dedicated volume so that the container can find them. When running mordred non-containerized, the value has to be the path relative to the current working directory.
script arguments
The optional key scriptargs denotes a list of arguments to be given to the script. These are passed as pre-defined string variables arg1, arg2, arg3, etc., with values corresponding to their position in the list. It is worth reiterating that these variables are already defined when the script starts running. While deliberately modeled to feel similar to the behavior of some older languages, this does not make use of traditional command-line arguments.
For example, the script test.py:
print("I have been run " + arg1 + arg2)
will print “I have been run success” to the console between the enrichment and panels phases, provided the examples here are followed verbatim.
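How the custom script task makes arg1, arg2, etc. available is not described in this section. As a rough illustration only, one way to get this behavior in Python is to execute the script text with a pre-populated namespace. The helper name run_with_args below is hypothetical and is not part of Mordred; the script text stands in for the contents of test.py.

```python
import contextlib
import io


def run_with_args(script_text, script_args):
    """Run script_text with arg1, arg2, ... pre-defined as string variables.

    Hypothetical helper for illustration; not the actual Mordred implementation.
    """
    # Build the namespace the script will see: arg1 -> first list entry, etc.
    namespace = {
        f"arg{position}": str(value)
        for position, value in enumerate(script_args, start=1)
    }
    exec(script_text, namespace)


# Stand-in for the contents of test.py from the example above.
script_text = 'print("I have been run " + arg1 + arg2)'

# Capture what the script prints so it can be inspected afterwards.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    run_with_args(script_text, ["suc", "cess"])
output = buffer.getvalue()
print(output, end="")
```

With the arguments from the configuration above, the script sees arg1 as "suc" and arg2 as "cess", which is why no sys.argv handling is needed inside test.py.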
Enabling the custom script phase
To enable the custom script phase, add custom = true to the [phases] section of your config. Example:
[phases]
collection = true
identities = true
enrichment = true
panels = false
custom = true
When running Grimoirelab through micro-mordred, you additionally have to provide the --custom-script flag.
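To make the optional nature of these settings concrete, the following sketch reads the two config sections shown above with Python's standard configparser module. It is an illustration under assumptions, not the way Mordred itself parses its configuration: the parse_list helper and the handling of the bracketed scriptargs value are hypothetical.

```python
import configparser

CONFIG_TEXT = """
[phases]
collection = true
identities = true
enrichment = true
panels = false
custom = true

[custom_script]
scriptsource = test.py
scriptargs = [suc,cess]
"""


def parse_list(raw):
    """Split a bracketed, comma-separated value such as "[suc,cess]" into strings.

    Hypothetical helper; the real list syntax rules may differ.
    """
    inner = raw.strip().strip("[]")
    return [item.strip() for item in inner.split(",") if item.strip()]


config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

# An absent key (or section) falls back to False, so the phase stays disabled.
custom_enabled = config.getboolean("phases", "custom", fallback=False)

# Both keys are read with fallbacks because the whole section is optional.
script_source = config.get("custom_script", "scriptsource", fallback=None)
script_args = parse_list(config.get("custom_script", "scriptargs", fallback="[]"))

print(custom_enabled, script_source, script_args)
```

Reading every setting with a fallback means a configuration file written for the original Mordred, which has neither the custom key nor the [custom_script] section, still loads without errors.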
On running more complex scripts
Because only limited data can be passed from a running Python program to a newly interpreted script, we recommend supplying generic modules for a given custom task to the multi-mordred container separately. The script run by the customScript task then only has to import the module, decipher the script arguments, and call the actual enrichment routines.
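A thin script following this recommendation might look like the sketch below. All names in it are hypothetical: enrich stands in for a function that would normally be imported from a separately installed module, and the argument meanings (an index name and a numeric threshold) are invented for illustration. The sketch also assumes that arg1, arg2, etc. are visible as module-level globals of the script.

```python
# In a real setup this function would live in a separately installed module
# and be imported here; it is defined inline only to keep the sketch runnable.
def enrich(index_name, threshold):
    """Placeholder for the actual enrichment logic (hypothetical)."""
    return f"enriching {index_name} with threshold {threshold}"


def decode_args(namespace):
    """Turn the injected string variables into typed parameters."""
    # Hypothetical defaults keep the script runnable outside of Mordred.
    index_name = namespace.get("arg1", "git_enriched")
    # All script arguments arrive as strings, so convert where needed.
    threshold = int(namespace.get("arg2", "10"))
    return index_name, threshold


# globals() contains arg1, arg2, ... if the custom script task defined them.
index_name, threshold = decode_args(globals())
result = enrich(index_name, threshold)
print(result)
```

Keeping the script this small means the generic module can be developed and tested on its own, while only the argument decoding depends on the pre-defined variables.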
ElasticWrap
To aid in creating these scripts, a wrapper library for Elasticsearch was created, which should come pre-installed in the container. For details on how to use ElasticWrap, see Using ElasticWrap.
Using other libraries
Because the code will be run within the Multi-Mordred container, any external libraries you want to use must be installed into the container first.
Simply add the required external library to deployment/docker/requirement.txt, then rebuild the multimordred image.