I assume you've already gone through the install and configure instructions. Let's initialize a bulker config file for this tutorial:
rm "bulker_config.yaml"
export BULKERCFG="bulker_config.yaml"
bulker init -c $BULKERCFG

Guessing container engine is docker.
Wrote new configuration file: bulker_config.yaml
Let's start with a few terms:
crate. A collection of containerized executables. A crate is analogous to a docker image (but it provides multiple commands by pointing to multiple images).
manifest. A manifest defines a crate. It is a list of commands and images to be included in the crate. A manifest is analogous to a Dockerfile. It could be thought of as a Cratefile.
load. Loading a manifest will create a local folder with executables for each command in the manifest. Loading a manifest is analogous to building or pulling an image.
activate. Activating a crate is what allows you to run the commands in a crate. Activating is analogous to starting a container. Any loaded crates are available to activate. Activating a crate does nothing more than prepend the crate folder to your PATH.
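To make the last definition concrete, here is a minimal sketch of what prepending a folder of executables to PATH achieves. It does not call bulker at all; the temporary folder and the `hello` executable are hypothetical stand-ins for a loaded crate.

```sh
# Hypothetical stand-in for a loaded crate: a folder holding one executable.
crate_dir="$(mktemp -d)"
printf '#!/bin/sh\necho "hello from the crate"\n' > "$crate_dir/hello"
chmod +x "$crate_dir/hello"

# Prepend the folder to PATH inside a subshell, so the change
# does not leak into your current session.
resolved="$(PATH="$crate_dir:$PATH"; command -v hello)"
output="$(PATH="$crate_dir:$PATH"; hello)"

echo "hello resolves to: $resolved"
echo "$output"
```

Because the folder comes first in PATH, the shell finds `hello` there before looking anywhere else.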
Next, type bulker list to see what crates you have available. If you haven't loaded anything, it should be empty:

Bulker config: bulker_config.yaml
Available crates:
No crates available. Use 'bulker load' to load a crate.
Let's load a demo crate. There are a few ways to load a manifest: either from a bulker registry, or directly from a file.
Using a bulker registry
Here's a manifest that describes 2 commands:
manifest:
  name: demo
  version: 1.0.0
  commands:
  - command: cowsay
    docker_image: nsheff/cowsay
    docker_command: cowsay
    docker_args: "-i"
  - command: fortune
    docker_image: nsheff/fortune
    docker_command: fortune
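As a rough illustration of how the manifest maps each command to an image, the sketch below writes the demo manifest to a temporary file and pairs every command with its image. It relies on simple line matching with awk rather than a real YAML parser, so it assumes the exact layout shown here and is not how bulker itself reads manifests.

```sh
# Write the demo manifest to a temporary file.
manifest_file="$(mktemp)"
cat > "$manifest_file" <<'EOF'
manifest:
  name: demo
  version: 1.0.0
  commands:
  - command: cowsay
    docker_image: nsheff/cowsay
    docker_command: cowsay
    docker_args: "-i"
  - command: fortune
    docker_image: nsheff/fortune
    docker_command: fortune
EOF

# Remember each command name, then print it next to the image that follows.
# This is line-oriented text matching, not a YAML parser.
pairs="$(awk '
  $1 == "-" && $2 == "command:" { cmd = $3 }
  $1 == "docker_image:"         { print cmd " -> " $2 }
' "$manifest_file")"

echo "$pairs"
```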
This manifest is located in the bulker registry, under the name bulker/demo. Here 'bulker' is the namespace (think of it as the group name) and 'demo' is the name of the crate to load. Since 'bulker' is the default namespace, you can load it like this:
bulker load demo
Bulker config: bulker_config.yaml
Got URL: http://big.databio.org/bulker/bulker/demo.yaml
Executable template: /home/nsheff/.local/lib/python3.5/site-packages/bulker/templates/docker_executable.jinja2
Loading manifest: 'demo'. Activate with 'bulker activate demo'.
Commands available: cowsay, fortune
bulker load bulker/demo:default would do the same thing. That's how you load any crate, from any namespace, from the registry.
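The relationship between the short name and the full registry path can be sketched as a small shell function. The function name `expand_registry_path` is hypothetical, and the defaults (namespace 'bulker', tag 'default') are inferred from the examples in this tutorial; bulker's own parsing may handle more cases.

```sh
# Hypothetical helper: expand a short registry path to namespace/crate:tag.
# Defaults are inferred from this tutorial: namespace "bulker", tag "default".
# Note: plain POSIX sh has no local variables, so these names are global.
expand_registry_path() {
  path="$1"
  case "$path" in
    *:*) tag="${path##*:}"; path="${path%:*}" ;;
    *)   tag="default" ;;
  esac
  case "$path" in
    */*) namespace="${path%%/*}"; crate="${path#*/}" ;;
    *)   namespace="bulker"; crate="$path" ;;
  esac
  echo "$namespace/$crate:$tag"
}

expand_registry_path demo                 # bulker/demo:default
expand_registry_path bulker/demo:default  # bulker/demo:default
expand_registry_path myspace/mycrate      # myspace/mycrate:default
```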
Loading crates from a file
You can also load any manifest by pointing to the YAML file with the -f argument:
bulker load demo -f http://big.databio.org/bulker/bulker/demo.yaml
Here, the registry path ('demo') tells bulker what you want to name this crate. You can name it whatever you want, since you're loading it directly from a file and not from the registry. For example, you can do bulker load myspace/mycrate -f /path/to/file.yaml.
Once you've loaded a crate, if you type bulker list you should see the demo crate available for activation. But first, let's point out the -b argument, which you can pass to bulker load. By default, all bulker load does is create a folder of executables. It does not actually pull or build any images. Docker will automatically pull these by default as soon as you use them, which is nice, but you might rather just grab them all now instead of waiting for that. In this case, just pass -b to your bulker load command:
bulker load demo -b -r
Bulker config: bulker_config.yaml
Got URL: http://big.databio.org/bulker/bulker/demo.yaml
Executable template: /home/nsheff/.local/lib/python3.5/site-packages/bulker/templates/docker_executable.jinja2
Building images with template: /home/nsheff/.local/lib/python3.5/site-packages/bulker/templates/docker_build.jinja2
Removing all executables in: /home/nsheff/bulker_crates/bulker/demo/default
Using default tag: latest
latest: Pulling from nsheff/cowsay
Digest: sha256:14fa1f533678750afd09536872e068e732ae4f735c52473450495d5af760c2e3
Status: Image is up to date for nsheff/cowsay:latest
Container available at: /home/nsheff/simages/nsheff/cowsay
Using default tag: latest
latest: Pulling from nsheff/fortune
Digest: sha256:a980b4b333a8b89acf4c2fe90dde5da93898ab574a6d2e88152398724667957b
Status: Image is up to date for nsheff/fortune:latest
Container available at: /home/nsheff/simages/nsheff/fortune
Loading manifest: 'demo'. Activate with 'bulker activate demo'.
Commands available: cowsay, fortune

Now, bulker will instruct docker (or singularity) to pull all the images required for all the executables in this crate. (The -r just forces an overwrite without prompting.) Now we can see it in our available local crates:

Bulker config: bulker_config.yaml
Available crates:
bulker/demo:default -- /home/nsheff/bulker_crates/bulker/demo/default
Running commands using bulker crates
Once you have loaded a crate, all it means is there's a folder somewhere on your computer with a bunch of executables. You can use it like that if you like, by just running these commands directly. For example, the demo crate by default will create the following path: '$HOME/bulker_crates/bulker/demo/default/cowsay'. You can execute this by including the full path:
 _____
< boo >
 -----
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
This example demonstrates how simple and flexible bulker is under the hood. But using commands like this is cumbersome. It simplifies things if you add these commands to your PATH; you can then more easily use sets of commands as a kind of controlled computational environment. Bulker provides two ways to do this conveniently, depending on your use case: bulker activate and bulker run.
activate. This will add all commands from a given crate to your PATH and give you a terminal where you can use them. You want to use activate if you want to manage crates like namespaces that you can turn on or off. This is useful for controlling which software versions are used for which tasks, because the manifest controls the versions of software included in a crate.
run. This will run a single command in a new environment that has a crate prepended to the PATH.
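The difference in scope between the two modes can be sketched in plain shell, without bulker. In this hypothetical example, `fakecow` stands in for a crate command: the run-style call changes PATH only for one child shell, while the activate-style call (left commented out, since it opens an interactive session) would keep the crate on PATH until you exit that shell.

```sh
# Hypothetical crate folder with one stand-in executable.
crate_dir="$(mktemp -d)"
printf '#!/bin/sh\necho "moo"\n' > "$crate_dir/fakecow"
chmod +x "$crate_dir/fakecow"

path_before="$PATH"

# "run"-style: the modified PATH exists only for this one child shell,
# which executes the whole command string, pipe included.
run_output="$(PATH="$crate_dir:$PATH" sh -c 'fakecow | tr a-z A-Z')"
echo "$run_output"

# Back in the calling shell, PATH is untouched.
[ "$PATH" = "$path_before" ] && echo "PATH unchanged after run"

# "activate"-style: start a shell that keeps the crate on PATH until you exit.
# PATH="$crate_dir:$PATH" "${SHELL:-sh}"
```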
Try it out with this command:
bulker run demo "fortune | cowsay"
Bulker config: bulker_config.yaml
Activating crate: demo

 _________________________________________
/ You are deeply attached to your friends \
\ and acquaintances.                      /
 -----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
The advantage of bulker over vanilla containers
On the surface, this seems the same as running this command in a container that includes both fortune and cowsay. Indeed, the user experience is pretty similar. What separates this process from typical container use is that our command is actually not running in a container, but in the host shell, using two commands that each run in separate containers. There is no container that contains both fortune and cowsay; instead, we have individual containers for each command, and each command is wrapped in an executable. Both of these commands are in our PATH because they're both included in the crate.
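To illustrate what "wrapped in an executable" could look like, the sketch below generates one small wrapper script per command from the fields in the demo manifest. The `make_wrapper` function is hypothetical, and the generated `docker run` line is a simplified assumption; the executables bulker produces come from its own template (docker_executable.jinja2 in the output above) and will differ in detail. The example only writes and prints the wrappers, so it runs without docker installed.

```sh
# Hypothetical generator: write one wrapper script per manifest command.
# Arguments: crate folder, command name, image, command in the image, extra docker args.
make_wrapper() {
  wrapper="$1/$2"
  cat > "$wrapper" <<EOF
#!/bin/sh
# Run '$4' inside its own container and pass all arguments through.
exec docker run --rm $5 $3 $4 "\$@"
EOF
  chmod +x "$wrapper"
}

crate_dir="$(mktemp -d)"
make_wrapper "$crate_dir" cowsay  nsheff/cowsay  cowsay  "-i"
make_wrapper "$crate_dir" fortune nsheff/fortune fortune ""

# Inspect the result; nothing is executed in a container here.
ls "$crate_dir"
cat "$crate_dir/cowsay"
```

With wrappers like these on PATH, a pipeline such as fortune | cowsay is assembled by the host shell, while each side of the pipe starts its own container.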
Activating multiple crates
You can also pass a comma-separated list of crates to either run or activate, which will merge executables from the listed crates. This is not practical using vanilla containers because it would require you to build a new container that contains the software from both containers, which would eliminate the advantages of modularity and increase container bloat and disk use.
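Merging crates can be pictured as putting several crate folders on PATH at once. The sketch below turns a comma-separated crate list into a PATH prefix. The function `crate_path_prefix` is hypothetical, the folder layout mirrors the paths shown in this tutorial, and the ordering is an assumption: the tutorial does not say which crate wins if two crates provide the same command name.

```sh
# Hypothetical layout mirroring the tutorial: <root>/<namespace>/<crate>/<tag>
crates_root="$(mktemp -d)"
for crate in pi demo; do
  mkdir -p "$crates_root/bulker/$crate/default"
done

# Turn a comma-separated crate list into a colon-separated PATH prefix.
# Assumes every crate uses the default namespace and the default tag.
crate_path_prefix() {
  prefix=""
  old_ifs="$IFS"; IFS=","
  for name in $1; do
    prefix="${prefix:+$prefix:}$crates_root/bulker/$name/default"
  done
  IFS="$old_ifs"
  echo "$prefix"
}

merged="$(crate_path_prefix pi,demo)"
echo "$merged"
```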
As an example, let's load another demo crate that adds a new command, pi, which prints out pi to many digits. We can get our cow to quote these digits of pi by activating both of these crates.
bulker load pi -b -r
Bulker config: bulker_config.yaml
Got URL: http://big.databio.org/bulker/bulker/pi.yaml
Executable template: /home/nsheff/.local/lib/python3.5/site-packages/bulker/templates/docker_executable.jinja2
Building images with template: /home/nsheff/.local/lib/python3.5/site-packages/bulker/templates/docker_build.jinja2
Using default tag: latest
latest: Pulling from nsheff/pi
Digest: sha256:6187416a85719fb42bcd4e4c62ffce3b5757c2d17813090cadbd9f4eeb9c9425
Status: Image is up to date for nsheff/pi:latest
Container available at: /home/nsheff/simages/nsheff/pi
Loading manifest: 'pi'. Activate with 'bulker activate pi'.
Commands available: pi
Now try running a command that requires commands from two different crates:
bulker run pi,demo "pi | cowsay"
Bulker config: bulker_config.yaml
Activating crate: pi,demo

 _________________________________________
/ 3.1415926535897932384626433832795028841 \
| 971693993751058209749445923078164062862 |
\ 08998628034825342117067                 /
 -----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
You can get the same result using activate:
bulker activate pi,demo
pi | cowsay
Just to make sure you realize what's happening here and why this is so cool: this is not a command running in a single container. In fact, the command itself is running in the host shell, and the pipe (|) is handled by the host shell. The two executables, pi and cowsay, are each being run within their own modular containers that do only one thing. And each of these commands is located in a different crate; the two crates are activated simultaneously.
That's basically it. If you're a workflow developer, all you need to do is write your own manifest and distribute it with your workflow; in 3 lines of code, users will be able to run your workflow using modular containers, using the container engine of their choice.