I assume you've already gone through the install and configure instructions. Let's initialize a bulker config file for this tutorial:
rm "bulker_config.yaml" export BULKERCFG="bulker_config.yaml" bulker init -c $BULKERCFG
rm: cannot remove 'bulker_config.yaml': No such file or directory Guessing container engine is docker. Wrote new configuration file: bulker_config.yaml
Let's start with a few terms:
crate. A collection of containerized executables. A crate is analogous to a docker image (but it provides multiple commands by pointing to multiple images).
manifest. A manifest defines a crate. It is a list of commands and images to be included in the crate. A manifest is analogous to a Dockerfile. It could be thought of as a Cratefile.
load. Loading a manifest will create a local folder with executables for each command in the manifest. Loading a manifest is analogous to building or pulling an image.
activate. Activating a crate is what allows you to run the commands in a crate. Activating is analogous to starting a container. Any loaded crates are available to activate. Activating a crate does nothing more than prepend the crate folder to your
The executables are created upon
load, and merely added to your PATH upon
activate. Therefore, if you adjust settings in your bulker config (such as VOLUMES or ENVVARS), you'll need to re-load the manifests to update the actual bulker containerized executables. As of version
0.7.0, you can do this easily with
bulker reload upon a config file change.
I assume you've followed the instructions to install and configure bulker. Next, type
bulker list to see what crates you have available. If you've not loaded anything, it should be empty:
Bulker config: /home/nsheff/code/bulker/docs_jupyter/bulker_config.yaml Available crates: No crates available. Use 'bulker load' to load a crate.
Let's load a demo crate. There are a few ways to load a manifest: either from a bulker registry, or directly from a file.
Using a bulker registry
Here's a manifest that describes 2 commands:
manifest: name: demo version: 1.0.0 commands: - command: cowsay docker_image: nsheff/cowsay docker_command: cowsay docker_args: "-i" - command: fortune docker_image: nsheff/fortune docker_command: fortune
This manifest is located in the bulker registry, under the name
bulker/demo. Here 'bulker' is the namespace (think of it as the group name) and 'demo' is the name of the crate to load. Since 'bulker' is the default namespace, you can load it like this:
bulker load demo
Bulker config: /home/nsheff/code/bulker/docs_jupyter/bulker_config.yaml Loading manifest: 'bulker/demo:default'. Activate with 'bulker activate bulker/demo:default'. Commands available: cowsay, fortune
bulker load bulker/demo:default would do the same thing. That's how you load any crate, from any namespace, from the registry.
Loading crates from a file
You can also load any manifest by pointing to the yaml file with the
bulker load demo -f http://big.databio.org/bulker/bulker/demo.yaml
Here, the registry path ('demo') indicates to bulker what you want to name this crate. You can name it whatever you want, since you're loading it directly from a file and not from the registry...so you can do
bulker load myspace/mycrate -f /path/to/file.yaml.
Once you've loaded a crate, if you type
bulker list you should see the
demo crate available for activation. But first, let's point out the
-b argument, which you can pass to
bulker load. By default, all
bulker load does is create a folder of executables. It does not actually pull or build any images. Docker will automatically pull these by default as soon as you use them, which is nice, but you might rather just grab them all now instead of waiting for that. In this case, just pass
-b to your
bulker load command:
bulker load demo -b -f
Bulker config: /home/nsheff/code/bulker/docs_jupyter/bulker_config.yaml Building images with template: /home/nsheff/code/bulker/docs_jupyter/templates/docker_build.jinja2 Removing all executables in: /home/nsheff/bulker_crates/bulker/demo/default Using default tag: latest latest: Pulling from nsheff/cowsay Digest: sha256:14fa1f533678750afd09536872e068e732ae4f735c52473450495d5af760c2e3 Status: Image is up to date for nsheff/cowsay:latest docker.io/nsheff/cowsay:latest Docker image available as: nsheff/cowsay Using default tag: latest latest: Pulling from nsheff/fortune Digest: sha256:a980b4b333a8b89acf4c2fe90dde5da93898ab574a6d2e88152398724667957b Status: Image is up to date for nsheff/fortune:latest docker.io/nsheff/fortune:latest Docker image available as: nsheff/fortune Loading manifest: 'bulker/demo:default'. Activate with 'bulker activate bulker/demo:default'. Commands available: cowsay, fortune
Now, bulker will instruct docker (or singularity) to pull all the images required for all the executables in this crate. (The
-f just forces an overwrite without prompting). Now we can see it in our available local crates:
Bulker config: /home/nsheff/code/bulker/docs_jupyter/bulker_config.yaml Available crates: bulker/demo:default -- /home/nsheff/bulker_crates/bulker/demo/default
Updating crates with config changes
One final point. Let's say you make a change to your configuration file, say, by adding a new volume or environment variable. Now, you'd like this change to be reflected in your activated crate. But just typing
bulker activate ... after changing the config file will not be enough, because the config changes were read and populated at the
bulker load stage.
Therefore, when you update your config file, you'll need to re-load any affected crates. Bulker makes this easy with a
bulker reload command.
Running commands using bulker crates
Once you have loaded a crate, all it means is there's a folder somewhere on your computer with a bunch of executables. You can use it like that if you like, by just running these commands directly. For example, the demo crate by default will create the following path: '$HOME/bulker_crates/bulker/demo/default/cowsay'. You can execute this by including the full path:
_____ < boo > ----- \ ^__^ \ (oo)\_______ (__)\ )\/\ ||----w | || ||
This example demonstrates how simple and flexible bulker is under the hood. But using commands like this is cumbersome. It simplifies things if you add these commands to your
PATH, plus, then you can more easily use sets of commands as a kind of controlled computational environment. Bulker provides two ways to do this conveniently, depending on your use case:
bulker activate, and
activate. This will add all commands from a given crate to your PATH and give you a terminal where you can use them. You want to use activate if you want to manage crates like namespaces that you can turn on or off. This is useful for controlling which software versions are used for which tasks, because the manifest controls the versions of software included in a crate.
run. This will run a single command in a new environment that has a crate prepended to the PATH.
Try it out with this command:
First, we'll activate the new environment. On your command line, you just need to type
bulker activate demo. In Jupyter, since bulker provides a new shell, we have to use this
eval workaround, but you can ignore the complexity if you're not using jupyter:
eval $(bulker activate demo -p -e)
Bulker config: /home/nsheff/Dropbox/env/bulker_config/zither.yaml Activating bulker crate: demo
We can now use
bulker inspect, which will show us that we've activated the
bulker/demo manifest and have
Bulker config: /home/nsheff/Dropbox/env/bulker_config/zither.yaml Crate path: /home/nsheff/bulker_crates/bulker/demo/default Bulker manifest: bulker/demo Available commands: ['cowsay', 'fortune']
Here's where the magic happens: let's pipe the output of
fortune | cowsay
________________________________________ / The true Southern watermelon is a boon \ | apart, and not to be mentioned with | | commoner things. It is chief of the | | world's luxuries, king by the grace of | | God over all the fruits of the earth. | | When one has tasted it, he knows what | | the angels eat. It was not a Southern | | watermelon that Eve took; we know it | | because she repented. | | | | -- Mark Twain, "Pudd'nhead Wilson's | \ Calendar" / ---------------------------------------- \ ^__^ \ (oo)\_______ (__)\ )\/\ ||----w | || ||
The advantage of bulker over vanilla containers
On the surface, this seems the same as running this command in a container that includes both fortune and cowsay. Indeed, the user experience is pretty similar. What separates this process from a typical container use is that our command is actually not running in a container, but in the host shell, and using two commands that each run in separate containers. There is no container that contains both
cowsay; instead, we have individual containers for each command, and then wrapped each command in an executable. Both of these commands are in our PATH because they're both included in the crate.
Activating multiple crates
You can also pass a comma-separated list of crates to either
activate, which will merge executables from two different crates. This is not practical using vanilla containers because it requires you to build a new container that contained the software from both containers, which would eliminate the advantages of modularity and increase container bloat and disk use.
As an example, let's load another demo crate that adds a new command
pi, which prints out
pi to many digits. We can get our cow to quote these pi definitions by activating both of these crates.
bulker load pi -b -f
Bulker config: /home/nsheff/code/bulker/docs_jupyter/bulker_config.yaml Building images with template: /home/nsheff/code/bulker/docs_jupyter/templates/docker_build.jinja2 Using default tag: latest latest: Pulling from nsheff/pi Digest: sha256:6187416a85719fb42bcd4e4c62ffce3b5757c2d17813090cadbd9f4eeb9c9425 Status: Image is up to date for nsheff/pi:latest docker.io/nsheff/pi:latest Docker image available as: nsheff/pi Loading manifest: 'bulker/pi:default'. Activate with 'bulker activate bulker/pi:default'. Commands available: pi
Now try running a command that requires commands from two different crates:
Again, you can use
bulker activate pi,demo if in a shell, or use this
eval code in a jupyter notebook:
eval $(bulker activate pi,demo -p -e)
Bulker config: /home/nsheff/Dropbox/env/bulker_config/zither.yaml Activating bulker crate: pi,demo
pi | cowsay
_________________________________________ / 3.1415926535897932384626433832795028841 \ | 971693993751058209749445923078164062862 | \ 08998628034825342117067 / ----------------------------------------- \ ^__^ \ (oo)\_______ (__)\ )\/\ ||----w | || ||
So, outside of jupyter you'd write:
bulker activate pi,demo pi | cowsay
Just to make sure you realize what's happening here and why this is so cool: this is not a command running in a single container. In fact, the command itself is running in the host shell, and the pipe (
|) is handled by the host shell. The two executables,
cowsay, are each being run within their own modular containers that do only one thing. And, each of these commands are located in different crates, which are activated simultaneously.
That's basically it. If you're a workflow developer, all you need to do is write your own manifest and distribute it with your workflow; in 3 lines of code, users will be able to run your workflow using modular containers, using the container engine of their choice.