Sounds good.
For folks not familiar with CWL, there’s a gentle introduction here. The idea is that one writes a .cwl file that describes how to build a command line string from a list of parameters. One then specifies this list of parameters in another JSON or YAML file like so:
example_flag: true
example_string: hello
example_int: 42
example_file:
  class: File
  path: whale.txt
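For concreteness, here is roughly what a .cwl file consuming those parameters could look like. This is only a sketch, assuming CWL v1.0; the `echo` base command and the flag prefixes are illustrative stand-ins and not tied to any real tool.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: echo            # stand-in; a real description would name the actual tool
inputs:
  example_flag:
    type: boolean            # when true, the prefix alone is added to the command line
    inputBinding:
      position: 1
      prefix: -f
  example_int:
    type: int
    inputBinding:
      position: 2
      prefix: -i
      separate: false        # prefix and value are joined, e.g. -i42
  example_string:
    type: string
    inputBinding:
      position: 3
      prefix: --example-string
  example_file:
    type: File
    inputBinding:
      position: 4
      prefix: --file=
      separate: false
outputs: []
```

With the parameter file above, a CWL runner should build something along the lines of `echo -f -i42 --example-string hello --file=<path to whale.txt>`, where the file path is resolved by the runner.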
This is a great framework for formalizing things and then running them, but there are two things that still require some decision-making.
1. As before, what do we want to have as shared parameters for the various tools?
2. Do we want to get behind some container-based framework for encapsulating tools and running them? This would save a lot of time for folks who want to run the tools, since they would not have to install complex dependencies.
For #1, in the case of VDJ annotation as in my original post, it appears to me that CWL nicely handles optional parameters:
"When the parameter type ends with a question mark ? it indicates that the parameter is optional."
So with this modification for an optional alignment, perhaps we are in good shape for #1 for VDJ alignment tools? I would, however, suggest something other than NEXUS, which is a pretty wild format itself. We also need to agree on output, and it seems like the Change-O data standard is, for the time being, the way to go.
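To make the optional-alignment idea concrete, here is a rough sketch of what the shared inputs might look like in CWL (again assuming v1.0). Everything named here is hypothetical: the `vdj-annotate` command, the `sequences` and `alignment` input names, the flags, and the output file name are placeholders for whatever we end up agreeing on.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: vdj-annotate    # hypothetical tool name
inputs:
  sequences:
    type: File               # required: the job file must supply it
    inputBinding:
      prefix: --sequences    # hypothetical flag
  alignment:
    type: File?              # trailing ? marks the input as optional
    inputBinding:
      prefix: --alignment    # hypothetical flag
outputs:
  annotations:
    type: File
    outputBinding:
      glob: annotations.tab  # hypothetical output file name
```

If the job file leaves out `alignment`, the runner simply omits that argument from the command line, so tools that need a precomputed alignment and tools that do not could share the same parameter names.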
For #2, do we have some votes for Rabix over Bioboxes? Something else? Or does it matter?