Standardizing the format of a germline set

martin_corcoran · July 23, 2016, 9:21am

Here’s a schematic regarding Ig content in different species, from a 2014 review by Rita Pettinello and Helen Dooley in ‘Biomolecules’.

Clearly the H/K/L designations will not suffice if we are aiming for an inclusive system.
Personally, I prefer a single FASTA file for each gene type (one for all IGHV, one for all IGKV, etc.), since that makes it easier to port the set to other analysis tools that let you upload custom database files.
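
To make the one-file-per-gene-type idea concrete, here is a minimal sketch of how a combined germline set could be split that way. It assumes each FASTA header starts with the allele name (e.g. IGHV1-69*01) and that the gene type is the leading run of capital letters in that name; both are assumptions for illustration, not an agreed convention, and the function names and sequences are hypothetical placeholders.

```python
import re
from collections import defaultdict

# Assumption: the gene type is the leading run of capital letters in the
# allele name, e.g. "IGHV" in "IGHV1-69*01". Names that do not fit this
# pattern are collected under "unknown".
GENE_TYPE = re.compile(r"^[A-Z]+")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_by_gene_type(text):
    """Group FASTA records by gene type, keeping their original order."""
    groups = defaultdict(list)
    for header, seq in parse_fasta(text):
        match = GENE_TYPE.match(header)
        key = match.group(0) if match else "unknown"
        groups[key].append((header, seq))
    return dict(groups)


def to_fasta(records):
    """Serialize (header, sequence) pairs back to FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


if __name__ == "__main__":
    # Made-up placeholder sequences, not real germline alleles.
    combined = ">IGHV1-69*01\nCAGGTG\n>IGKV1-5*01\nGACATC\n"
    for gene_type, records in group_by_gene_type(combined).items():
        # Writes one file per gene type, e.g. IGHV.fasta and IGKV.fasta.
        with open(f"{gene_type}.fasta", "w") as handle:
            handle.write(to_fasta(records))
```

Grouping on a name prefix works for the H/K/L loci, but a more inclusive system would probably need the gene type to be an explicit field in the header rather than something inferred from the name.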