Unified mouse Ig/TCR nomenclature

Here is an update from the mouse nomenclature subgroup (@a.collins, @ctwatson and me). The primary goal of our subgroup is to come up with a single set of nomenclature rules and formats for naming the Ig/TCR segments of all mice consistently. We consider this necessary since the data of @a.collins, @kjlj and others (see this thread) show that the Igh loci of various standard mouse strains are derived from different ancestral subspecies of Mus musculus (M. m. domesticus for BALB/c and M. m. musculus for C57BL/6). This observation is in line with Nat Genet 43:648 (2011) (go [here](http://msub.csbio.unc.edu/) to browse the data). We therefore assume that it will be neither possible nor biologically accurate to map non-B6 segments as alleles to the B6 locus (which is completely mapped in the GRCm38 assembly). Since both the Johnston and the IMGT nomenclatures are positional nomenclatures that use the B6 map as reference, they are also affected by this problem.

We further expect that many of the problems that we are now struggling with in mice will also be encountered sooner or later in other species, likely once we have long-range haplotype maps of them. Therefore we would like to come up with a nomenclature scheme that could also be applied more generally.

The three major decisions for a new nomenclature that we have identified so far, and on which we would like to get input, are:

  1. Should a new nomenclature scheme be:
    a) identical for all species (format and content, basically like IMGT)
    b) individual for each species, but always encoding the same types of information (same content, differing formats)
    c) individual for each species, without defined information
    As you know from a previous post, I have a strong preference for b), but there might be good arguments for a).

  2. Does it seem feasible to have multiple basic locus reference maps within one species (i.e. one for musculus-derived and one for domesticus-derived loci)? This might rescue a positional nomenclature component, which would otherwise make little sense.

  3. Should a new nomenclature scheme be primarily based on
    a) positional numbering
    b) serial numbering
    c) phylogenetic grouping
    Which other components would you consider necessary? Examples: IMGT is mixed phylogenetic-positional, as is Johnston; VBASE2 is purely serial.
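
To make option 1b) a bit more concrete, here is a minimal Python sketch of "same content, differing formats". Everything in it is an assumption for illustration only: the field names, the `SegmentRecord` class and both format functions are hypothetical and are not a proposal for actual names. The point is just that every species would carry the same types of information, while the rendered name may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentRecord:
    """Hypothetical set of information that every name would encode."""
    locus: str         # e.g. "IGH"
    segment_type: str  # "V", "D" or "J"
    group: int         # phylogenetic group (illustrative numbering)
    serial: int        # serial number within the group
    allele: int        # allele number


def format_species_a(rec: SegmentRecord) -> str:
    # Hypothetical compact format for one species.
    return f"{rec.locus}{rec.segment_type}{rec.group}-{rec.serial}*{rec.allele:02d}"


def format_species_b(rec: SegmentRecord) -> str:
    # Hypothetical dotted format for another species; same fields, other layout.
    return f"{rec.locus}.{rec.segment_type}.G{rec.group}.S{rec.serial}.A{rec.allele}"


rec = SegmentRecord(locus="IGH", segment_type="V", group=1, serial=5, allele=2)
print(format_species_a(rec))
print(format_species_b(rec))
```

Under option 1a) there would be only one format function for all species; under 1c) not even the fields of the record would be fixed.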

There are also a lot of minor issues, but I will save those for a later posting.