In an online discussion, @w.lees suggested that it would be worth enumerating still-unsolved challenges in repertoire sequence analysis. Following the mold of previous such posts, please make suggestions as replies and I’ll add them to the list here:
Practical issues
- Make current tools more flexible about which germline sets they accept.
Computational challenges
- Personal germline gene databases: how to infer a personal collection of germline genes from a repertoire sample. This includes adding genes that are not in germline gene databases, and restricting to the subset of germline genes present in an individual.
- Clonal family inference, a.k.a. finding clones in repertoires.
- Specialized phylogenetic inference tools for BCRs. This includes the following two settings: first, sequences from the peripheral repertoire for which we don’t have especially dense sampling of a given lineage, and second, very dense sampling of a given lineage, say extracted from a single germinal center.
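
To make the clonal family item above concrete, here is a minimal sketch of one commonly used baseline heuristic: bin sequences by V gene, J gene, and CDR3 length, then single-linkage cluster within each bin on CDR3 Hamming distance. This is not a method proposed in this post, and it does not solve the problem; it is a simple point of comparison. The record keys (`id`, `v_gene`, `j_gene`, `cdr3`), the function names, and the threshold value are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_clones(records, threshold=0.15):
    """Group records into putative clonal families (baseline heuristic).

    records: list of dicts with hypothetical keys
             'id', 'v_gene', 'j_gene', 'cdr3'.
    threshold: maximum fraction of mismatching CDR3 positions for two
               sequences to be linked (an assumed, untuned value).
    Returns a list of sorted id lists, one per putative clone.
    """
    # Step 1: partition by V gene, J gene, and CDR3 length.
    bins = defaultdict(list)
    for rec in records:
        key = (rec["v_gene"], rec["j_gene"], len(rec["cdr3"]))
        bins[key].append(rec)

    clones = []
    for (_, _, length), members in bins.items():
        # Step 2: single-linkage clustering via union-find.
        parent = list(range(len(members)))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]  # path compression
                i = parent[i]
            return i

        for i, j in combinations(range(len(members)), 2):
            dist = hamming(members[i]["cdr3"], members[j]["cdr3"])
            # Compare against threshold * length to avoid dividing by zero.
            if dist <= threshold * length:
                parent[find(i)] = find(j)

        # Step 3: collect ids by their union-find root.
        groups = defaultdict(list)
        for i, rec in enumerate(members):
            groups[find(i)].append(rec["id"])
        clones.extend(sorted(ids) for ids in groups.values())
    return clones


# Toy input, invented purely for illustration.
example = [
    {"id": "a", "v_gene": "V1", "j_gene": "J1", "cdr3": "CARDYW"},
    {"id": "b", "v_gene": "V1", "j_gene": "J1", "cdr3": "CARDFW"},
    {"id": "c", "v_gene": "V1", "j_gene": "J1", "cdr3": "CTTTTW"},
    {"id": "d", "v_gene": "V2", "j_gene": "J1", "cdr3": "CARDYW"},
]
clones = cluster_clones(example, threshold=0.2)
```

The limitations of this kind of heuristic are part of why the problem is listed here: it depends on correct V and J assignments (and so on the germline set), it cannot link sequences whose CDR3 lengths differ, and a single fixed threshold will both merge unrelated sequences and split real families.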