We want to find out whether one CDR3 is mutated or derived from another CDR3 in our own data pool. Is it possible to do that?
Or can I search for one CDR3 sequence in our data pool using the Levenshtein distance (the edit distance between two sequences, i.e. the minimum number of insertions, deletions or substitutions needed to turn one into the other)?
check F.15.2. Levenshtein in
I did this in postgres directly.
As for whether similar CDR3s are mutated or derived from each other, I am not sure.
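For anyone who wants to try the same kind of search outside of postgres, here is a minimal Python sketch. The function names (levenshtein, search_pool) and the example sequences are made up for illustration and are not taken from postgres, tcR or VDJTools. It only reports which sequences in a pool are within a given edit distance of a query; it says nothing about whether one was derived from the other.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn sequence a into sequence b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def search_pool(query, pool, max_distance=1):
    """Return (sequence, distance) pairs from pool within max_distance of query."""
    hits = []
    for seq in pool:
        distance = levenshtein(query, seq)
        if distance <= max_distance:
            hits.append((seq, distance))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Hypothetical CDR3 amino acid sequences, for illustration only.
pool = ["CASSLGTDTQYF", "CASSLGADTQYF", "CASSLGDTQYF", "CASSPGQGNEQFF"]
hits = search_pool("CASSLGTDTQYF", pool, max_distance=1)
print(hits)
```

This compares the query against every sequence in the pool, which is fine for a quick check but may be slow on a very large data pool.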
You can try software from our team: tcR R-package (https://cran.r-project.org/web/packages/tcR/index.html) or VDJTools (http://vdjtools-doc.readthedocs.io/en/latest/).
Thanks. Could VDJTools do this? Which commands do you mean?
I am checking with tcR now. I have not found how to do that with tcR either, but this software is useful.
In tcR, vis.gene.usage(twb, HUMAN_TRBJ, .main = 'twb J-usage dodge', .dodge = T) is based on “Read.count”. Can I use “Read.proportion” to plot this figure instead?
Try the .quant option, like this: vis.gene.usage(twb, HUMAN_TRBJ, .main = 'twb J-usage dodge', .dodge = T, .quant = 'read.prop')
The full list of parameters for vis.gene.usage is described under the main function, geneUsage.
Got it! That works, thanks.
Actually, for this particular task (searching for clonotypes by Hamming/Levenshtein distance) I’m using tcR.
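To make the difference between the two distances concrete, here is a small Python sketch of a Hamming search. This is not tcR code; the names hamming and hamming_search and the sequences are hypothetical. Hamming distance counts only substitutions and is defined only for sequences of equal length, so a CDR3 that differs by a single insertion or deletion is found by a Levenshtein search but not by this one.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming_search(query, pool, max_mismatches=1):
    """Return sequences of the same length as query with few enough mismatches."""
    return [seq for seq in pool
            if len(seq) == len(query) and hamming(query, seq) <= max_mismatches]


# Hypothetical CDR3 sequences; the third one is one residue shorter.
pool = ["CASSLGTDTQYF", "CASSLGADTQYF", "CASSLGDTQYF"]
matches = hamming_search("CASSLGTDTQYF", pool)
print(matches)  # the shorter CDR3 is skipped: an indel is not a Hamming mismatch
```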
As for VDJTools, it seems I was wrong. I just briefly went through its parameters, and it looks like the tool can’t search with inexact matching. Maybe it would be better to ask Michael (mikhail.shugay) directly…
It’s also worth remembering that just because a CDR3 is very similar to another one, it doesn’t mean that one was derived from the other. Even TCR repertoires (which do not undergo somatic hypermutation, SHM) will display vast networks of CDR3s which differ from each other by a Levenshtein distance of 1, presumably reflecting the biases inherent in their generation and selection.
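The kind of network described here can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the helper names (one_edit_apart, build_network, clusters) and the sequences are hypothetical, and every pair of sequences is compared, which is only practical for small sets. An edge means "Levenshtein distance of 1" and nothing more; it is not evidence that one CDR3 was derived from the other.

```python
from itertools import combinations


def one_edit_apart(a: str, b: str) -> bool:
    """True if a and b differ by exactly one substitution, insertion or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) sequence
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # single substitution at position i
    return a[i:] == b[i + 1:]          # single extra residue in b at position i


def build_network(cdr3s):
    """Map each unique CDR3 to the set of CDR3s one edit away from it."""
    unique = sorted(set(cdr3s))
    neighbours = {seq: set() for seq in unique}
    for a, b in combinations(unique, 2):
        if one_edit_apart(a, b):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def clusters(neighbours):
    """Connected components of the network, each as a sorted list."""
    seen, result = set(), []
    for start in neighbours:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Hypothetical CDR3 sequences, for illustration only.
cdr3s = ["CASSLGTDTQYF", "CASSLGADTQYF", "CASSLGDTQYF", "CASSPGQGNEQFF"]
network = build_network(cdr3s)
print(clusters(network))
```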
You are right. I think there is still no clear answer on how to judge mutations, that is, how to determine whether a TCR is a mutation or a derivation of another one.