Productive status with indels in FW4


#1

I’ve noticed that with IgBLAST/Change-O, I’m getting sequences annotated as functional and in-frame that do not produce functional antibodies.

These sequences have indels in the J region FW4, so that while the V and the start of the J are in-frame, the bulk of the J is not in-frame, and the constant region will not be in-frame either.

I’m classifying these sequences as non-productive for the purposes of “If you add a constant region vector on the end of the recovered VDJ sequence, does it make a protein?”
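To make that criterion concrete, here is a rough sketch of the check I have in mind. It assumes you already have the J segment as a pairwise alignment of query against germline, with "-" marking gaps in either string; the function names are made up for illustration and are not part of IgBLAST or Change-O. It only tests whether the net indel length keeps the reading frame, so stop codons would still need a separate check.

```python
def net_frameshift(query_aln: str, germline_aln: str) -> int:
    """Return the net indel length modulo 3 for a gapped pairwise alignment.

    A gap in the germline string is an insertion in the query;
    a gap in the query string is a deletion.
    """
    if len(query_aln) != len(germline_aln):
        raise ValueError("aligned strings must have the same length")
    pairs = list(zip(query_aln, germline_aln))
    insertions = sum(1 for q, g in pairs if g == "-" and q != "-")
    deletions = sum(1 for q, g in pairs if q == "-" and g != "-")
    return (insertions - deletions) % 3


def j_preserves_frame(query_aln: str, germline_aln: str) -> bool:
    """True if indels in the aligned J leave the downstream frame intact."""
    return net_frameshift(query_aln, germline_aln) == 0


# Hypothetical FW4 fragment: a single-base deletion shifts the frame.
print(j_preserves_frame("TGGGG-CAAGGG", "TGGGGCCAAGGG"))
```

A 3-base indel would pass this check, which seems right for the "does it make a protein" question, since the constant region would still be read in frame.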

I realise this is subtly different from the “in-frame junction and no stop codons” definition used by IgBLAST/Change-O. How are other people dealing with indels in antibody sequences for classification?


#2

Interesting… Does the INDELS column correctly denote the presence of an indel? And is the indel corrected in the SEQUENCE_IMGT column? If so, you may be able to filter for INDELS=T rows and then use the corrected sequence for those rows to express a productive antibody.
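Something along these lines might work for the filtering step. This is only a sketch: it assumes a tab-delimited Change-O file in which INDELS holds "T"/"F" and that a SEQUENCE_ID column is present, and the helper name is hypothetical.

```python
import csv
import io


def rows_with_indels(handle):
    """Return the rows whose INDELS column is 'T' from a tab-delimited table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row.get("INDELS") == "T"]


# Tiny made-up table standing in for a real Change-O file.
example = (
    "SEQUENCE_ID\tINDELS\tSEQUENCE_IMGT\n"
    "seq1\tT\tCAG-TGCAG\n"
    "seq2\tF\tCAGGTGCAG\n"
)
for row in rows_with_indels(io.StringIO(example)):
    print(row["SEQUENCE_ID"], row["SEQUENCE_IMGT"])
```

For a real file you would pass an open file handle instead of the `io.StringIO` object.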


#3

The INDELS column seems to flag the indel sometimes, but not always. The SEQUENCE_IMGT column does remove insertions and puts a “-” where a deletion has happened. I’ll try to pin down the cases where it’s not doing what I expect.
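As a first pass at pinning those cases down, I’m thinking of a cross-check like the sketch below. It assumes the same tab-delimited layout with SEQUENCE_ID, INDELS and SEQUENCE_IMGT columns, and the function name is just illustrative. It can only catch deletions, since insertions are removed from SEQUENCE_IMGT and leave no marker behind.

```python
import csv
import io


def unflagged_deletions(handle):
    """Return IDs of rows with a '-' in SEQUENCE_IMGT but INDELS not set to 'T'."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row["SEQUENCE_ID"]
        for row in reader
        if "-" in row.get("SEQUENCE_IMGT", "") and row.get("INDELS") != "T"
    ]


# Made-up rows: seq2 has a deletion marker but is not flagged in INDELS.
example = (
    "SEQUENCE_ID\tINDELS\tSEQUENCE_IMGT\n"
    "seq1\tT\tCAG-TGCAG\n"
    "seq2\tF\tCAG-TGCAG\n"
    "seq3\tF\tCAGGTGCAG\n"
)
print(unflagged_deletions(io.StringIO(example)))
```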

There is also some strange behaviour if the last couple of bases of the J are mutated.

I see in the changelog that Version 0.4.2 (September) changed the logic here, so I might update to the latest version and see if that helps.