Sponsored by the AIRR Community

Productive status with indels in FW4

I’ve noticed that when using IgBlast/Changeo, some sequences are annotated as functional and in-frame even though they do not produce functional antibodies.

These sequences have indels in the FW4 part of the J region, so while the V and the start of the J are in-frame, the bulk of the J is not, and the constant region will not be in-frame either.

I’m classifying these sequences as non-productive for the purposes of, “If you add a constant region vector on the end of the recovered VDJ sequence, does it make a protein?”.
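For illustration, that question can be written as a small check. This is only a rough sketch and not how IgBlast or Changeo decide productivity. It assumes the VDJ nucleotide sequence starts on the first base of a codon, and the name `makes_protein_with_constant` and the parameter `split_codon_nt` (how many leading bases of the constant region complete the codon begun at the end of the J) are made up here; check the right value for your own vector.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt):
    """Translate from the first base; trailing partial codons are ignored."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3
    # Codons containing ambiguous bases (e.g. N) are translated as X.
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


def makes_protein_with_constant(vdj, constant, split_codon_nt=2):
    """Hypothetical check: does VDJ + constant region read through in frame?

    split_codon_nt is an assumption: the number of bases at the start of
    `constant` that complete the codon left open at the end of the J.
    """
    # Drop alignment gap characters if a gapped sequence was passed in.
    vdj = vdj.replace("-", "").replace(".", "")
    in_frame = (len(vdj) + split_codon_nt) % 3 == 0
    protein = translate(vdj + constant)
    return in_frame and "*" not in protein


# Toy sequences only, not real antibody data.
print(makes_protein_with_constant("ATGGCCAAAG", "CCTCCACC"))
print(makes_protein_with_constant("ATGGCAAAG", "CCTCCACC"))  # one base deleted
```

The second toy sequence has no stop codon after the deletion, so a check based only on stop codons would pass it; it is the frame test against the constant region that catches it.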

I realise this is subtly different from the “in-frame junction and no stop codons” definition used by IgBlast/Changeo. How are other people dealing with indels when classifying antibody sequences?

Interesting… Does the INDELS column correctly denote the presence of an indel? And is the indel corrected in the SEQUENCE_IMGT column? If so, you may be able to filter for INDELS=T rows and then use the corrected sequence for those rows to express a productive antibody.
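A minimal sketch of that filter, assuming a tab-delimited Changeo database with the INDELS and SEQUENCE_IMGT columns mentioned above and T/F flag values. The helper name `rows_with_indels` and the toy rows are made up for illustration.

```python
import csv
import io


def rows_with_indels(handle):
    """Return the rows of a tab-delimited table whose INDELS flag is T."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row.get("INDELS") == "T"]


# Toy table standing in for a real database file (sequences are not real).
toy_db = io.StringIO(
    "SEQUENCE_ID\tINDELS\tSEQUENCE_IMGT\n"
    "seq1\tF\tCAGGTG...CAG\n"
    "seq2\tT\tCAGG-G...CAG\n"
)

for row in rows_with_indels(toy_db):
    print(row["SEQUENCE_ID"], row["SEQUENCE_IMGT"])
```

With a real file, the same function can be given an open file handle, for example one opened with `newline=""` as the `csv` module documentation recommends.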

The INDELS column seems to indicate it sometimes, but not always. The SEQUENCE_IMGT column does remove insertions and puts a “-” where a deletion has happened. I’ll try to pin down the cases where it’s not doing what I expect.
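One way to start pinning those cases down is to flag rows where the two columns disagree. This is a rough sketch that assumes “-” in SEQUENCE_IMGT marks a deletion, as described above, and that INDELS holds T or F; the function name `indel_flag_disagrees` is made up. It can only catch deletions, since removed insertions leave no trace in SEQUENCE_IMGT.

```python
def indel_flag_disagrees(row):
    """True when SEQUENCE_IMGT shows a deletion gap but INDELS is not T."""
    has_deletion_gap = "-" in row["SEQUENCE_IMGT"]
    flagged = row["INDELS"] == "T"
    return has_deletion_gap and not flagged


# Toy rows as dictionaries, e.g. as produced by csv.DictReader.
rows = [
    {"SEQUENCE_ID": "seq1", "INDELS": "T", "SEQUENCE_IMGT": "CAGG-G...CAG"},
    {"SEQUENCE_ID": "seq2", "INDELS": "F", "SEQUENCE_IMGT": "CAGG-G...CAG"},
    {"SEQUENCE_ID": "seq3", "INDELS": "F", "SEQUENCE_IMGT": "CAGGTG...CAG"},
]

suspect = [r["SEQUENCE_ID"] for r in rows if indel_flag_disagrees(r)]
print(suspect)
```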

There is also some strange behaviour if the last couple of bases of the J are mutated.

I see in the change logs that Version 0.4.2 (September) changed the logic here, so I might update to the latest version and see if that helps.