Each T cell receptor (TCR) gene is created without regard for which substances (antigens) the receptor can recognize. T cell selection removes developing T cells when their TCRs (i) fail to recognize major histocompatibility complexes (MHCs), which act as antigen-presenting platforms, or (ii) recognize self-antigens derived from healthy cells and tissues with high affinity. Although T cell selection has been thoroughly studied, little is known about which TCRs are retained or removed by this process. We therefore develop an approach that uses TCR gene sequencing and machine learning to identify patterns in TCR protein sequences that influence the outcome of T cell selection.
We train machine learning models to classify repaired, non-productive TCR protein sequences as occurring before T cell selection and productive TCR protein sequences as occurring after T cell selection. We then verify that the trained models classify TCRs from developing T cells as pre-selection and TCRs from mature T cells as post-selection. Our approach may open avenues for studying the relationship between T cell selection and conditions such as autoimmune diseases.
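As a rough illustration of the classification setup described above, the following minimal sketch trains a binary classifier that labels a TCR protein sequence as pre-selection (0) or post-selection (1). It is not the model used in the manuscript: it substitutes a plain logistic regression over overlapping 3-mer counts, and every name in it (`kmer_counts`, `train_logistic`, `predict_proba`) is hypothetical. The sequences are invented placeholder strings with no biological meaning, included only so the code runs.

```python
import math
from collections import Counter


def kmer_counts(seq, k=3):
    """Count the overlapping k-mers in an amino acid sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(features, weights, bias):
    """Linear score of a k-mer count vector under the current weights."""
    return bias + sum(weights.get(f, 0.0) * v for f, v in features.items())


def train_logistic(sequences, labels, k=3, epochs=200, lr=0.05):
    """Fit logistic regression on k-mer counts with per-sample gradient steps."""
    weights, bias = {}, 0.0
    featurized = [kmer_counts(s, k) for s in sequences]
    for _ in range(epochs):
        for x, y in zip(featurized, labels):
            grad = sigmoid(score(x, weights, bias)) - y
            bias -= lr * grad
            for f, v in x.items():
                weights[f] = weights.get(f, 0.0) - lr * grad * v
    return weights, bias


def predict_proba(seq, weights, bias, k=3):
    """Estimated probability that a sequence is post-selection (label 1)."""
    return sigmoid(score(kmer_counts(seq, k), weights, bias))


# Invented placeholder sequences (NOT real data).
# Label 0: stands in for repaired, non-productive sequences (pre-selection).
pre_selection = ["CASSLRWGGDTQYF", "CAWSVPHKNEKLFF",
                 "CSARDMGTYNEQFF", "CASTQPIRGHGYTF"]
# Label 1: stands in for productive sequences (post-selection).
post_selection = ["CASSYSGANVLTF", "CATSDLNQPQHF",
                  "CASGEAGTEAFF", "CSVEDWSSYEQYF"]

sequences = pre_selection + post_selection
labels = [0] * len(pre_selection) + [1] * len(post_selection)
weights, bias = train_logistic(sequences, labels)

for s in sequences:
    print(s, round(predict_proba(s, weights, bias), 3))
```

With so few toy sequences the model simply memorizes its training set, so the printed probabilities say nothing about real TCRs. In practice the verification step would score held-out sequences, for example TCRs from developing versus mature T cells, rather than the training data.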
Disclosure: The research in this manuscript is protected by a provisional patent from UT Southwestern.
Journal Link: https://www.nature.com/articles/s41435-021-00141-9
Open Access Link: https://rdcu.be/cmt4E