I cannot agree with the “isn’t needed for 99% of rep-seq applications” statement; I’d rather say “isn’t needed for 99% of current rep-seq applications”.
Assuming that one TCRB nucleotide sequence is one clone is a good approximation. On the other hand, why do we use nucleotide sequences here? It’s because different nucleotide sequences come from independent rearrangements and so are likely to be paired with different TCR alpha chains.
This is a basic diversity metric, good for counting the number of naive cells, etc. But the diversity that really matters is in the amino acid sequences. For example, you can have 10 CDR3beta nucleotide sequences encoding a single CDR3beta amino acid sequence - this is far less diverse than 10 distinct CDR3beta AA sequences.
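A minimal sketch of that difference, assuming in-frame CDR3beta nucleotide sequences and the standard genetic code. The toy sequences and the function names (`translate`, `richness`) are made up for illustration and do not come from any particular rep-seq tool:

```python
# Standard genetic code, built in TCAG order (hypothetical helper table).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def translate(nt):
    """Translate an in-frame nucleotide sequence; trailing bases are dropped."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def richness(nt_clonotypes):
    """Return (distinct nucleotide sequences, distinct amino acid sequences)."""
    nt_unique = set(nt_clonotypes)
    aa_unique = {translate(nt) for nt in nt_unique}
    return len(nt_unique), len(aa_unique)


# Toy repertoire: three synonymous variants of one CDR3 plus one real variant.
toy_repertoire = [
    "TGTGCCAGCAGTTTT",
    "TGTGCCAGCAGCTTC",
    "TGCGCTAGTTCATTT",
    "TGTGCCAGCAGTTTA",
]
nt_richness, aa_richness = richness(toy_repertoire)
print(nt_richness, aa_richness)
```

Here the nucleotide-level count overstates diversity, because the first three sequences all encode the same amino acid sequence.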
If we define diversity as the number of antigens a given repertoire can efficiently cover, things get very complex, and the amino acid sequences of both TCR alpha and TCR beta are needed. For example, CDR3 amino acid sequences that differ by a single amino acid substitution are likely to have overlapping antigen recognition profiles.
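One crude way to account for this is to group CDR3 amino acid sequences that are linked by single substitutions and count the groups instead of the sequences. The sketch below is only an illustration under that assumption: the single-linkage grouping, the function names and the example sequences are hypothetical, and real antigen specificity also depends on the paired alpha chain and the V/J segments.

```python
def differ_by_one(a, b):
    """True if two equal-length sequences differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def cluster_by_single_substitution(cdr3s):
    """Single-linkage grouping of sequences connected by one substitution."""
    seqs = sorted(set(cdr3s))
    parent = {s: s for s in seqs}

    def find(s):
        # Follow parent links to the cluster representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for i, a in enumerate(seqs):
        for b in seqs[i + 1:]:
            if differ_by_one(a, b):
                parent[find(a)] = find(b)

    clusters = {}
    for s in seqs:
        clusters.setdefault(find(s), []).append(s)
    return sorted(clusters.values())


# Toy CDR3beta amino acid sequences (made up for illustration).
toy_cdr3s = [
    "CASSLGQAYEQYF",
    "CASSLGQSYEQYF",
    "CASSPGQSYEQYF",
    "CASRDRGNTIYF",
]
clusters = cluster_by_single_substitution(toy_cdr3s)
print(len(toy_cdr3s), "sequences ->", len(clusters), "clusters")
```

Note that single linkage chains neighbors together, so the first and third sequences end up in one group even though they differ at two positions.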
There are cases when a certain TRA marks an important T-cell subset, e.g. iNKT cells. You cannot distinguish these cells by simply sequencing TRB. Of course, you can do some FACS analysis, but as iNKT cells are a rare population, contamination will become a problem.
Role of TRA and TRB in antigen recognition
The average number of recognized antigen residues is comparable for TRA and TRB:
So describing tumor-specific TCRs, etc., is added to the list.