Most of the CDR3 sequences in my TCR data sets are not reported in the literature. How can I reduce the number of unknown sequences and actually find out what they are?
Recently, many of the sequences I searched for online turned up here:
https://github.com/smangul1/TAIR
and/or in the supplementary tables of some publications.
What is the best-organized way to make sequences public so that other people can check whether the sequences in their datasets were already observed before? Are there other repositories for collecting and uploading CDR3 sequences?
In particular, how can I find out whether these sequences are considered "public", disease-associated, and so on? With VDJdb we can now check for antigen specificity, but what about collecting information on sequences with unknown antigen?
Thank you for any feedback.