I was wondering whether a dataset exists with repertoire-wide NGS data for one or more individuals with a developing and/or fully developed autoimmune disease such as rheumatoid arthritis, type 1 diabetes, or the like. Ideally it would also include sequences of some of the autoantibodies contributing to the autoimmune disease phenotype.
The idea would be to reconstruct, as far as the data allows, the antibody lineage leading to the observed autoantibody.
I could not find any such data by searching the literature, but I thought it likely that some of the forum users here would know.
However, this does not have the autoantibody sequences. We also have a study on West Nile virus infection that includes both repertoire-scale sequencing and WNV-specific antibodies:
Neutralizing antibodies against West Nile virus identified directly from human B cells by single-cell analysis and next generation sequencing
Dysregulation of B Cell Repertoire Formation in Myasthenia Gravis Patients Revealed through Deep Sequencing
This was exactly what I had in mind in terms of data. Of course, as you point out, the identity of the AChR/MuSK-specific autoantibodies is not known. But I think one could possibly fish out some relevant clonal families by using published amino acid sequences of anti-AChR/MuSK autoantibodies as bait.
Do you have the processed reads available so I can check this?
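To make the bait idea concrete, here is a minimal sketch of how it could look once processed data is in hand. It assumes each repertoire record carries a clone assignment and a junction amino acid sequence (the keys `clone_id` and `junction_aa` follow AIRR-style naming, but the records, the bait sequence, and the 0.2 mismatch threshold below are all made up for illustration). It only compares junctions of equal length by Hamming distance; an alignment-based score could be swapped in.

```python
from collections import defaultdict


def hamming_fraction(a, b):
    """Fraction of mismatched positions, or None if lengths differ."""
    if len(a) != len(b) or not a:
        return None
    return sum(x != y for x, y in zip(a, b)) / len(a)


def fish_clones(repertoire, baits, max_dist=0.2):
    """Return {clone_id: [(bait_name, junction_aa, distance), ...]}
    for every record whose junction is within max_dist of a bait."""
    hits = defaultdict(list)
    for rec in repertoire:
        for name, bait in baits.items():
            d = hamming_fraction(rec["junction_aa"], bait)
            if d is not None and d <= max_dist:
                hits[rec["clone_id"]].append((name, rec["junction_aa"], d))
    return dict(hits)


# Toy data (invented sequences, not real autoantibodies).
baits = {"bait1": "CARDGYSSGWYFDYW"}
repertoire = [
    {"clone_id": "c1", "junction_aa": "CARDGYSSGWYFDYW"},  # identical to bait
    {"clone_id": "c1", "junction_aa": "CARDGYTSGWYFDYW"},  # one mismatch
    {"clone_id": "c2", "junction_aa": "CARGGLRWFDPW"},     # different length
    {"clone_id": "c3", "junction_aa": "CAKEELLRRGWFDPW"},  # same length, distant
]

hits = fish_clones(repertoire, baits)
for clone, matches in hits.items():
    print(clone, matches)
```

Any clone that comes back from a search like this would then be a candidate family for lineage reconstruction. In practice one would probably also require matching V and J gene calls before trusting a hit.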
Sure. Raw data is available at SRA. As part of the AIRR community, we are currently working with NCBI to generate a data deposition protocol that will include both raw and processed data (spanning SRA and GenBank), but we are just starting to test this. For now, you can just send me an email request for the processed data at: steven.kleinstein@yale.edu.
We also recently published an article on T1D that might be of use (though it is mostly T cell data, there is some IgH data there).
Can you suggest ways of finding autoantibody DNA and amino acid sequences, other than the IMGT/GenBank databases? Also, what tools are there for extracting the CDR3 from amino acid sequences?
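On the CDR3 question, a rough heuristic for heavy chains is to look for the conserved cysteine at the end of FR3 (usually preceded by a Tyr/Phe pair, as in "YYC") and the conserved tryptophan of the J region's W-G-x-G motif. The sketch below does just that with a regular expression. The function name and the example sequence are hypothetical, and this is not a replacement for a proper numbering tool such as ANARCI or IMGT/DomainGapAlign, which should handle unusual sequences far better.

```python
import re

# Heavy-chain heuristic: Cys preceded by [YF][YFH], then the shortest
# stretch ending in a Trp that is followed by G-x-G (J-region motif).
_JUNCTION_RE = re.compile(r"(?<=[YF][YFH])(C[A-Z]{3,33}?W)(?=G[A-Z]G)")


def extract_junction(aa_seq):
    """Return the junction (Cys..Trp inclusive) or None if not found."""
    m = _JUNCTION_RE.search(aa_seq.upper())
    return m.group(1) if m else None


def extract_cdr3(aa_seq):
    """Return the CDR3 without the Cys/Trp anchors, or None."""
    junction = extract_junction(aa_seq)
    return junction[1:-1] if junction else None


# Illustrative VH sequence with an invented CDR3.
vh = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVSAISGSGGSTYYADSVKG"
    "RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDGYSSGWYFDYWGQGTLVTVSS"
)

print(extract_junction(vh))
print(extract_cdr3(vh))
```

Light chains would need the pattern adapted, since their J motif is F-G-x-G rather than W-G-x-G.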