Autoantibody NGS sequences

I was wondering whether there is a dataset with repertoire-wide NGS data for one or more individuals with a developing or fully developed autoimmune disease such as rheumatoid arthritis or type 1 diabetes, ideally together with the sequences of some of the autoantibodies that contribute to the disease phenotype.

The idea would be to reconstruct, as far as the data allow, the antibody lineage leading to the observed autoantibody.

I could not find any such data by searching the literature, but I thought some of the forum users here would likely know of some.

You can look at our paper on Myasthenia Gravis that just came out:

Dysregulation of B Cell Repertoire Formation in Myasthenia Gravis Patients Revealed through Deep Sequencing

However, this does not have the autoantibody sequences. We also have a study on West Nile virus infection that includes both repertoire-scale sequencing and WNV-specific antibodies:

Neutralizing antibodies against West Nile virus identified directly from human B cells by single-cell analysis and next generation sequencing

The first paper:

Dysregulation of B Cell Repertoire Formation in Myasthenia Gravis Patients Revealed through Deep Sequencing

was exactly the kind of data I had in mind. Of course, as you point out, the identity of the AChR/MuSK-specific autoantibodies is not known. But I think it might be possible to fish out relevant clonal families by using published amino acid sequences of anti-AChR/MuSK autoantibodies as bait.
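To make the bait idea concrete, here is a minimal sketch, not a validated pipeline. It assumes the CDR3 amino acid sequences have already been extracted from both the repertoire and the published autoantibodies, and it only compares CDR3s of equal length by fractional Hamming distance. The function names, the 0.2 cutoff, and the toy sequences are all hypothetical and chosen for illustration only; a real analysis would likely also require matching V and J gene assignments.

```python
from typing import Dict, List, Tuple


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def find_bait_hits(
    repertoire: Dict[str, str],
    baits: Dict[str, str],
    max_fraction: float = 0.2,  # assumed cutoff, tune for your data
) -> List[Tuple[str, str, float]]:
    """Return (read_id, bait_id, distance) for repertoire CDR3s close to a bait.

    Only CDR3s with the same length as the bait are compared, which mirrors
    the common junction-length requirement used in clonal grouping.
    """
    hits = []
    for read_id, cdr3 in repertoire.items():
        for bait_id, bait in baits.items():
            if len(cdr3) != len(bait):
                continue
            dist = hamming_fraction(cdr3, bait)
            if dist <= max_fraction:
                hits.append((read_id, bait_id, dist))
    # Closest matches first
    return sorted(hits, key=lambda h: h[2])


# Toy, made-up CDR3s purely for illustration
toy_repertoire = {
    "read1": "ARDGYSSGWYFDY",
    "read2": "ARDGYSSGWYFDV",
    "read3": "ARGGG",
}
toy_baits = {"bait1": "ARDGYSSGWYFDY"}
toy_hits = find_bait_hits(toy_repertoire, toy_baits)
```

Any hits would then be the starting point for pulling out the full clonal family around each matching read.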

Do you have the processed reads available so I can check this?

Sure. Raw data is available at SRA. As part of the AIRR community, we are currently working with NCBI to generate a data deposition protocol that will include both raw and processed data (spanning SRA and GenBank), but we are just starting to test this. For now, you can just send me an email request for the processed data at:


Massive thanks Steven!

We also recently published an article on T1D that might be of use (though it is mostly T cell data, there is some IgH data there): Tissue distribution and clonal diversity of the T and B cell repertoire in type 1 diabetes

The underlying data can be found here:

Actually, there might be a few other datasets of interest in our immuneAccess database (e.g. IgG4-RD and Wiskott-Aldrich syndrome):

I am now trying to generate some autoantibody sequences, but I am still working on it. I will let you know if I get some. :slight_smile:


Can you suggest ways of finding autoantibody DNA and amino acid sequences other than the IMGT/GenBank databases? Also, what tools are there for extracting the CDR3 from amino acid sequences?

Thank you
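On the CDR3 part of the question, one rough heuristic for heavy chains is to anchor on the conserved residues that flank the CDR3: the cysteine at the end of framework 3 and the W-G-x-G motif at the start of framework 4. The sketch below is only that, a heuristic. The function name and the 40-residue search window are assumptions made for illustration, it handles heavy chains only, and it will give the wrong answer for a CDR3 that itself contains a cysteine. A dedicated numbering tool such as ANARCI, which assigns IMGT positions to amino acid sequences, should be more robust.

```python
import re
from typing import Optional

# Start of heavy-chain framework 4: Trp-Gly-x-Gly
FR4_MOTIF = re.compile(r"WG.G")


def extract_heavy_cdr3(aa_seq: str, max_len: int = 40) -> Optional[str]:
    """Heuristically extract the heavy-chain CDR3 from an amino acid sequence.

    Returns the residues between the conserved Cys and the conserved Trp
    (both excluded), or None if the anchors are not found.
    max_len is an assumed upper bound on the Cys-to-Trp distance.
    """
    seq = "".join(aa_seq.split()).upper()
    for match in FR4_MOTIF.finditer(seq):
        w_pos = match.start()
        # Nearest Cys upstream of the Trp, within the search window
        c_pos = seq.rfind("C", max(0, w_pos - max_len), w_pos)
        if c_pos != -1:
            return seq[c_pos + 1:w_pos]
    return None


# Toy heavy-chain variable region, for illustration only
toy_vh = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVSAISGSGGSTYYADSVKG"
    "RFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDGYSSGWYFDYWGQGTLVTVSS"
)
toy_cdr3 = extract_heavy_cdr3(toy_vh)
```

The IMGT junction, if that is what you need, is the same string with the flanking Cys and Trp added back on.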