Evaluation datasets etc


#1

I’m on the lookout for datasets that fall into one or more of the following categories:

  • Evaluation datasets, e.g. datasets from genotyped individuals or clonal datasets
  • Datasets that have comparison groups, e.g. immunised vs. naive, old vs. young, etc.
  • Challenging datasets, e.g. repertoires from poorly characterised species

plus any other categories where advances in methodology might help. This would serve as a useful resource to benchmark and compare methods. One repo that has some evaluation human IGH data is here.


#2

I don’t have it broken down as such, but here’s a list of publicly available HTS repertoire data, in case it helps:

  1. Bashford-Rogers, R. J. M. et al. Network properties derived from deep sequencing of human B-cell receptor repertoires delineate B-cell populations. Genome Res. 23, 1874–84 (2013).
    ENA Accession: ERP002120

  2. Boyd, S. D. et al. Measurement and clinical monitoring of human lymphocyte clonality by massively parallel VDJ pyrosequencing. Sci. Transl. Med. 1, 12ra23 (2009).
    SRA Accession: SRP001460

  3. Collins, A. M., Wang, Y., Roskin, K. M., Marquis, C. P. & Jackson, K. J. L. The mouse antibody heavy chain repertoire is germline-focused and highly variable between inbred strains. Philos. Trans. R. Soc. B Biol. Sci. 370, 20140236 (2015).
    ENA Accession: PRJEB8745

  4. Freeman, J. D. et al. Profiling the T-cell receptor beta-chain repertoire by massively parallel sequencing. Genome Res. 19, 1817–24 (2009).
    SRA Accession: SRA008633

  5. Hoehn, K. B. et al. Dynamics of immunoglobulin sequence diversity in HIV-1 infected individuals. Philos. Trans. R. Soc. B Biol. Sci. 370, 20140241 (2015).
    ENA Accession: ERP000572

  6. Greiff, V. et al. Quantitative assessment of the robustness of next-generation sequencing of antibody variable gene repertoires from immunized mice. BMC Immunol. 15, 40 (2014).
    ENA Accession: ERP003950

  7. Jackson, K. J. L. et al. Human Responses to Influenza Vaccination Show Seroconversion Signatures and Convergent Antibody Rearrangements. Cell Host Microbe, 105–114 (2014).
    dbGaP Accession: phs000760.v1.p1

  8. Jiang, N. et al. Determinism and stochasticity during maturation of the zebrafish antibody repertoire. Proc. Natl. Acad. Sci. U. S. A. 108, 5348–53 (2011).
    SRA Accession: SRA029829

  9. Jiang, N. et al. Lineage structure of the human antibody repertoire in response to influenza vaccination. Sci. Transl. Med. 5, 171ra19 (2013).
    SRA Accession: SRA058972

  10. Michaeli, M. et al. Immunoglobulin gene repertoire diversification and selection in the stomach - from gastritis to gastric lymphomas. Front. Immunol. 5, 1–14 (2014).
    BioProject Accession: PRJNA206548

  11. Mroczek, E. S. et al. Differences in the Composition of the Human Antibody Repertoire by B Cell Subsets in the Blood. Front. Immunol. 5, 1–14 (2014).
    SRA Accession: SRP037774

  12. Ota, M. et al. Regulation of the B Cell Receptor Repertoire and Self-Reactivity by BAFF. J. Immunol. 185, 4128–4136 (2010).
    BioProject Accession: PRJNA79689

  13. Palanichamy, A. et al. Immunoglobulin class-switched B cells form an active immune axis between CNS and periphery in multiple sclerosis. Sci. Transl. Med. 6, 248ra106 (2014).
    BioProject Accession: PRJNA248411

  14. Parameswaran, P. et al. Convergent Antibody Signatures in Human Dengue. Cell Host Microbe 13, 691–700 (2013).
    BioProject Accession: PRJNA205206

  15. Qi, Q. et al. Diversity and clonal selection in the human T-cell repertoire. Proc. Natl. Acad. Sci. U. S. A. (2014).
    dbGaP Accession: phs000787.v1.p1

  16. Stern, J. N. H. et al. B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Sci. Transl. Med. 6, 248ra107 (2014).
    BioProject Accession: PRJNA248475

  17. Tipton, C. M. et al. Diversity, cellular origin and autoreactivity of antibody-secreting cell population expansions in acute systemic lupus erythematosus. Nat. Immunol. (2015). doi:10.1038/ni.3175
    SRA Accession: SRP057017

  18. Vollmers, C. et al. Genetic measurement of memory B-cell recall using antibody repertoire sequencing. Proc. Natl. Acad. Sci. U. S. A. 110, 13463–8 (2013).
    dbGaP Accession: phs000656.v1.p1

  19. Vollmers, C., Penland, L., Kanbar, J. N. & Quake, S. R. Novel Exons and Splice Variants in the Human Antibody Heavy Chain Identified by Single Cell and Single Molecule Sequencing. PLoS One 10, e0117050 (2015).
    SRA Accession: SRP043513

  20. Wang, C. et al. High throughput sequencing reveals a complex pattern of dynamic interrelationships among human T cell subsets. Proc. Natl. Acad. Sci. U. S. A. 107, 1518–23 (2010).
    SRA Accession: SRA010149

  21. Wang, C. et al. Effects of aging, cytomegalovirus infection, and EBV infection on human B cell repertoires. J. Immunol. 192, 603–11 (2014).
    dbGaP Accession: phs000666.v1.p1

  22. Wang, C. et al. B-cell repertoire responses to varicella-zoster vaccination in human identical twins. Proc. Natl. Acad. Sci. U. S. A. (2014).
    dbGaP Accession: phs000817.v1.p1

  23. Warren, R. L. et al. Exhaustive T-cell repertoire sequencing of human peripheral blood samples reveals signatures of antigen selection and a directly measured repertoire size of at least 1 million clonotypes. Genome Res. 21, 790–7 (2011).
    SRA Accession: SRA020989

  24. Weinstein, J. A., Jiang, N., White, R. A., Fisher, D. S. & Quake, S. R. High-throughput sequencing of the zebrafish antibody repertoire. Science 324, 807–10 (2009).
    SRA Accession: SRA008134

  25. Wesemann, D. R. et al. Microbial colonization influences early B-lineage development in the gut lamina propria. Nature 501, 112–5 (2013).
    BioProject Accession: PRJNA212030

  26. Wu, X. et al. Focused evolution of HIV-1 neutralizing antibodies revealed by structures and deep sequencing. Science 333, 1593–602 (2011).
    SRA Accession: SRP006992

  27. Wu, Y.-C. B. et al. Influence of seasonal exposure to grass pollen on local and peripheral blood IgE repertoires in patients with allergic rhinitis. J. Allergy Clin. Immunol. 134, 604–612 (2014).
    SRA Accession: SRP038092

  28. Zhu, J. et al. Mining the antibodyome for HIV-1-neutralizing antibodies with next-generation sequencing and phylogenetic pairing of heavy/light chains. Proc. Natl. Acad. Sci. U. S. A. 110, 6470–5 (2013).
    SRA Accession: SRP018335

  29. Zvyagin, I. V. et al. Distinctive properties of identical twins’ TCR repertoires revealed by high-throughput sequencing. Proc. Natl. Acad. Sci. U. S. A. 111, 5980–5 (2014).
    SRA Accession: SRP028752
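
Since the accessions above come from several archives, it can help to sort them by type before deciding how to fetch each one. Below is a minimal Python sketch that does this purely from the accession prefix. The function names and archive labels are hypothetical, and the patterns are assumptions based only on the accession formats that appear in this list (ERP, SRP, SRA, PRJNA/PRJEB, phs); it does not contact any archive.

```python
import re
from collections import defaultdict

# Prefix patterns are assumptions drawn from the accessions in this list;
# they are not an exhaustive description of any archive's naming scheme.
ARCHIVE_PATTERNS = [
    ("ENA study (ERP)", re.compile(r"^ERP\d+$")),
    ("SRA study (SRP)", re.compile(r"^SRP\d+$")),
    ("SRA submission (SRA)", re.compile(r"^SRA\d+$")),
    ("BioProject (PRJ)", re.compile(r"^PRJ(NA|EB|DB)\d+$")),
    ("dbGaP study (phs)", re.compile(r"^phs\d+(\.v\d+)?(\.p\d+)?$")),
]


def classify_accession(accession):
    """Return a label for the accession type, or 'unknown' if no pattern matches."""
    acc = accession.strip()
    for label, pattern in ARCHIVE_PATTERNS:
        if pattern.match(acc):
            return label
    return "unknown"


def group_by_archive(accessions):
    """Group accession strings by their classified type, preserving input order."""
    groups = defaultdict(list)
    for acc in accessions:
        groups[classify_accession(acc)].append(acc.strip())
    return dict(groups)


if __name__ == "__main__":
    # A few accessions copied from the list above.
    sample = ["ERP002120", "SRP001460", "PRJEB8745", "SRA008633",
              "phs000760.v1.p1", "PRJNA206548"]
    for label, accs in group_by_archive(sample).items():
        print(label, accs)
```

Note that the sketch labels PRJEB8745 as a BioProject-style accession even though it is listed under ENA above, because it keys on the prefix only. The dbGaP (phs) entries typically require an access request, so they are worth separating out early.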


#3

That’s great! I’ll take a look and work up some metadata to go along with these.


#4

Here are my two cents:

Metadata and processing results are here.
Another ~40 samples with “extreme” age cases are on their way to publication; hopefully I’ll be able to update the list with them soon.


#5

Thanks! Much appreciated. Probably cost more than 2c though :grin:


#6

@javh and @mikhail.shugay – thank you very much for this list. Very helpful!

Jason – have you run pRESTO on some of the datasets in your list? If so, any chance you’d be willing to share your preprocessing scripts like Mike has?


#7

I’ve run a few through, but I haven’t made a serious effort. Most of them appear to be minor variations on one of these three workflows:

  • 454
  • MiSeq
  • UMI barcoded MiSeq

I’m planning to run all the BCR data sets through at some point, and I will certainly share my pipelines when I do, but it’s more of a long-term goal. If there is a specific one you are interested in, just shoot me an email and I’ll figure out the details.


#8

As promised, a link to the complete PBMC T-cell receptor beta sequencing dataset (73 samples, some in replicates) for our aging study: PRJNA316572.


#10

  1. Ruggiero, E. et al. High-resolution analysis of the human T-cell receptor repertoire. Nat. Commun. 6, 8081 (2015).
    BioProject Accession: PRJNA287162