Immunarch will significantly evolve, but it will break things – and we need your help

vadimnazarov · October 12, 2023, 1:59pm

Hi b-t.cr forum,

This is Vadim Nazarov, the main author of Immunarch (https://immunarch.com/), previously tcR. In this post, I want to share our plans for the upcoming v1.0.0 release that we (finally!) plan to publish, and to ask for your help. The release will bring valuable enhancements to our community of immune repertoire researchers, but it will also break some things:

Things will be broken – focus on AIRR Community data format: Some pipelines will break, and we will smooth the transition as much as possible. The days of a unique data format per tool are over because the ecosystem is mature enough to have its own foundation. The AIRR Community did a fantastic job in creating a standardized data format for AIRR data, and Immunarch will use it at its core. The transition will be painful at first, but we need to make it to ensure the future stability and robustness of both the tool and the whole ecosystem.
Paired-end TCR and BCR analysis: Immunarch will offer full support for paired-end TCR and BCR analysis, allowing you to explore your data more comprehensively and with any clonotype model you have in mind, e.g., CDR3-only, CDR3+V, CDR3 from both chains, etc.
Single-cell AIRR analysis and data integration: We’re expanding our capabilities to provide full support for single-cell AIRR analysis and seamless integration of per-receptor data from various modalities, e.g., transcriptomics, immunogenicity, TCR generation probabilities, clinical data, time points, etc.
Make Immunarch the single interface for any AIRR-related analysis: Provide an interface to widely used tools for additional receptor data modalities, such as generation probability tools (IGoR, OLGA), immunogenicity prediction, MHC binding prediction, etc.
Enhanced single-cell immunogenomics visualization: Improved tools for analyzing and visualizing the data, making it easier to derive insights from combined immune receptor and multi-modal datasets.
Support for out-of-memory data: Immunarch will support datasets that do not fit into RAM, so that it runs smoothly even on very large data volumes. We plan to use Apache Arrow and DuckDB, and to allow extension to new tools through a common interface.
Software modularity and support for both R and Python: We will split immunarch into smaller software tools with limited scopes, starting with two: one focused on data and the other on analytics. This will allow greater extensibility and interoperability with other tools. We will add initial Python support to the data tool so that developers and data engineers can build efficient computational platforms using Immunarch data structures as the backend.
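
To make the "clonotype model" idea from the paired-chain item above more concrete, here is a minimal sketch in plain Python. It is not the Immunarch API. The field names (cell_id, locus, v_call, cdr3_aa) follow the AIRR Community Rearrangement naming, while the toy records and the helper count_clonotypes are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical toy receptors; keys mimic AIRR Rearrangement field names.
receptors = [
    {"cell_id": "c1", "locus": "TRB", "v_call": "TRBV5-1", "cdr3_aa": "CASSLGQETQYF"},
    {"cell_id": "c2", "locus": "TRB", "v_call": "TRBV7-9", "cdr3_aa": "CASSLGQETQYF"},
    {"cell_id": "c3", "locus": "TRB", "v_call": "TRBV5-1", "cdr3_aa": "CASSLGQETQYF"},
    {"cell_id": "c4", "locus": "TRB", "v_call": "TRBV20-1", "cdr3_aa": "CSARDRGNTEAFF"},
]


def count_clonotypes(records, key_fields):
    """Count receptors per clonotype; a clonotype is the tuple of key_fields."""
    return Counter(tuple(rec[field] for field in key_fields) for rec in records)


# The same data grouped under two different clonotype models.
cdr3_only = count_clonotypes(receptors, ["cdr3_aa"])
cdr3_plus_v = count_clonotypes(receptors, ["cdr3_aa", "v_call"])

print(len(cdr3_only))    # distinct clonotypes under the CDR3-only model
print(len(cdr3_plus_v))  # distinct clonotypes under the CDR3+V model
```

The only thing that changes between models is the list of key fields, which is one way a tool could let users pick "any clonotype model". A both-chains model would need the records of each cell joined first, which this sketch leaves out.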

We have an implementation plan and first prototypes, but we want to talk to you first to:

validate the plan,
understand how we can solve these problems more efficiently, and
decide what we should prioritize.

If you’re working on similar problems or have ideas on how to further improve Immunarch or your single-cell immunomics routine, we welcome discussions and interviews to explore the best ways to implement these features. This is also the best way to get support and discuss your current challenges.

Please email me to schedule a call: vadim.talk@immunomind.com

Thank you!

Vadim Nazarov

vadimnazarov · October 15, 2025, 9:55am

immunarch 0.10.3 — Release Notes

Highlights

Single-cell support. Load single-cell AIRR data, pair chains, and link scRNA-seq or spatial metadata.
Repertoire statistics. Key functions for gene usage, diversity, clonality, and publicness are implemented. More will come.
Out-of-memory + rich annotation. Work with data bigger than RAM. Annotate receptors with any info (e.g., immunogenicity, gene expression, cluster). Powered by ImmunData from the immundata package.
Faster at scale. Major speed-ups for large cohorts and single-cell libraries thanks to the DuckDB backend.
More reproducible. Immutable data objects reduce side effects and make analyses repeatable.
Easier to install. Multiple packages have been moved to “Suggests” and will be removed or moved to other packages later, making immunarch much more convenient to install.

Migration notes (0.9 → 0.10)

New data layer: ImmunData from immundata. Functions return new objects instead of mutating in place. This helps reproducibility and scaling.
API changes: large “meta-functions” like repDiversity() are split into small families such as airr_diversity_*. Use the shared prefixes and the family help pages (e.g., ?airr_diversity).
Fewer heavy dependencies; faster install.
Visualisation with vis() for the new API is still evolving. New vis() is focused on fast and convenient visualisations rather than publication-ready figures. For publication-ready plots, use ggplot2.
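
The "return new objects instead of mutating in place" behaviour from the migration notes can be illustrated with a small sketch. This is plain Python, not the immundata or immunarch API. The Repertoire class and the with_annotation helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Repertoire:
    """Hypothetical immutable container: CDR3 strings plus (name, value) annotations."""
    cdr3s: tuple
    annotations: tuple = ()


def with_annotation(rep, name, value):
    """Return a NEW Repertoire with one extra annotation; the input stays unchanged."""
    return replace(rep, annotations=rep.annotations + ((name, value),))


raw = Repertoire(cdr3s=("CASSLGQETQYF", "CSARDRGNTEAFF"))
annotated = with_annotation(raw, "cluster", "T-effector")

print(raw.annotations)        # the original object has no annotations
print(annotated.annotations)  # the new object carries the added annotation
```

Because every step yields a new object, an earlier result cannot be changed by a later step, which is the property that makes a rerun of the same script give the same intermediate values.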

Tutorials & How-tos

Single-cell tutorial (end-to-end) — loading, QC, gene usage, clonality, diversity, public indices, annotation. [immunomind.github.io]
Import old immunarch data — convert repLoad() to ImmunData with from_immunarch(...). [immunomind.github.io]
Read AIRR-C (bulk, single-cell), 10x Genomics (single-chain, paired-chain) — how-tos are available on the docs website. [immunomind.github.io]

New / reworked analysis families (API reference)

These are the main v1.0-style families used in 0.10.3. Open the help in R (?family_name) or the API pages below.

airr_stats:            https://immunarch.com/reference/airr_stats.html
airr_public:           https://immunarch.com/reference/airr_public.html
airr_clonality:        https://immunarch.com/reference/airr_clonality.html
airr_diversity:        https://immunarch.com/reference/airr_diversity.html
annotate_clonality:    https://immunarch.com/reference/annotate_clonality.html

(These pages describe all functions in each family, e.g., airr_diversity_shannon(), airr_public_jaccard(), etc.)
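
For readers new to these measures, here is a sketch of the textbook formulas that the names airr_diversity_shannon() and airr_public_jaccard() suggest: Shannon entropy and the Jaccard index. It is plain Python and not the immunarch implementation. The function names, the natural-log base, and the toy samples are assumptions, and the actual arguments and outputs are described on the reference pages above.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy H = -sum(p_i * ln(p_i)) over clonotype frequencies."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts)


def jaccard_index(a, b):
    """Jaccard index |A & B| / |A | B| between two collections of clonotypes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical clonotype counts for two samples, keyed by CDR3 amino acid sequence.
sample_1 = Counter({"CASSLGQETQYF": 2, "CSARDRGNTEAFF": 2})
sample_2 = Counter({"CASSLGQETQYF": 5, "CASSPGTGGYEQYF": 1})

print(shannon_entropy(sample_1.values()))  # diversity within one sample
print(jaccard_index(sample_1, sample_2))   # clonotype sharing between two samples
```

Entropy describes a single repertoire, while the Jaccard index compares two, which matches the split between the diversity and public families.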

Not all features from immunarch 0.9 are implemented in 0.10 yet. I’m working on it.

Notes on `immundata` status

immundata powers ImmunData and lets you analyse large datasets without loading everything into RAM. It supports bulk, single-cell, and spatial workflows. [immunomind.github.io]
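
The general idea behind analysing data without loading everything into RAM can be sketched as a streaming aggregation. This example does not use immundata or DuckDB. It only shows, in plain Python, how a statistic such as V gene usage can be computed row by row. The column names follow AIRR Rearrangement naming, and the in-memory TSV and the helper stream_v_gene_usage are hypothetical stand-ins for a large file on disk.

```python
import csv
import io
from collections import Counter

# Hypothetical AIRR-style TSV; in practice this would be a large file on disk.
tsv = io.StringIO(
    "sequence_id\tv_call\tjunction_aa\n"
    "s1\tTRBV5-1\tCASSLGQETQYF\n"
    "s2\tTRBV7-9\tCASSLGQETQYF\n"
    "s3\tTRBV5-1\tCSARDRGNTEAFF\n"
)


def stream_v_gene_usage(handle):
    """Count V genes one row at a time; only the running counts stay in memory."""
    usage = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        usage[row["v_call"]] += 1
    return usage


usage = stream_v_gene_usage(tsv)
print(usage.most_common())
```

A database backend can apply the same principle to far richer queries, since only the aggregated result has to be returned to the R or Python session.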