Sponsored by the AIRR Community

Immunarch will significantly evolve, but it will break things – and we need your help

Hi b-t.cr forum,

This is Vadim Nazarov, the main author of Immunarch (https://immunarch.com/), previously tcR. In this post, I want to share our plans for the upcoming v1.0.0 release that we (finally!) plan to publish – and ask for your help. It will bring valuable enhancements to our community of immune repertoire researchers, but it will also break some things:

  • Things will be broken – focus on the AIRR Community data format: Some pipelines will break, and we will smooth the transition as much as possible. The days of a unique data format per tool are over, because the ecosystem is mature enough to have a common foundation. The AIRR Community did a fantastic job in creating a standardized data format for AIRR data, and Immunarch will use it at its core. The transition will be painful at first, but we need to do it to ensure the future stability and robustness of both the tool and the whole ecosystem.

  • Paired-end TCR and BCR analysis: Immunarch will offer full support for paired-end TCR and BCR analysis, allowing you to explore your data more comprehensively and with any clonotype model you have in mind, e.g., CDR3-only, CDR3+V, CDR3 from both chains, etc.

  • Single-cell AIRR analysis and data integration: We’re expanding our capabilities to provide full support for single-cell AIRR analysis and seamless integration of per-receptor data from various modalities, e.g., transcriptomics, immunogenicity, TCR generation probabilities, clinical data, time points, etc.

  • Make Immunarch the single interface for any AIRR-related analysis: Provide an interface to widely used tools for additional receptor data modalities: generation probability tools (IGoR, OLGA), immunogenicity prediction, MHC binding prediction, etc.

  • Enhanced single-cell immunogenomics visualization: Improved tools for analyzing and visualizing the data, making it easier to derive insights from combined immune receptor and multi-modal datasets.

  • Support for out-of-memory data: To handle extra-large datasets, Immunarch will introduce support for out-of-memory datasets to ensure smooth operation even with substantial data volumes. We plan to employ Apache Arrow and DuckDB, and allow extension to novel tools through a common interface.

  • Software modularity and support for both R and Python: We will split Immunarch into smaller software tools with limited scopes, starting with two: one focused on data, the other on analytics. This will allow greater extensibility and interoperability with other tools. We will add initial Python support to the data tool so that developers and data engineers can build efficient computational platforms using Immunarch data structures as the backend.
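
To make the "any clonotype model" idea from the paired TCR/BCR bullet more concrete, here is a minimal Python sketch. It is not Immunarch code and does not reflect the planned API. It assumes rows that use the AIRR Rearrangement field names junction_aa, v_call and duplicate_count. The toy values, the model names and the helper functions are all hypothetical and only illustrate how the same data yields different clonotype counts under different models.

```python
from collections import Counter

# Toy rows keyed by AIRR Rearrangement field names. All values are made up.
# junction_aa is the IMGT junction: CDR3 plus its conserved flanking residues.
ROWS = [
    {"junction_aa": "CASSLGQETQYF", "v_call": "TRBV7-9*01", "duplicate_count": 10},
    {"junction_aa": "CASSLGQETQYF", "v_call": "TRBV7-2*01", "duplicate_count": 4},
    {"junction_aa": "CASSPGTGELFF", "v_call": "TRBV7-9*01", "duplicate_count": 3},
    {"junction_aa": "CASSLGQETQYF", "v_call": "TRBV7-9*03", "duplicate_count": 2},
]


def v_gene(v_call):
    """Keep the first call and drop the allele: 'TRBV7-9*01,TRBV7-8*01' -> 'TRBV7-9'."""
    return v_call.split(",")[0].split("*")[0]


# Hypothetical clonotype models: each one maps a row to a grouping key.
CLONOTYPE_MODELS = {
    "cdr3": lambda row: (row["junction_aa"],),
    "cdr3+v": lambda row: (row["junction_aa"], v_gene(row["v_call"])),
}


def count_clonotypes(rows, model):
    """Sum duplicate_count per clonotype under the chosen model."""
    key = CLONOTYPE_MODELS[model]
    counts = Counter()
    for row in rows:
        counts[key(row)] += int(row["duplicate_count"])
    return counts


if __name__ == "__main__":
    for model in CLONOTYPE_MODELS:
        print(model, dict(count_clonotypes(ROWS, model)))
```

In this sketch, supporting another model (for example, CDR3 from both chains) would only mean adding one more key function, assuming the paired chains have already been joined into one row per cell.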

We have an implementation plan and first prototypes, but we want to talk to you first to:

  • validate the plan,
  • understand how we can solve these problems more efficiently, and
  • decide what we should prioritize.

If you’re working on similar problems or have ideas on how to further improve Immunarch or your single-cell immunomics routine, we welcome discussions and interviews to explore the best ways to implement these features. This is also the best way to get support and discuss your current challenges.

Please email me to schedule a call: vadim.talk@immunomind.com

Thank you!

Vadim Nazarov
