Sponsored by the AIRR Community

Germline Set format - The Way Forward

Oh, ok. I guess I would just make explicit on slide 4 that these are non-canonical files meant to ease curation in individual instances.

Done. Thanks for the feedback!

@w.lees, this is an excellent summary of what we were thinking.

Here's what @cswarth built:

Currently the "database" is a scraped version of IgPDB, just so that we could have something to play with.

As you describe, the long-term idea is that people can download various parts of the database as CSV files, edit them as they see fit, and then submit them. This would then create a git commit modifying the underlying JSON, which if accepted would get pushed to GitHub. We are inspired in this design by phylesystem.

Unfortunately for us, @cswarth has left my group for a much better-paying industry job, so this implementation isn't going to be updated for the foreseeable future.
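
To make the CSV round trip described above a bit more concrete, here is a rough sketch of how edited CSV rows might be merged back into the underlying JSON. This is an illustration only, not the code @cswarth wrote: the function names, the `name` and `sequence` fields, and the gene names are all hypothetical placeholders, and the git commit step is left out.

```python
import csv
import io
import json


def records_to_csv(records, fieldnames):
    """Serialize a list of dict records to CSV text for download."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def apply_csv_edits(records, csv_text, key="name"):
    """Merge edited CSV rows back into the records, matching on `key`.

    Rows with an unknown key are added as new records.
    """
    by_key = {rec[key]: dict(rec) for rec in records}
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_key.setdefault(row[key], {}).update(row)
    return sorted(by_key.values(), key=lambda rec: rec[key])


def to_canonical_json(records):
    """Use stable formatting so a git diff shows only real changes."""
    return json.dumps(records, indent=2, sort_keys=True) + "\n"


# Hypothetical records; names and sequences are placeholders.
records = [
    {"name": "GENE-A*01", "sequence": "ACGT"},
    {"name": "GENE-B*01", "sequence": "GGCC"},
]
csv_text = records_to_csv(records, ["name", "sequence"])
edited = csv_text.replace("GGCC", "GGCA")  # stands in for a curator's edit
updated = apply_csv_edits(records, edited)
print(to_canonical_json(updated))
```

Writing the JSON with sorted keys and fixed indentation means that a submitted edit shows up as a small, reviewable diff in the resulting commit.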

Germline Set Overview v3.pdf (715.2 KB)

All,

Attached is a revised overview of the Germline Set proposal, reflecting comments on the previous version and including an entity-relationship diagram for the database. I have also produced a new Germline Scheme and File Formats document. This includes a description of each field in the database, and a separate illustrative file format, showing the format in which a germline set could be downloaded for use by a parser. This is illustrative only at this stage, because there are details mentioned in previous threads which haven't as yet been resolved, such as whether all chains should be in the same file or in separate files. I think we could do with some discussion on a call when time allows. But for the time being I did think it would be valuable to show some details of a possible format, in order to draw out some of the features it offers, which are highlighted in the last slide of the overview.

Terminology continues for the time being to be aligned with IMGT standards and I have revised one or two field names with this in mind.

Please let me have any comments - either here or in the Google sheet.

Thanks, and Happy New Year to you all

William
