Thanks for the excellent summary, Martin. So that people understand better why I suggested excluding the six named studies, let me explain a little about them. They reported many sequences, and these either explicitly came from a single individual or appear to have done so. The studies were important when they were published, but they were exploring the nature of the repertoire at a time when almost nothing was known. I doubt any of the authors expected all of their sequences to be incorporated into a ‘semi-official’ repertoire of germline genes. In fact, a number of the publications include explicit statements that the sequences are likely to include sequencing errors. That this is true becomes clear when you consider the sequences they reported. At the time, the authors did not think of them as genes and allelic variants, but rather just as sequences that they had generated, some of which were likely to correctly report germline sequences. Using the IMGT names, the sequences were:
Andris et al: IGHV2-5*04, *05, *06, *07, *08 and *09. IGHV2-70*01, *09, *10, *11, *12.
Campbell et al: IGHV2-70*02, *03, *06, *07, *08. IGHV4-4*03, *04, *05. IGHV4-28*05. IGHV4-30-4*03, *04. IGHV4-31*03, *06, *07, *08, *09. IGHV4-34*03, *06, *07. IGHV6-1*02.
Adderson et al: IGHV3-15*01, *03, *04, *05, *06, *07, *08. IGHV3-49*02.
Olee et al: IGHV3-30*01, *04, *05, *06, *07, *09, *10, *11, *12, *13, *14, *15, *16, *17, *18, *19. IGHV3-30-3*02. IGHV3-33*01, *03, *04, *05. IGHV3-64*03, *04, *05.
Weng et al: IGHV4-28*03, *04. IGHV4-30-2*03. IGHV4-30-4*02. IGHV4-31*03, *10. IGHV4-34*04, *05, *09, *10. IGHV4-39*06. IGHV4-59*07, *10.
van Es et al: IGHV4-30-2*02. IGHV4-31*04, *05. IGHV4-34*08. IGHV4-39*05. IGHV4-59*03, *04, *05, *06.
I should also clarify that I am very comfortable with sequences like these moving to a ‘red’ category, rather than being completely discarded. In fact, I think it is very important that all sequences remain accessible in any new database, even if they are clearly not real germline sequences. If we do not retain them, it will be impossible for people in the future to make sense of historical reports.