Subset #2 is the largest subset carrying stereotyped B cell receptor immunoglobulin (BcR IG) in chronic lymphocytic leukemia (CLL). This particular BcR IG is composed of heavy (HC) and light (LC) chains encoded by the IGHV3-21 and the lambda IGLV3-21 genes, respectively. The clonotypic IGHV3-21 genes display a variable load of somatic hypermutation (SHM), being mostly classified as mutated (M-CLL) but also including unmutated (U-CLL) cases. Subset #2 cases, independently of SHM status, have a particularly dismal clinical outcome similar to that of patients with TP53 aberrations, despite lacking such aberrations. The subset #2 BcR IG displays a series of distinctive features, including conservation at certain VH and VL CDR3 positions and recurrent SHMs, as well as a capacity for self-association leading to cell-autonomous signaling that is critically dependent on a substitution of arginine (R) for glycine (G) introduced by SHM at the lambda VL-CL linker region. These features implicate antigen selection in CLL subset #2 ontogeny. However, the available molecular evidence derives from low-throughput immunogenetic analysis, precluding comprehensive assessment of antigenic impact on (sub)clonal composition. Here, we sought to overcome this limitation by performing next-generation sequencing (NGS) of HC and LC gene rearrangements of 20 subset #2 patients. RT-PCR products amplified by the IGHV3-21/IGHJ6 and IGLV3-21/IGLC primer pairs, respectively, were subjected to NGS on the Illumina MiSeq platform. NGS data were analyzed by a validated bioinformatics pipeline. Rearrangements with identical CDR3 amino acid (aa) sequences were defined as clonotypes, whereas clonotypes with different aa substitutions within the V-domain were defined as subclones. Starting with HCs, we obtained 3,340,508 (mean: 167,025, range: 101,231-186,055) productive reads. 
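The clonotype and subclone definitions above can be illustrated with a minimal sketch. It is not the validated pipeline used in this study; it assumes each productive read has already been annotated with its CDR3 aa sequence and its V-domain aa sequence, and it treats every distinct V-domain aa sequence within a clonotype as one subclone. The function name `summarize_clonotypes` and the toy read labels are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_clonotypes(reads):
    """Group annotated reads into clonotypes and subclones.

    reads: iterable of (cdr3_aa, v_domain_aa) string pairs, one per
    productive read (assumed input format, not the study's actual one).
    """
    clonotype_reads = Counter()       # CDR3 aa sequence -> read count
    subclones = defaultdict(Counter)  # CDR3 aa -> V-domain aa -> read count

    for cdr3_aa, v_domain_aa in reads:
        clonotype_reads[cdr3_aa] += 1
        subclones[cdr3_aa][v_domain_aa] += 1

    if not clonotype_reads:
        raise ValueError("no productive reads supplied")

    total = sum(clonotype_reads.values())
    dominant, dominant_count = clonotype_reads.most_common(1)[0]

    return {
        "n_clonotypes": len(clonotype_reads),
        "dominant_cdr3": dominant,
        "dominant_frequency": dominant_count / total,
        # Subclones = distinct V-domain aa sequences sharing the dominant CDR3
        "n_subclones_dominant": len(subclones[dominant]),
    }


# Toy input with made-up labels standing in for real aa sequences
toy_reads = (
    [("CDR3_X", "VDOMAIN_1")] * 8
    + [("CDR3_X", "VDOMAIN_2")] * 1
    + [("CDR3_Y", "VDOMAIN_1")] * 1
)
summary = summarize_clonotypes(toy_reads)
print(summary)
```

On the toy input, the dominant clonotype accounts for 9 of 10 reads and comprises two subclones; per-sample values of this kind are what the means and ranges reported here summarize.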
On average, each analyzed sample carried 92 distinct clonotypes (range: 71-152), with the dominant clonotype having a mean frequency of 96% (range: 67-99%); in all cases, the dominant clonotype was identical to that determined by Sanger sequencing. The dominant clonotype displayed considerable intraclonal heterogeneity, with a mean of 5,082 subclones/sample (range: 2,946-11,041). Turning to LCs, we obtained 5,094,045 (mean: 231,547, range: 38,036-507,949) productive reads. LCs carried a higher number of distinct clonotypes/sample than their partner HCs (mean: 222, range: 156-306). The dominant clonotype had a mean frequency of 96% (range: 74-98%); as with HCs, it was identical to that determined by Sanger sequencing. Intraclonal heterogeneity was observed in the LCs as well, with a mean of 7,382 subclones/sample (range: 1,946-11,866), and was hence more pronounced than in their partner HCs. Viewing the entire subset #2 VH or VL CDR3 dataset (i.e., the CDR3 aa sequences from all clonotypes of all cases) as a single entity branching through diversification enabled the identification of 2 distinct VH CDR3 sequences present at varying frequencies in 16 and 13 cases, respectively, and 3 distinct VL CDR3 sequences present at varying frequencies in all 20 cases; these results point to important constraints on the composition of the antigen-binding site. Focusing on SHM, the following notable observations could be made. (i) The G-to-R substitution at the VL-CL linker was a clonal event in all cases, with R being degenerately encoded by different nucleotide sequences; altogether, these findings underscore the seminal role of this recurrent SHM, likely due to its mediating self-association. (ii) A recurrent 3-nucleotide deletion was detected in the VH CDR2 of all cases, strongly supporting functional pressure. 
This change, previously identified by Sanger sequencing as a recurrent SHM in subset #2 (albeit at a frequency of only 25%), was clonal in 4 cases and subclonal in the remainder, where it was present in an average of 105 subclones/sample (range: 1-369). (iii) Certain positions in both the VH and VL domains bore the same aa substitution, mostly at the subclonal level: the prime example was the serine (S)-to-G substitution within the VL CDR3, detected in all samples at a mean frequency of 44.2% (range: 6.3-87%). In conclusion, we provide compelling immunogenetic evidence for functional pressure in the ontogeny of CLL subset #2. On this evidence, subset #2 emerges as perhaps the most striking example of antigen-driven leukemogenesis reported thus far. Disclosures: Gemenetzi: Gilead: Research Funding. Agathangelidis: Gilead: Research Funding. Stamatopoulos: Abbvie: Honoraria, Research Funding; Gilead: Honoraria, Research Funding; Janssen: Honoraria, Research Funding. Hadzidimitriou: Abbvie: Research Funding; Gilead: Research Funding; Janssen: Honoraria, Research Funding.