A new, statistically based clustering method, previously used by the authors to analyze frequency and point data and to search for clusters in samples and graphs, is applied to the problem of finding clusters in high-throughput sequencing data. This problem often arises in bioinformatics when attempting to reveal distinct forms that may be present within a single nosological entity. However, the methods available in the literature lack a sufficient statistical foundation, and the significance of the revealed clusters is usually not considered. Complications caused by differing sequencing depth during library construction and by large biological variation in gene expression are common. To overcome these obstacles, we propose a method for discovering statistically significant clusters based on ranks of relative frequencies and nonparametric criteria. This approach avoids dependence on sequencing depth, does not require knowledge of the exact form of the gene expression distribution, and does not rely on the hypothesis that all expression levels share the same functional form.
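
The role of ranks of relative frequencies can be illustrated with a minimal sketch. It is not the clustering procedure or the significance test itself; it only shows, under the idealized assumption that a deeper library scales every count by the same factor, that the rank profile of a sample does not change with sequencing depth. The function names and the toy counts are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def relative_frequencies(counts):
    """Convert raw read counts of one sample into exact relative frequencies."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [Fraction(c, total) for c in counts]


def average_ranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_profile(counts):
    """Ranks of relative frequencies for one sample (hypothetical helper)."""
    return average_ranks(relative_frequencies(counts))


# Toy data (assumed): the same sample sequenced at two depths,
# with the deeper library scaling every gene count by exactly 10.
shallow = [5, 0, 20, 5, 70]
deep = [c * 10 for c in shallow]

print(rank_profile(shallow))
print(rank_profile(deep))
```

Both calls print the same rank profile, so any nonparametric criterion computed from these ranks is unaffected by the difference in depth. In real libraries the scaling is only approximate because of sampling noise, so the invariance holds exactly only in this idealized setting.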