This study addresses remote homologous protein detection, a bioinformatics problem with important applications in medicine. Protein sequences taken from SCOP, an important and widely used protein database, were used to test remote homolog detection. Feature vectors were extracted from the protein sequences using the bag-of-words model and classified with the k-nearest neighbor (kNN) algorithm. Thirteen distance measures were compared in the kNN classifier: Bray-Curtis, Chebyshev, Cosine, Dice, Euclidean, Hamming, Jaccard, Kulczynski, Matching coefficient, Minkowski, Rogers-Tanimoto, Russell-Rao, and Sokal-Michener. A special formula for choosing the number of folds in k-fold cross-validation is proposed to prevent the imbalanced data problem. The kNN algorithm with the Bray-Curtis distance, evaluated by cross-validation with this special fold value, was observed to give the most successful performance, with 99% accuracy.
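The pipeline summarized above (bag-of-words features, then kNN with the Bray-Curtis distance) can be illustrated with a minimal sketch. It is not the study's implementation: the word length of 3, the use of overlapping words, the value k = 3, the function names, and the toy sequences and family labels are all assumptions made for illustration, and the special fold-value formula is not reproduced because it is not stated here.

```python
from collections import Counter


def bag_of_words(sequence, word_len=3):
    """Count overlapping words of length word_len in a protein sequence.

    word_len=3 is an assumed value, not one taken from the study.
    """
    last_start = len(sequence) - word_len + 1
    return Counter(sequence[i:i + word_len] for i in range(last_start))


def bray_curtis(a, b):
    """Bray-Curtis distance between two word-count vectors (Counters).

    d(a, b) = sum_i |a_i - b_i| / sum_i (a_i + b_i), for non-negative counts.
    """
    words = set(a) | set(b)
    numerator = sum(abs(a[w] - b[w]) for w in words)
    denominator = sum(a[w] + b[w] for w in words)
    return numerator / denominator if denominator else 0.0


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors nearest to query.

    train is a list of (Counter, label) pairs.
    """
    nearest = sorted(train, key=lambda item: bray_curtis(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: made-up sequences and labels, not entries from SCOP.
toy_sequences = [
    ("ACDEFGHIK", "fam_A"),
    ("ACDEFGHIL", "fam_A"),
    ("WYWYWYWYW", "fam_B"),
    ("YWYWYWYWY", "fam_B"),
]
train = [(bag_of_words(seq), label) for seq, label in toy_sequences]
predicted = knn_predict(train, bag_of_words("ACDEFGHIM"), k=3)
print(predicted)
```

Word counts are kept as sparse `Counter` objects so that only words actually present in a sequence are stored; a dense vector over all 20^3 possible words would yield the same distances.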