Abstract The Cancer Genome Atlas (TCGA) consortium measured genome-wide gene expression using RNA-seq for 336 skin cutaneous melanoma (SKCM) samples, of which 272 were clinically classified as metastatic SKCM tumors and the remaining 64 as primary SKCM tumors. We aimed to identify gene signatures that separate the primary SKCM samples from the metastatic SKCM samples. Our initial analysis showed that the primary and metastatic SKCM samples were similar enough at the gene expression level that misclassification rates were unacceptably high. We reasoned that some of the primary SKCM tumors might plausibly have evolved to resemble the metastatic tumors in their gene expression. This idea led us to propose an alternative computational method that finds a gene signature for accurate classification while explicitly allowing samples to switch allegiance, that is, to move from one group to the other (e.g., primary to metastatic or vice versa). Based on an iterative stochastic search algorithm that delivers near-optimal gene signatures for classification, our alternative algorithm is rooted in the groups defined by clinical classification but allows a sample to switch groups when its gene expression profile is clearly discordant with those of the other members of its group. We began by seeking such a near-optimal partitioning of the 336 samples into primary and metastatic groups based on the gene expression data, using the clinical classification as the guide. Specifically, our algorithm gives each of the 336 samples a small but equal probability of being switched to the other group at each iteration (e.g., from metastatic to primary or vice versa). We carried out a massive computational search for gene signatures (each a set of 20 genes) that provide a near-optimal partitioning of the groups while keeping the clinical classification for most of the 336 samples but reassigning a few to the other group.
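The label-switching step and the per-sample tally across runs can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the switching probability of 0.01, and the 200 toy runs are hypothetical choices for illustration, and the gene-signature search and its scoring, which the abstract does not specify, are omitted entirely. Each toy "run" here is a single random proposal rather than a near-optimal partitioning.

```python
import random

PRIMARY, METASTATIC = 0, 1


def propose_labels(labels, switch_prob, rng):
    """Flip each binary group label independently with probability switch_prob."""
    return [1 - y if rng.random() < switch_prob else y for y in labels]


def metastatic_fraction(partitions):
    """For each sample, the fraction of runs that placed it in the metastatic group."""
    n_runs = len(partitions)
    n_samples = len(partitions[0])
    return [sum(p[i] for p in partitions) / n_runs for i in range(n_samples)]


rng = random.Random(0)

# Clinical labels with the group sizes given in the abstract.
clinical = [PRIMARY] * 64 + [METASTATIC] * 272

# Assumption: stand-in for the search output. A real run would keep a proposed
# switch only if it improved classification with the current gene signature.
partitions = [propose_labels(clinical, 0.01, rng) for _ in range(200)]

fractions = metastatic_fraction(partitions)
```

In this sketch, `fractions[i]` plays the role of the proportion of runs in which sample `i` is assigned to the metastatic group; in the actual analysis the partitions would come from the independent runs of the stochastic search.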
With the newly reassigned primary and metastatic groups, distinguishing between the two based on gene expression data became possible. The search comprised 5,000 independent runs of our alternative stochastic search algorithm, generating 5,000 near-optimal gene signatures and 5,000 corresponding near-optimal partitionings of the groups. By examining how often a sample was assigned to the primary and metastatic groups, we could estimate the proportion of runs in which the sample was classified as a primary or metastatic SKCM tumor. We found that nearly all the clinically classified metastatic tumors were consistently assigned to the metastatic tumor group in 90-100% of the runs, whereas the clinically classified primary SKCM tumors were often reassigned to the metastatic tumor group, in proportions ranging from 2% to 80%. This result suggests that the gene expression profiles of many primary tumors resemble those of metastatic tumors to various degrees. Gene ontology analysis of the 500 most frequently selected genes (those appearing most often in the 5,000 gene signatures) suggested that the top-ranked genes are enriched in ectoderm and epidermis development, epithelial and epidermal cell differentiation, keratinization, and regulation of inflammatory and defense responses. In summary, we have developed a unique computational method that not only assesses the relevance of genes in sample classification but also classifies each sample probabilistically to uncover the true tumor status. Our analysis may provide useful information for treatment and disease management. Citation Format: Yuanyuan Li, Juno Krahn, Leping Li. Assessing the similarity and dissimilarity between primary and metastatic melanoma using gene expression data. [abstract]. In: Proceedings of the AACR Special Conference on Advances in Melanoma: From Biology to Therapy; Sep 20-23, 2014; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(14 Suppl):Abstract nr A17.