Non-intrusive load monitoring (NILM) provides appliance-level insight into energy consumption by analyzing current and voltage data from residential smart meters, making it a promising strategy for demand-side management in power systems. However, most existing NILM techniques presuppose a known inventory of household appliances, an assumption that is often impractical because consumers regularly introduce new appliances. To address this challenge, we propose a vision transformer with an additional detection head (ViTD) that operates on V-I trajectory images. The ViT model is first trained to classify known appliances. A detection head is then added to reshape the embedded features, encouraging each known appliance category to form a distinct, compact class center. During testing, a sample is identified as a known or unknown appliance based on its proximity to these class centers. Experiments on two public datasets, PLAID and WHITED, demonstrate the effectiveness and superiority of the proposed method.
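The test-time decision described above, in which a sample is accepted as a known appliance or rejected as unknown according to its proximity to the class centers, can be illustrated with a minimal sketch. It is not the ViTD implementation: it assumes that embeddings have already been extracted, that each class center is the mean embedding of its class, that proximity is measured by Euclidean distance, and that a single fixed threshold separates known from unknown. The names `fit_class_centers`, `predict_open_set`, `UNKNOWN`, and the threshold value are hypothetical.

```python
import numpy as np

UNKNOWN = -1  # hypothetical label for samples rejected as unknown appliances


def fit_class_centers(embeddings, labels):
    """Return the sorted class ids and the mean embedding (center) of each class."""
    classes = np.unique(labels)
    centers = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, centers


def predict_open_set(embeddings, classes, centers, threshold):
    """Assign each sample to its nearest center, or UNKNOWN if it is too far away."""
    # Pairwise Euclidean distances, shape (n_samples, n_classes).
    dists = np.linalg.norm(embeddings[:, None, :] - centers[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)
    min_dist = dists[np.arange(len(embeddings)), nearest]
    # Assumption: one global distance threshold for all classes.
    return np.where(min_dist <= threshold, classes[nearest], UNKNOWN)


# Toy 2-D embeddings standing in for features of known appliances.
train_emb = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.0, 5.2]])
train_lab = np.array([0, 0, 1, 1])
classes, centers = fit_class_centers(train_emb, train_lab)

# Two samples near known centers and one far from both.
test_emb = np.array([[0.1, 0.1], [5.1, 5.0], [10.0, -10.0]])
preds = predict_open_set(test_emb, classes, centers, threshold=1.0)
print(preds.tolist())  # [0, 1, -1]
```

The more compact the known classes are in the embedding space, the tighter the threshold can be set without rejecting known appliances, which is why compact class centers matter for this kind of decision rule.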