Seismic facies analysis is crucial in hydrocarbon exploration and development. Traditional machine learning approaches typically require manual attribute selection and offer little interpretability. We propose an interpretable framework, multi-attribute masking contrastive learning (MAMCL), designed to adaptively select, explore, and aggregate seismic attributes for seismic facies analysis. The MAMCL framework comprises a depthwise CNN module for feature extraction and an iTransformer module for feature aggregation. Based on the assumption that different attributes computed on the same seismic sample share common information associated with the same geologic facies, we formulate an unsupervised contrastive learning strategy to pre-train the MAMCL framework and refine the attributes. This pre-training encourages the network to extract and integrate highly correlated attribute features by enhancing the expression of commonalities within the same sample, and it implicitly increases the distance between features of different categories by differentiating the expressions of different samples. The refined features then only need to be passed to a simple clustering algorithm, such as K-Means, to achieve seismic facies classification. MAMCL requires neither labels nor manual attribute selection, and it uses the self-attention mechanism of the iTransformer to compute adaptive attribute weights, which facilitates interpretability analysis. We applied the MAMCL framework to both unlogged turbidite channel systems in the Canterbury Basin, New Zealand, and the logged Chengdao area in the Bohai Bay Basin, China, achieving reliable classification results and providing interpretability analysis.
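The contrastive pre-training idea described above can be illustrated with a minimal sketch. It assumes a generic InfoNCE-style loss, which is a common choice for contrastive learning but is not confirmed here to be the exact MAMCL objective. The function names `l2_normalize` and `info_nce`, the `temperature` value, and the toy two-dimensional features are all hypothetical; in practice the features would come from the depthwise CNN and iTransformer modules rather than being written by hand.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss between two views of the same batch of samples.

    Row i of view_a and row i of view_b are assumed to be features of two
    different attributes computed on the same seismic sample (a positive
    pair). Every other row of view_b acts as a negative for row i of view_a.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of sample i (view A) to every sample in view B.
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of picking the matching sample.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy features: three samples, two attribute views each (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
view_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]

aligned = info_nce(view_a, view_b)            # views paired by sample
mismatched = info_nce(view_a, view_b[::-1])   # pairing deliberately broken
```

In this toy case the loss is lower when the two views are paired by sample than when the pairing is broken. Minimizing it therefore pulls together features of different attributes from the same sample and pushes apart features from different samples. Once features are trained this way, they could be handed to any standard K-Means implementation for clustering.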