Novel user clustering and pilot assignment schemes are proposed for non-orthogonal multiple access (NOMA) aided multicell massive multiple-input multiple-output systems. The proposed designs leverage the spatial covariance matrices of the users and are applicable to both overloaded and underloaded cases. Users having linearly dependent covariance matrices are grouped into the same cluster and are assigned orthogonal pilots to mitigate intra-cluster pilot contamination. These pilots are reused by the users in different clusters to reduce the training overhead. The resulting inter-cluster pilot contamination is then mitigated by exploiting the asymptotically linearly independent covariance matrices of pilot-sharing users located in different clusters. The achievable downlink rates are derived by capturing the detrimental effects of practical transmission impairments and partial channel knowledge. Our analysis reveals that, for the underloaded case, the user rates are not fundamentally limited by the intra/inter-cluster pilot contamination arising from pilot reuse. We show that the achievable rate of each user increases without bound under zero-forcing precoding as the number of base-station antennas grows large. However, for the overloaded case, users that share the same spatial direction and are located within the same NOMA cluster have to share pilots. Thus, the achievable rate gains must be traded off to enable massive connectivity. Three transmit power control policies are designed to maximize the sum rate while achieving max-min user fairness by mitigating the adverse near-far effect.
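The clustering and pilot-reuse idea for the underloaded case can be illustrated with a minimal sketch. It is not the algorithm of this work, whose details are not stated here. It assumes that "linearly dependent" means one covariance matrix is a scalar multiple of another, and it tests this with a normalized Frobenius inner product (by the Cauchy-Schwarz inequality, the ratio equals one only for linearly dependent matrices). The function names `cluster_by_covariance` and `assign_pilots`, the tolerance `tol`, and the toy diagonal covariance matrices are all hypothetical.

```python
import numpy as np


def cluster_by_covariance(covs, tol=1e-6):
    """Group users whose covariance matrices are scalar multiples of each other."""
    clusters = []  # each cluster is a list of user indices
    for k, R in enumerate(covs):
        for cluster in clusters:
            R_ref = covs[cluster[0]]
            # Normalized Frobenius inner product; it equals 1 only when
            # R and R_ref are linearly dependent (Cauchy-Schwarz equality).
            corr = abs(np.vdot(R_ref, R)) / (
                np.linalg.norm(R_ref) * np.linalg.norm(R)
            )
            if corr > 1 - tol:
                cluster.append(k)
                break
        else:
            # No existing cluster matches, so this user starts a new one.
            clusters.append([k])
    return clusters


def assign_pilots(clusters):
    """Give distinct pilot indices within a cluster and reuse them across clusters."""
    pilots = {}
    for cluster in clusters:
        for pilot_idx, user in enumerate(cluster):
            pilots[user] = pilot_idx
    return pilots


# Toy example (assumed data): two spatial "directions" A and B, five users.
A = np.diag([1.0, 2.0, 3.0, 4.0])
B = np.diag([4.0, 3.0, 2.0, 1.0])
covs = [A, 2 * A, B, 0.5 * B, 3 * A]

clusters = cluster_by_covariance(covs)
pilots = assign_pilots(clusters)
num_pilots = max(len(c) for c in clusters)

print(clusters)    # users grouped by covariance direction
print(pilots)      # user index -> pilot index
print(num_pilots)  # pilots needed = size of the largest cluster
```

In this sketch the number of orthogonal pilots equals the size of the largest cluster rather than the total number of users, which is the source of the training-overhead saving. The overloaded case, in which users within a cluster must share pilots, is not covered.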