Objective: Mendelian randomisation (MR) uses genetic variants as instruments to estimate the causal effect of an exposure on an outcome. Specific conditions must be satisfied for a genetic variant to be a valid instrumental variable (IV). Pleiotropy is one cause of invalid IVs and erroneous causal estimates. We applied natural language processing (NLP) to multidimensional GWAS datasets enriched with gene-expression information to select appropriate IVs for MR analysis. The selected set of IVs was then used in MR analysis to investigate the causal effect of blood pressure (BP) on cardiovascular outcomes (ischaemic stroke, atrial fibrillation (AF), coronary artery disease (CAD), and chronic kidney disease (CKD)). Design and method: We curated a corpus of data by integrating GWAS results, gene expression, single nucleotide polymorphism (SNP) proxies and proximate genes for 2955 phenotypes. We used chromosome cytobands to partition the genome into documents, in which SNPs, genes, tissues and phenotype domains are treated as the terms. We then calculated term frequency-inverse document frequency (TF-IDF) for all terms in the corpus and identified the cytobands with the highest TF-IDF for BP. Independent non-pleiotropic SNPs associated with BP in the prioritised cytogenetic bands were selected as IVs for two-sample MR analysis. We performed inverse-variance weighted (IVW) MR and MR-Egger analyses using GWAS meta-analysis systolic BP (SBP) estimates and the following outcomes: ischaemic stroke (N = 440328), AF (N = 1030836), CAD (N = 547261), CKD (N = 117165). Analyses were performed on the MR-Base platform. Results: We identified 10 non-pleiotropic, BP-associated SNPs from the 10 cytobands with the highest TF-IDF scores for BP across the whole genome.
Two-sample MR analyses showed a significant causal relationship between genetically determined higher SBP and a higher risk of ischaemic stroke and AF, but not of CAD or CKD (ischaemic stroke: IVW = 0.03 ± 0.01, P = 0.006; MR-Egger = 0.05 ± 0.02, P = 0.09; AF: IVW = 0.03 ± 0.006, P = 9.017E-07; MR-Egger = 0.04 ± 0.01, P = 0.01) (Figure). Conclusions: Our joint NLP and MR analyses allowed selection of a parsimonious set of instrumental variable SNPs for MR analyses and may be an option for high-throughput analyses across multiple traits.
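
The two computational steps named in the design, TF-IDF ranking of cytoband "documents" and the IVW estimate, can be sketched as below. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the abstract does not specify the TF-IDF variant, so the sketch assumes relative term frequency and an unsmoothed logarithmic inverse document frequency, and it uses the standard fixed-effect IVW formula rather than the MR-Base implementation. The function names (`tf_idf`, `rank_cytobands`, `ivw_estimate`), the band labels and all numeric inputs are hypothetical toy values.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Score every term in every document.

    documents: dict mapping a cytoband label to its list of terms
    (SNPs, genes, tissues, phenotype domains).
    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for terms in documents.values():
        doc_freq.update(set(terms))  # count each term once per document

    scores = {}
    for band, terms in documents.items():
        counts = Counter(terms)
        total = len(terms)
        scores[band] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


def rank_cytobands(documents, term):
    """Return (cytoband, score) pairs for one term, highest score first."""
    scores = tf_idf(documents)
    ranked = [(band, s[term]) for band, s in scores.items() if term in s]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Each SNP contributes the ratio beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


# Toy corpus with made-up band labels and terms (not data from the study).
toy_documents = {
    "band_A": ["BP", "BP", "snp_1", "gene_1"],
    "band_B": ["BP", "lipids", "snp_2", "gene_2"],
    "band_C": ["lipids", "snp_3", "gene_3", "gene_3"],
}
print(rank_cytobands(toy_documents, "BP"))

# Toy per-SNP summary statistics (made up for illustration only).
print(ivw_estimate([0.1, 0.2], [0.02, 0.04], [0.01, 0.01]))
```

Note that with unsmoothed IDF, a term present in every cytoband scores zero everywhere, so a production version would likely need a smoothed variant; which one the study used is not stated.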