Cross-frequency interactions, a form of oscillatory neural activity, are thought to play an essential role in the integration of distributed information in the brain. Indeed, phase-amplitude interactions are believed to allow for the transfer of information from large-scale brain networks, oscillating at low frequencies, to local, rapidly oscillating neural assemblies. A promising approach to estimating such interactions is the use of transfer entropy (TE), a non-linear, information-theory-based effective connectivity measure. The conventional method involves feeding instantaneous phase and amplitude time series, extracted at the target frequencies, to a TE estimator. In this work, we propose recasting the problem of detecting directed phase-amplitude interactions as a phase TE estimation problem, under the hypothesis that estimating TE from data of the same nature, i.e., two phase time series, will improve robustness to the common confounding factors that affect connectivity measures, such as high noise levels. We implement our proposal using a kernel-based TE estimator, defined in terms of Rényi's α entropy, which has successfully been used to compute single-trial phase TE. We tested our approach on synthetic data generated by a simulation model capable of producing time series with directed phase-amplitude interactions at two given frequencies, and on EEG data from a cognitive task designed to activate working memory, a memory system whose underpinning mechanisms are thought to include phase-amplitude couplings. For the synthetic data, our proposal detected statistically significant interactions between the simulated signals at the desired frequencies and identified the correct direction of the interaction. It also displayed higher robustness to noise than the alternative methods.
The results obtained for the working memory data showed that the proposed approach encodes connectivity patterns, based on directed phase-amplitude interactions, that differentiate the cognitive load levels of the working memory task.
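
To make the idea of a kernel-based TE estimator defined through Rényi's α entropy more concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the matrix-based form of Rényi's α entropy, computed from the eigenvalues of a trace-normalized Gaussian Gram matrix, with joint entropies obtained from element-wise products of Gram matrices. The function names (`gram`, `renyi_entropy`, `kernel_te`), the Gaussian kernel, the lag-1 embedding, and the default values of `sigma`, `alpha`, and `order` are all illustrative assumptions.

```python
import numpy as np


def gram(x, sigma=1.0):
    """Gaussian Gram matrix for samples x of shape (n, d). Kernel choice is an assumption."""
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def renyi_entropy(*grams, alpha=2.0):
    """Matrix-based Renyi alpha entropy (in bits) of one or several Gram matrices.

    Several matrices are combined by element-wise product, which plays the
    role of a joint distribution. alpha must differ from 1.
    """
    joint = grams[0].copy()
    for k in grams[1:]:
        joint = joint * k
    a = joint / np.trace(joint)                      # normalize to unit trace
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # guard against tiny negatives
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)


def kernel_te(x, y, order=1, sigma=1.0, alpha=2.0):
    """Sketch of TE from x to y using past windows of length `order`."""
    n = len(y) - order
    y_now = y[order:, None]
    # Column k holds the value k + 1 steps before the current sample.
    y_past = np.stack([y[order - k - 1: order - k - 1 + n] for k in range(order)], axis=1)
    x_past = np.stack([x[order - k - 1: order - k - 1 + n] for k in range(order)], axis=1)
    k_now, k_yp, k_xp = gram(y_now, sigma), gram(y_past, sigma), gram(x_past, sigma)
    return (renyi_entropy(k_yp, k_xp, alpha=alpha)
            - renyi_entropy(k_now, k_yp, k_xp, alpha=alpha)
            + renyi_entropy(k_now, k_yp, alpha=alpha)
            - renyi_entropy(k_yp, alpha=alpha))
```

Under the proposal described above, `x` and `y` would both be phase time series. Because phases wrap around, a practical version would likely need a kernel or embedding suited to circular data, which this sketch does not address.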