We present a general theory for computing information transfers in nonlinear stochastic systems driven by deterministic forcings and additive and/or multiplicative noises. The theory holds under quite general boundary conditions in state space: closed, periodic, or with a vanishing probability density function. It extends to nonlinear cases the Liang-Kleeman (LK) framework of causality inference, which is based on information transfer across system variables and is presented in detail in (Liang, 2016. Information flow and causality as rigorous notions ab initio. Phys. Rev. E, 94: 052201. DOI: 10.1103/PhysRevE.94.052201). We introduce the ‘Causal Sensitivity Method’ (CSM), an effective method for computing the rates of Shannon entropy transfer (RETs) between selected causal and consequential variables; it relies on estimating from data the conditional expectations of the system forcings and their derivatives. Those expectations are approximated by nonlinear differentiable regressions, which makes computing RETs much easier and more robust than with the ‘brute-force’ approach, which requires numerical integration over the state space and knowledge of the multivariate probability density function of the system. The CSM also applies when no model equations are available: a nonlinear model of the consequential variables is first fitted from data, and the CSM is then applied to the fitted model. RETs are decomposed into deterministic and stochastic components, and are compensated by the self-generation of entropy under ergodic conditions. Moreover, RETs are decomposed into sums of single one-to-one RETs plus synergetic terms (of purely nonlinear nature) accounting for the joint causal effect of groups of variables. State-dependent (or specific) RET formulas are also introduced, showing where in state space the entropy transfers and local synergies are most relevant.
RET estimates are compared across three approaches: 1) the expensive ‘brute-force’ probability-density-based approach (AN), taken as the benchmark; 2) the CSM-based approach, with and/or without model fitting; and 3) the multivariate linear (ML) approach. The comparison uses two models: (i) a model derived from a potential function and (ii) the classical chaotic Lorenz system, both forced by additive and/or multiplicative noises. The analysis demonstrates that the CSM estimates are robust, cheaper, and less data-demanding than the AN reference values in the different experiments. This provides evidence of the possibilities and generalizations offered by the method (e.g. causality diagnostics between subspaces) and opens new perspectives on real-world applications.
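
As a rough illustration of what a linear baseline of this kind computes, the sketch below applies the closed-form two-variable linear estimator of the information flow from the Liang-Kleeman literature to a toy system. It is not the CSM and not necessarily the multivariate estimator used in this work. The function names `liang_linear_flow` and `simulate`, the toy drift coefficients, the noise amplitude and the time step are all assumptions made for this example only.

```python
import numpy as np


def liang_linear_flow(x1, x2, dt):
    """Bivariate linear estimate of the information flow T_{2->1}.

    Uses sample covariances of (x1, x2) and of the forward-difference
    derivative of x1. Units are nats per unit time.
    """
    dx1 = (x1[1:] - x1[:-1]) / dt            # Euler forward difference of x1
    c = np.cov(np.vstack([x1[:-1], x2[:-1], dx1]))
    c11, c12, c22 = c[0, 0], c[0, 1], c[1, 1]
    c1d1, c2d1 = c[0, 2], c[1, 2]            # cov(x1, dx1), cov(x2, dx1)
    num = c11 * c12 * c2d1 - c12 ** 2 * c1d1
    den = c11 ** 2 * c22 - c11 * c12 ** 2
    return num / den


def simulate(n=200_000, dt=0.01, seed=0):
    """Hypothetical toy system (Euler-Maruyama): x2 drives x1, not vice versa.

    dx1 = (-x1 + 0.5 * x2) dt + 0.1 dW1
    dx2 = (-x2) dt            + 0.1 dW2
    """
    rng = np.random.default_rng(seed)
    noise = 0.1 * np.sqrt(dt) * rng.standard_normal((n, 2))
    x = np.zeros((n, 2))
    for k in range(n - 1):
        x1, x2 = x[k]
        x[k + 1, 0] = x1 + (-x1 + 0.5 * x2) * dt + noise[k, 0]
        x[k + 1, 1] = x2 + (-x2) * dt + noise[k, 1]
    return x[:, 0], x[:, 1]


dt = 0.01
x1, x2 = simulate(dt=dt)
t21 = liang_linear_flow(x1, x2, dt)   # flow from x2 to x1
t12 = liang_linear_flow(x2, x1, dt)   # flow from x1 to x2
print(f"T(2->1) = {t21:.3f}, T(1->2) = {t12:.3f}")
```

In this toy setup the estimated flow from x2 to x1 should be clearly nonzero, while the reverse flow should be close to zero, reflecting the one-way coupling. A linear estimator like this cannot capture the synergetic, purely nonlinear terms described above, which is the motivation for the CSM.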