We report a new Multi-GPU (Graphical Processor Unit) implementation of real-time time-dependent Auxiliary Density Functional Theory (DFT) for simulations of attosecond electronic dynamics in molecular systems subjected to strong perturbations. Our code relies on the Kohn-Sham formalism of DFT and has been implemented in the deMon2k Fortran code. We expand single-particle wave functions (i.e molecular orbitals) as linear combinations of Gaussian-type-orbitals centered on atoms. The density matrix propagation is carried out on GPU while the Kohn-Sham potential is operated on CPUs (Central Processor Unit) with the help of variationally fitted densities. We propose a parallelization strategy using the MAGMA/CUDA libraries to calculate the exponential of dense Hermitian matrices entering the mathematical definition of the propagator, here using Taylor expansions. We report performance benchmarks on water droplets and on fullerenes (C50 to C540). They show a clear advantage of GPU over CPU (using the Scalapack library). The benchmarks also show the benefit of using more than one GPU for systems comprised of up to more than 10,000 basis functions. There, a speed-up of almost 40 between pure 40 CPU and four 4 GPU is obtained. Attosecond electron dynamics simulation in molecular systems comprised of several thousands of electrons becomes amenable to routine simulations in our code. We assess the accuracy of the GPU implementation considering various applications, namely, the calculation of extreme UV absorption spectra with non-Hermitian dynamics, the response of C180 to an electric perturbation, and finally the irradiation of a DNA/protein complex by a 0.4 MeV proton. The results demonstrate the robustness of the implementation. This work also paves the way for future even more efficient implementations. Program SummaryProgram title: deMon2k-RT-TDDFTCPC Library link to program files: https://doi.org/10.17632/whwc5zkst5.1Developer's repository link: DOI: 10.5281/zenodo.8301468Licensing provisions: CeCILL Free Software License Agreement v2.1Programming language: FortranExternal routines/libraries: MPI, BLAS/LAPACK, CUDA, MAGMASupplementary material: User manual, installation instructions, test cases. Nature of problem: Electron dynamics within molecular systems is amenable to simulation with Real-Time Time-Dependent Density Functional Theory (RT-TD-DFT). The methodology consists in propagating on the attosecond time scale the single particle electronic wave functions on the underlying Kohn-Sham potential [1,2]. In deMon2k-RT-TDDFT, the propagation is coupled to the Auxiliary DFT framework for fast evaluation of the potential [3]. A remaining computational bottleneck is the matrix exponential calculation of the Hermitian Hamiltonian matrix. This task is achieved in deMon2k via diagolanization or expansions formulas (Taylor, Chebychev or Baker-Campbell-Hausdorff). The operation is cumbersome on central-processor-unit machines for large matrices (e.g. > 5000 × 5000), which corresponds to molecular systems comprising around 2,000 electrons. Alternatives are needed to overcome this bottleneck.Solution method: We propose to carry out the matrix exponentiation on graphical-processor-units thanks to the MAGMA[4]/CUDA libraries. The code permits a drastic reduction of the computational cost, opening the attosecond dynamic simulations of large molecular systems comprised thousands of electrons with controlled accuracy. The method is tested here for Taylor expansions. Additional comments including restrictions and unusual features: The code described in this article opens the realization of RT-TD-ADFT simulations. The code is linked to the Master version 6.1.6 of deMon2k (http://www.demon-software.com), which must be pre-installed by the user.