We present the first-generation global tomographic model constructed based on adjoint tomography, an iterative full-waveform inversion technique. Synthetic seismograms were calculated using GPU-accelerated spectral-element simulations of global seismic wave propagation, accommodating effects due to 3D anelastic crust & mantle structure, topography & bathymetry, the ocean load, ellipticity, rotation, and self-gravitation. Frechet derivatives were calculated in 3D anelastic models based on an adjoint-state method. The simulations were performed on the Cray XK7 named ‘Titan’, a computer with 18,688 GPU accelerators housed at Oak Ridge National Laboratory. The transversely isotropic global model is the result of 15 tomographic iterations, which systematically reduced differences between observed and simulated three-component seismograms. Our starting model combined 3D mantle model S362ANI (Kustowski et al. 2008) with 3D crustal model Crust2.0 (Bassin et al. 2000). We simultaneously inverted for structure in the crust and mantle, thereby eliminating the need for widely used ‘crustal corrections’. We used data from 253 earthquakes in the magnitude range 5.8~ ≤ ~Mw~ ≤ ~7.0. For the first 12 iterations, we combined ∼30 s body-wave data with ∼60 s surface-wave data. The shortest period of the surface waves was gradually decreased, and in the last three iterations we combined ∼17 s body waves with ∼45 s surface waves. We started using 180 min-long seismograms after the 12th iteration and assimilated minor- and major-arc body and surface waves. The 15th iteration model features enhancements of well-known slabs, an enhanced image of the Samoa/Tahiti plume, as well as various other plumes and hotspots, such as Caroline, Galapagos, Yellowstone, and Erebus. Furthermore, we see clear improvements in slab resolution along the Hellenic and Japan Arcs, as well as subduction along the East of Scotia Plate, which does not exist in the starting model. Point-spread function tests demonstrate that we are approaching the resolution of continental-scale studies in some areas, for example underneath Yellowstone. This is a consequence of our multi-scale smoothing strategy, in which we define our smoothing operator as a function of the approximate Hessian kernel, thereby smoothing gradients less wherever we have good ray coverage, such as underneath North America.