A multitask convolutional neural network (CNN) is trained to localize the instantaneous position of a motorboat throughout its transit past a wide-aperture linear array of hydrophones located 1 m above the sea floor in water 20 m deep. A cepstrogram database for each hydrophone and a cross-correlogram database for each pair of adjacent hydrophones are compiled for multiple motorboat transits. Cepstrum-based and correlation-based feature vectors, together with ground-truth source bearing and range data, form the inputs used to train three CNNs to predict the instantaneous source range and bearing for other "unseen" motorboat transits. It is shown that CNNs operating on multi-sensor cepstrum-based feature maps are able to predict the instantaneous range and bearing of a transiting motorboat, even when the source is near an endfire direction. Likewise, CNNs operating on multi-sensor generalised cross-correlation-based feature maps are able to predict the range and bearing of a transiting motorboat in the presence of interfering multipath arrivals. When compared with the cepstrum-only CNN, the cross-correlation-only CNN, and the conventional model-based method of passive ranging by wavefront curvature, the combined cepstrum and cross-correlation CNN is shown to provide superior source localization performance in a multipath underwater acoustic environment.
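To make the two feature types concrete, the following is a minimal sketch of how a single cepstrum vector (one column of a cepstrogram) and a single generalised cross-correlation vector (one column of a cross-correlogram) might be computed from one frame of hydrophone data. It is an illustration under stated assumptions, not the processing chain used in this work: the function names `real_cepstrum` and `gcc_phat` are hypothetical, the phase transform (PHAT) weighting is assumed because the weighting is not specified here, and framing, windowing, and sample-rate handling are omitted.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum of one frame: inverse FFT of the log magnitude spectrum.

    A multipath arrival delayed by D samples appears as a peak near
    quefrency index D.
    """
    spectrum = np.fft.rfft(frame)
    # Small constant avoids log(0); an illustrative choice, not a tuned value.
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)
    return np.fft.irfft(log_magnitude, n=len(frame))

def gcc_phat(x, y):
    """Generalised cross-correlation of two frames with PHAT weighting (assumed).

    Returns (lags, cc). A peak at a positive lag means y is delayed
    relative to x by that many samples.
    """
    n = len(x) + len(y)  # zero-pad so the correlation is linear, not circular
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12  # PHAT: keep phase, discard magnitude
    cc = np.fft.fftshift(np.fft.irfft(cross, n=n))
    lags = np.arange(n) - n // 2  # zero lag sits at index n // 2 after the shift
    return lags, cc
```

Stacking such vectors over successive frames would give a cepstrogram per hydrophone and a cross-correlogram per adjacent hydrophone pair, which is the kind of two-dimensional feature map a CNN can consume.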