In this paper, we explore the potential of neural networks for making space weather predictions based on near-Sun observations. A second goal is to determine the extent to which coronal polarimetric observations of erupting structures near the Sun encode sufficient information to predict the impact these structures will have on Earth. We focus on predicting the maximal southward component of the magnetic field (-Bz) inside an interplanetary coronal mass ejection (ICME) as it impacts the Earth. We use the Gibson & Low (G&L) self-similarly expanding flux rope model (Gibson & Low 1998), which allows us to consider CMEs with varying location, orientation, size, and morphology. We vary 5 parameters of the model to alter these CME properties and generate a large database of over 36k synthetic CMEs. For each model CME, we synthesize near-Sun observations, as seen by an observer in quadrature (assuming the CME is directed Earthwards), of either the three components of the vector magnetic field (“Experiment 1”) or synthetic Stokes images (“Experiment 2”). We then allow the flux rope to expand and record max(-Bz) as the ICME passes 1 AU. We conduct two separate machine learning experiments and develop two different regression-based deep convolutional neural networks (CNNs) to predict max(-Bz) from these two kinds of near-Sun input data. Experiment 1 is a proof of concept, testing whether a 3-channel CNN (hereafter CNN1), similar to those used in RGB image recognition, can reproduce the results of the self-similar (i.e. scale-invariant) expansion of the G&L model. Experiment 2 is less trivial, as the Stokes vector is not linearly related to B, and the line-of-sight integration in the optically thin corona presents additional difficulties for interpreting the signal.
The second CNN (hereafter CNN2), although resembling CNN1 of Experiment 1, has a different number of layers and a different set of hyperparameters because of the much more complicated mapping between the input and output data. We find that, given vector B as input, CNN1 can predict max(-Bz) with 97% accuracy, and that, given the Stokes vector as input, CNN2 can predict max(-Bz) with 95% accuracy, both measured in terms of the relative root squared error.
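The abstract does not spell out how the relative root squared error is computed. As a minimal sketch, assuming the common definition (the root of the summed squared prediction errors divided by the summed squared deviations of the targets from their mean), the metric could be evaluated as follows. The function name and the sample values are hypothetical, and reading the quoted accuracy as one minus this error is likewise an assumption rather than something stated in the text.

```python
import math


def relative_root_squared_error(y_true, y_pred):
    """Relative root squared error under the assumed (common) definition:
    sqrt( sum((pred - true)^2) / sum((true - mean(true))^2) ).
    A value of 0 is a perfect fit; 1 matches always predicting the mean.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the predictions
    numerator = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of the trivial "predict the mean" baseline
    denominator = sum((t - mean_true) ** 2 for t in y_true)
    if denominator == 0:
        raise ValueError("targets are constant; the metric is undefined")
    return math.sqrt(numerator / denominator)


# Hypothetical max(-Bz) values in nT, for illustration only
bz_true = [10.0, 20.0, 30.0]
bz_pred = [11.0, 19.0, 31.0]
rrse = relative_root_squared_error(bz_true, bz_pred)
print(f"relative root squared error: {rrse:.3f}")
```

Under this definition the error is normalized by the spread of the target values, so it is comparable across data sets with different max(-Bz) ranges.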