One of the biggest limitations of reinforcement learning is poor sample efficiency: reinforcement learning agents tend to require an excessively large number of episodes to train. These episodes can be expensive, especially when they must be collected online. Model-Based Reinforcement Learning (MBRL) and offline learning are two approaches to mitigating this problem; MBRL has been shown to improve sample efficiency, while offline learning can reduce the number of online episodes needed by substituting them with less expensive offline ones. However, using these two methods together is more challenging, raising issues such as training on incomplete data distributions. We explore these challenges by testing different combinations of offline and online model-learning on the task of learning the legal moves of the board game Othello. During this process, we encounter the additional challenge of purely positive reinforcement, where offline episodes provide information only about legal moves. To address this problem, we propose a method called synthetic negative reinforcement, which uses pre-existing agent knowledge to compensate for the lack of information about illegal moves. Our results demonstrate the efficacy of offline learning with synthetic negative reinforcement on robust distributions of offline data, with agents achieving greater than 97% accuracy in predicting the legality of moves. We also demonstrate the obstacle that skewed distributions pose to offline model-learning.
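
The abstract does not specify how synthetic negative reinforcement is implemented; as a purely illustrative sketch, and not the authors' actual method, one way to realize the idea is to let a pre-existing model flag moves it is already confident are illegal and treat those as negative labels alongside the single positive label provided by an offline episode. The names below (`synthetic_negatives`, `prior_model`, `threshold`) are hypothetical.

```python
import numpy as np

def synthetic_negatives(board, played_move, prior_model, threshold=0.05):
    """Illustrative sketch of synthetic negative reinforcement for one
    offline Othello position.  The offline episode only tells us that
    `played_move` was legal; for every other square we query a
    pre-existing model (`prior_model`, assumed to return P(legal) for
    each of the 8x8 squares) and keep the squares it is already
    confident are illegal as synthetic negative labels."""
    probs = prior_model(board)                 # shape (8, 8): P(move is legal)
    negatives = []
    for r in range(8):
        for c in range(8):
            if (r, c) == played_move:
                continue                       # the played move is the positive label
            if probs[r, c] < threshold:        # prior model is confident this square is illegal
                negatives.append((r, c))
    return negatives

# Toy usage with a stand-in "prior model" that only trusts occupied squares to be illegal.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    board = rng.integers(-1, 2, size=(8, 8))   # -1 / 0 / +1 encoding of an Othello board
    fake_prior = lambda b: np.where(b == 0, 0.5, 0.01)  # uncertain only on empty squares
    print(synthetic_negatives(board, played_move=(2, 3), prior_model=fake_prior)[:5])
```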