The document proposes two approaches to better exploit source-side monolingual data in neural machine translation when parallel corpora are scarce: (1) a self-learning method that generates synthetic parallel data to augment the training set, and (2) a multi-task learning framework that jointly trains two NMT models, one predicting the translation and the other predicting the reordered source sentence. Experiments on Chinese-English translation show significant BLEU improvements over a strong attention-based NMT baseline, with gains growing as more monolingual data is added.
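The self-learning step can be sketched as a simple data-augmentation loop: a baseline model translates the monolingual source sentences, and the resulting synthetic pairs are mixed with the real parallel corpus before retraining. In this minimal sketch, `baseline_translate` is a hypothetical stand-in for a trained NMT decoder, not an API from the paper:

```python
def self_learning_augment(parallel, monolingual, translate):
    """Return the real parallel pairs plus synthetic pairs whose
    target sides come from the baseline model's translations."""
    synthetic = [(src, translate(src)) for src in monolingual]
    return parallel + synthetic

# Hypothetical placeholder for a trained NMT model's decode step.
def baseline_translate(src):
    return "<hyp> " + src

parallel = [("zh sentence 1", "en sentence 1")]
monolingual = ["zh sentence 2", "zh sentence 3"]
augmented = self_learning_augment(parallel, monolingual, baseline_translate)
# The augmented corpus now holds one real and two synthetic pairs.
```

In practice the synthetic pairs are typically down-weighted or the synthetic target side is frozen during training, since the machine-generated translations are noisier than the human references.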