Combining Reinforcement Learning and Diffusions via Weighted Maximum Likelihood Estimation