Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees