Scalable Reinforcement Learning Policies for Multi-Agent Control