Actor-only deterministic policy gradient via zeroth-order gradient oracles in action space