5G networks aim to orchestrate services across multiple administrative domains through the concept of federation. In this paper, we explore the federation feature of a platform for 5G transport networks supporting vertical services. We then formulate the decision problem that directly impacts the revenue of 5G administrative domains, and we propose a Q-learning algorithm as a solution. The simulation results show near-optimal profit maximization and indicate that a well-trained Q-learning algorithm can outperform the intuitive ‘greedy’ approach in a realistic scenario.