This paper proposes effective communication strategies for Wireless Body Area Networks (WBANs), which consist of wearable or implantable sensor nodes placed in, on, or around the human body to send body vitals to a sink. The main research challenges in formulating a communication strategy are limited energy resources and varying link conditions. Although energy-harvesting sensor nodes partially address the problem of energy efficiency, finding an optimal balance between the energy constraints of the nodes and communication reliability remains challenging. Since data loss in such networks may prove fatal, it is important to investigate the problem prior to deployment and to devise effective communication strategies for initiating post-deployment operations. Hence, in this paper, the nodes are stochastically modeled as a Markov Decision Process. Because the nodes must adapt to changing ambient conditions through exploration and exploitation, a modified Q-learning technique is proposed for post-deployment decision-making by the WBAN nodes under dynamic ambient conditions. The effectiveness of the proposed strategy is validated through extensive simulation and compared with state-of-the-art works. The performance of the proposed approach is also verified with a real-life dataset. The results demonstrate that the proposed scheme achieves around 90% successful data delivery to the sink in the real-life scenario.
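As a rough illustration of the exploration and exploitation trade-off mentioned above, the following minimal sketch shows the standard tabular Q-learning update with an epsilon-greedy action choice. It is not the modified Q-learning technique proposed in this paper. The two link states, the two actions, the reward values, and the parameters `alpha`, `gamma`, and `epsilon` are all hypothetical and chosen only for illustration.

```python
import random


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore: pick a random action
    return max(range(len(q_row)), key=lambda a: q_row[a])  # exploit: best known action


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update (assumed here, not the paper's modified rule)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical toy setup: states 0 = poor link, 1 = good link;
# actions 0 = hold data, 1 = transmit to the sink.
q_table = [[0.0, 0.0] for _ in range(2)]
rng = random.Random(0)

# One illustrative step: transmitting on a good link earns an assumed reward of 1.0,
# after which the link is assumed to degrade to the poor state.
action = choose_action(q_table[1], epsilon=0.1, rng=rng)
q_update(q_table, state=1, action=1, reward=1.0, next_state=0)
```

In this sketch a node would lower `epsilon` over time so that it explores less once its value estimates stabilise. How the proposed scheme handles that schedule, and how energy state enters the reward, is not specified here.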