Direct reinforcement learning, spike time dependent plasticity and the BCM rule