Learning of Sub-optimal Gait Controllers for Magnetic Walking Soft Millirobots