Abstract: ClariNet produces high-quality speech, but it relies on a tricky knowledge-distillation training method that needs auxiliary losses and has large computational requirements. In this work, we apply ...