Always-on machine learning models require a very low memory and compute footprint. Their restricted parameter count limits the model's capacity to learn, and the effectiveness of the usual training algorithms in finding the best parameters. Here we show that a small convolutional model can be better trained by first refactoring its computation into a larger, redundant multi-branched architecture. Then, for inference, we algebraically re-parameterize the trained model into the single-branched form with fewer parameters, for a lower memory footprint and compute cost. Using this technique, we show that our always-on wake-word detector model, RepCNN, provides a good trade-off between latency and accuracy during inference. RepCNN re-parameterized models are 43% more accurate than a uni-branch convolutional model while having the same runtime. RepCNN also meets the accuracy of complex architectures like BC-ResNet, while having 2x lower peak memory usage and 10x faster runtime.
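
The sketch below illustrates the algebraic re-parameterization idea in isolation, not RepCNN's actual branch structure: because convolution is linear in its kernel, the weights of parallel trained branches (here, an assumed 3x3 branch plus a 1x1 branch) can be summed into a single kernel, so single-branch inference reproduces the multi-branch output exactly.

```python
# Minimal sketch of branch fusion; branch shapes are illustrative assumptions,
# not the paper's architecture.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 8, 16, 16)      # (batch, channels, height, width)

# Weights of a trained multi-branch block: a 3x3 branch and a 1x1 branch.
w3 = torch.randn(8, 8, 3, 3)
w1 = torch.randn(8, 8, 1, 1)

# Multi-branch inference: run both branches and add their outputs.
y_multi = F.conv2d(x, w3, padding=1) + F.conv2d(x, w1, padding=0)

# Re-parameterize: zero-pad the 1x1 kernel to 3x3 and sum the kernels,
# yielding one single-branch convolution with the same output.
w_fused = w3 + F.pad(w1, [1, 1, 1, 1])
y_single = F.conv2d(x, w_fused, padding=1)

print(torch.allclose(y_multi, y_single, atol=1e-5))  # True
```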