U-Net Experiments — Slakh2100
Instruments: Bass | Drums | Guitar | Piano
BS-RoFormer Experiments — MUSDB18
Instruments: Bass | Drums | Vocals | Other
Welcome to the demo page for "Improving Music Source Separation with Diffusion and Consistency Refinement."
We propose an approach to music source separation that uses a generative diffusion model as a final refinement stage on top of a deterministic separator, progressively enhancing the separated sources through iterative denoising. While the diffusion refinement yields measurable quality gains, it requires multiple denoising steps at inference, which increases computational cost. To speed up inference, we apply consistency distillation, which reduces inference to a single step while maintaining quality; with two or more steps, the distilled model even surpasses the diffusion-based approach.
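To make the cost difference concrete, here is a minimal toy sketch of the two refinement modes described above. It is not the implementation behind this page. It assumes that refinement starts from a noised copy of the separator's output and that the network predicts the clean source; the page states neither. The function names, the noise schedule, and the `oracle` placeholder standing in for a trained network are all hypothetical.

```python
import numpy as np

def noise_levels(n_steps, sigma_max=0.5, sigma_min=0.01):
    """Hypothetical schedule: n_steps decreasing noise levels, then 0."""
    return np.append(np.geomspace(sigma_max, sigma_min, n_steps), 0.0)

def diffusion_refine(estimate, denoise, sigmas, rng):
    """Iterative refinement: one denoiser call per noise level (Euler updates)."""
    # Assumption: start from the separator output plus noise at the top level.
    x = estimate + sigmas[0] * rng.standard_normal(estimate.shape)
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        x0_hat = denoise(x, s_cur)            # network's guess of the clean source
        direction = (x - x0_hat) / s_cur      # slope of the denoising trajectory
        x = x + (s_next - s_cur) * direction  # move to the next, lower noise level
    return x

def consistency_refine(estimate, consistency_fn, sigmas, rng):
    """Distilled refinement: one call per entry in sigmas (a single entry = one step)."""
    x = estimate + sigmas[0] * rng.standard_normal(estimate.shape)
    x = consistency_fn(x, sigmas[0])          # jump straight to a clean estimate
    for s in sigmas[1:]:
        # Optional extra steps: re-noise to a lower level, then jump again.
        x = consistency_fn(x + s * rng.standard_normal(x.shape), s)
    return x

# Toy usage with placeholder data instead of real audio and real networks.
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))  # toy "true" source
estimate = clean + 0.1                             # toy separator output
oracle = lambda x, sigma: clean                    # placeholder for a trained network
refined_multi = diffusion_refine(estimate, oracle, noise_levels(8), rng)  # 8 calls
refined_single = consistency_refine(estimate, oracle, [0.5], rng)         # 1 call
```

In this sketch the diffusion path calls the network once per noise level, whereas the distilled path calls it once per entry in `sigmas`, so passing a single noise level corresponds to the single-step inference mentioned above.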
Crucially, our method is architecture-agnostic: applied to a custom U-Net-based separator on Slakh2100 and to the state-of-the-art BS-RoFormer model on MUSDB18, it achieves state-of-the-art results in both settings, showing that the refinement generalizes across backbone architectures.