Improving Source Extraction with Diffusion and Consistency Models

Welcome to the demo page for the paper "Improving Source Extraction with Diffusion and Consistency Models".

In this work, we investigate the integration of a score-matching diffusion model into a standard U-Net architecture for time-domain musical source separation. Since diffusion models typically suffer from slow, iterative sampling, we apply consistency distillation (CD) to accelerate sampling, bringing its speed close to that of a deterministic model with no loss in quality. Our model, trained on the Slakh2100 dataset to extract four instruments (bass, drums, guitar, and piano), shows significant improvements on objective metrics over the baseline methods.

On this page, we present separation demos for five scenarios: the deterministic model, the diffusion model, and CD with 1, 2, and 4 denoising steps.
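To illustrate how the 1-, 2-, and 4-step CD scenarios differ only in their sampling loop, here is a minimal sketch of multi-step consistency sampling conditioned on a mixture. It assumes a hypothetical distilled model `f(x, sigma, mix)` that maps a noisy source estimate at noise level `sigma` back to a clean estimate in a single call; the schedule values and the dummy model below are illustrative, not the paper's actual configuration.

```python
import numpy as np

def consistency_multistep_sample(f, mix, sigmas, rng):
    """Sketch of multi-step consistency sampling for source extraction.

    f      : distilled consistency model, f(x, sigma, mix) -> clean estimate
             (hypothetical stand-in for the distilled U-Net)
    mix    : conditioning mixture waveform, shape (channels, samples)
    sigmas : decreasing noise levels; len(sigmas) = number of denoising steps,
             e.g. [80.0] for 1-step or [80.0, 2.0] for 2-step sampling
    """
    # Start from pure noise at the highest level and denoise in one call.
    x = rng.standard_normal(mix.shape) * sigmas[0]
    x = f(x, sigmas[0], mix)
    for sigma in sigmas[1:]:
        # Each extra step re-noises the estimate to a lower level,
        # then denoises again, trading speed for quality.
        x = x + rng.standard_normal(mix.shape) * sigma
        x = f(x, sigma, mix)
    return x
```

With a list of one, two, or four `sigmas`, this same loop reproduces the three CD scenarios demoed below, each costing one network evaluation per step.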

Original


Track Mix Bass Drums Guitar Piano

Deterministic model


Track Mix Bass Drums Guitar Piano

Diffusion


Track Mix Bass Drums Guitar Piano

CD (1 step)


Track Mix Bass Drums Guitar Piano

CD (2 steps)


Track Mix Bass Drums Guitar Piano

CD (4 steps)


Track Mix Bass Drums Guitar Piano