What I believe you are trying to measure (by stating _divergence_) is the **PESQ**, Perceptual Evaluation of Speech Quality, of each file. It is standardized as ITU-T Recommendation P.862 (02/01) <
Several projects implement what you are looking for. For example <
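
To give a rough idea of what scoring one file against its reference might look like, here is a minimal Python sketch. It assumes 16-bit mono WAV files and the third-party `pesq` package from PyPI together with NumPy, which may not be the project linked above. The helper names `read_mono_pcm16` and `pesq_score` and the `scorer` parameter are hypothetical, introduced only for this illustration.

```python
import sys
import wave
from array import array


def read_mono_pcm16(path):
    """Return (sample_rate, samples) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM: " + path)
        rate = wf.getframerate()
        samples = array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
    # WAV data is little-endian; swap on big-endian hosts.
    if sys.byteorder == "big":
        samples.byteswap()
    return rate, samples


def pesq_score(ref_path, deg_path, mode="nb", scorer=None):
    """Score a degraded file against its reference.

    `scorer` is a hypothetical hook taking (rate, ref, deg, mode); when it
    is None, the third-party `pesq` package is assumed to be installed.
    """
    ref_rate, ref = read_mono_pcm16(ref_path)
    deg_rate, deg = read_mono_pcm16(deg_path)
    if ref_rate != deg_rate:
        raise ValueError("both files must share one sample rate")
    if ref_rate not in (8000, 16000):
        raise ValueError("sample rate must be 8000 or 16000 Hz")
    if mode == "wb" and ref_rate != 16000:
        raise ValueError("wide-band mode needs 16000 Hz input")

    if scorer is None:
        # Imported lazily so the helpers above work without these packages.
        import numpy as np
        from pesq import pesq

        def scorer(fs, r, d, m):
            return pesq(fs,
                        np.asarray(r, dtype=np.int16),
                        np.asarray(d, dtype=np.int16),
                        m)

    return scorer(ref_rate, ref, deg, mode)
```

With that package installed, a call such as `pesq_score("reference.wav", "degraded.wav")` (both file names are placeholders) would return one score per degraded file, and those scores can then be compared across files. Files at other sample rates would need resampling first.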