Audio indexing for speech skimming
Speaker A
Speaker B
Speech signal gets indexed: