Sliding Window Feature Extraction
Every 10 milliseconds, a 20-millisecond window of the signal is
used for feature extraction, so consecutive windows overlap by 10 milliseconds.
Figure: speech signal over a 10-millisecond interval grid
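
The windowing scheme can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the 16 kHz sample rate, the synthetic sine-wave input, the function names `frame_signal` and `log_energy`, and the choice of log energy as the per-window feature are all assumptions made for the example. Only the 20-millisecond window and 10-millisecond step come from the slide.

```python
import math


def frame_signal(samples, sample_rate, window_ms=20.0, hop_ms=10.0):
    """Cut `samples` into overlapping windows.

    A new window starts every `hop_ms` milliseconds and spans `window_ms`
    milliseconds. A trailing partial window is dropped (an assumption;
    zero-padding is another common choice).
    """
    window_len = int(round(sample_rate * window_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if window_len <= 0 or hop_len <= 0:
        raise ValueError("window and hop must span at least one sample")

    frames = []
    start = 0
    while start + window_len <= len(samples):
        frames.append(samples[start:start + window_len])
        start += hop_len
    return frames


def log_energy(frame):
    """Placeholder feature: log of the summed squared amplitude."""
    energy = sum(x * x for x in frame)
    return math.log(energy + 1e-10)  # small offset avoids log(0) on silence


# Assumed setup: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
signal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

frames = frame_signal(signal, sample_rate)
features = [log_energy(f) for f in frames]

print(len(frames), len(frames[0]))
```

At the assumed sample rate each window holds 320 samples and the step is 160 samples, so the second half of one window is the first half of the next. A real front end would typically replace `log_energy` with a richer feature vector computed per window.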