ES-1 technical question re: sample rate

gmeredith · Post by **gmeredith** » Thu May 08, 2008 12:45 am

This question may be best answered by Korg technical dept, but please, any techs reading this feel free to comment.

Although the sample recording rate of the ES-1 mk 1 is fixed at 32k, is the sample PLAYBACK rate variable? Meaning, when you tune up a recorded sample in the ES-1, does that mean that the sample rate on playback is increased in order to produce the higher pitched note? Or does the PLAYBACK rate remain constant at 32k, and DSP processing is used to get different tuning?

In this instance, I'm not talking about applying an effect, such as the pitch shift or stretch effect. I'm talking about what happens to the playback sample rate when a sample is simply tuned up in pitch.

If it does change, what is the mechanism by which it does it? Does the processor clock increase in speed etc? And if it does, does the anti-aliasing filter move with the change in playback rate, or does it stay FIXED at half the 32k sample rate = 16k while the pitch moves up?

This question is in relation to a phenomenon I have observed, which seems to enable you to get "apparent" better sample quality out of the ES-1 (or for that matter, any sampling device).

I'll give an example using the ES-1 and a record player and a 7" 45rpm single as a demonstration, but it could also be a computer or a tape player or another synth/sampler, but the important thing is that it needs to be able to vary the playback speed of the sample, tape or record.

Put the 7" 45rpm single you want to sample on the record player, but set the record player to play at 33rpm - so that it plays the record slower and at lower pitch.

Now sample the record into the ES-1. Play it back in the ES-1 to confirm it sounds the same as the 45rpm record played at 33rpm. Now tune the sample part up in the ES-1 until it now sounds the same pitch as the record playing at the correct speed of 45rpm. You should notice that the sample sounds better than a normal ES-1 sample in the high frequencies, such as hi hats and cymbals.

Test this. Set the record at its proper 45rpm speed and sample it onto another sampling part as per the normal way of sampling. It will already be at the proper pitch as the record, so don't tune it up. Now play that sample side by side with the first sample with the pitch adjustment. The first "pitched" sample will sound better than this normally recorded sample in the high frequencies.
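As a rough worked example of the arithmetic behind this test (a sketch only: it assumes the turntable's slow setting is the standard 33 1/3 rpm, and it says nothing about how the ES-1 actually behaves on playback), the following computes how far the slowed recording has to be tuned up and what the 16 kHz recording limit corresponds to at the record's correct pitch. The names `speed_ratio`, `semitones_up` and `captured_limit_hz` are made up for illustration.

```python
import math

SLOW_RPM = 100.0 / 3.0    # assumption: the "33" setting is 33 1/3 rpm
FAST_RPM = 45.0
SAMPLE_RATE_HZ = 32000.0  # ES-1 recording rate, as stated in the post

# How much faster the record runs at its correct speed.
speed_ratio = FAST_RPM / SLOW_RPM

# The same ratio expressed as a pitch interval in semitones.
semitones_up = 12.0 * math.log2(speed_ratio)

# Highest frequency a 32 kHz recording can hold, and what that limit
# corresponds to once the material is back at its correct pitch.
nyquist_hz = SAMPLE_RATE_HZ / 2.0
captured_limit_hz = nyquist_hz * speed_ratio

print(f"speed ratio: {speed_ratio:.3f}")
print(f"tune up by about {semitones_up:.2f} semitones")
print(f"16 kHz in the slowed recording is {captured_limit_hz:.0f} Hz at correct pitch")
```

The interval is not a whole number of semitones, so how closely the pitch can be matched depends on how fine the sampler's tuning steps are. Whether anything above 16 kHz can actually come back out depends on whether the output rate rises with the tuning, which is exactly the question being asked here.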

Is this really the case, or am I just imagining it? I've discussed it with various people, and many have said that you can't change the playback sample rate of the ES-1 by tuning a sample up, and so it is a false scenario. All that I know is that it seems to work with all the samplers I have, especially my Casio SK-8 Lo-fi sampler, because it has a low sample rate (9.3k) and the results are very noticeable. They are claiming that the "apparent" increase in high frequencies is from distortion. It doesn't sound like distortion to me, harmonic or otherwise. The sound sounds cleaner to me.

Just for reference, here was the discussion that took place:

http://launch.groups.yahoo.com/group/el ... sage/11410

Cheers, graham

fac · Post by **fac** » Thu May 08, 2008 1:27 am

I don't think the output sample rate is variable, since that would be too difficult to manage (imagine pitch-shifting each of the nine parts to a different pitch - one would have to combine nine different signals at different sample rates).

This is how it's usually done: suppose that both your sample and the output sample rate are 32 kHz. Suppose also that your sample corresponds to a C note.

Now, suppose your sample has a length of N samples; that is, N/32000 seconds. Suppose also that the output buffer is out[t].

Now, in order to play back the sample x[t] at its original pitch (speed), you would simply do something like:

out[t] = x[t % N];

where % represents the modulo operation (the remainder of the division of t by N).

However, if you want to play the sample one octave higher (double the speed), you would do something like

out[t] = x[2t % N];

Note that the output sample rate is still 32 kHz; however, you're now taking every other sample from x[t], which is roughly equivalent to playing it at twice the sample rate (not exactly, since your bandwidth is not really increased).

Now, if you want to transpose one octave lower, you could do this:

out[t] = x[t/2 % N];

In general, if you want to transpose the sample by k semitones, you would do something like:

m = pow(2, k/12);
out[t] = x[mt % N];

where pow(a,b) is a to the b-th power. Note that if the number of semitones k is 12, then m = 2, and if k = -12, then m = 1/2.
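For a quick feel for the numbers, here is a small sketch of that formula in Python. The function name `semitone_ratio` is hypothetical; it only restates m = 2^(k/12) from above, using floating-point division so that fractional exponents are kept.

```python
def semitone_ratio(k: float) -> float:
    """Read-speed multiplier m for a transposition of k semitones."""
    return 2.0 ** (k / 12.0)

# A few transpositions: octaves give exact powers of two,
# other intervals give irrational multipliers.
for k in (-12, -7, 0, 7, 12, 24):
    print(f"{k:+3d} semitones -> m = {semitone_ratio(k):.4f}")
```

Only whole octaves give a step that lands exactly on stored samples every time; every other interval produces fractional read positions.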

One problem here is that mt is not necessarily an integer. The easy solution is to simply truncate or round mt to the nearest integer:

out[t] = x[floor(mt) % N];

but that may produce artifacts. A better solution is to interpolate between the samples x[floor(mt) % N] and x[floor(mt + 1) % N]. Linear interpolation is the easiest and probably the most widely used, but one usually obtains better results (fewer artifacts) with higher-order interpolation techniques (quadratic, cubic, splines, etc).
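The scheme described above can be sketched in a few lines of Python. This is a minimal illustration of looped playback with linear interpolation at a fixed output rate, not the ES-1's actual code (which is not known here); `play_transposed` and its parameters are hypothetical names.

```python
import math

def play_transposed(x, k, num_out):
    """Read the looped sample x, transposed by k semitones, into num_out output frames."""
    n = len(x)
    m = 2.0 ** (k / 12.0)           # read step per output frame
    out = []
    for t in range(num_out):
        pos = (m * t) % n           # fractional read position, wrapped to the loop
        i = int(math.floor(pos))    # stored sample just before the position
        frac = pos - i              # how far we are toward the next sample
        a = x[i]
        b = x[(i + 1) % n]          # next sample, wrapping at the loop end
        out.append(a + frac * (b - a))
    return out

ramp = [0.0, 1.0, 2.0, 3.0]
print(play_transposed(ramp, 12, 4))    # octave up: every other sample
print(play_transposed(ramp, -12, 4))   # octave down: values in between samples
```

The output list always advances one frame per step regardless of k, which is the sense in which the output rate stays fixed while the read rate changes.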

I hope that helps. I think I was a bit technical there.

fac · Post by **fac** » Thu May 08, 2008 1:39 am

gmeredith wrote:They are claiming that the "apparent" increase in high frequencies is from distortion. It doesn't sound like distortion to me, harmonic or otherwise. The sound sounds cleaner to me.

Let me think about this, but I think your friends are right. Pitch-shifting always introduces distortion, so technically, recording a sample at low speed and then digitally pitch-shifting it to its normal speed should usually result in a lower-quality version than simply recording the sample at its normal speed. Now, whether it sounds better or not, that's most likely personal preference.

gmeredith · Post by **gmeredith** » Thu May 08, 2008 1:59 am

Hi Fac,

Thanks for that, I think I did understand what you meant. Basically you're describing that in order to get a higher pitch, for example an octave higher, you are in essence only reading every 2nd sample point in the memory. Is that what you meant?

If so, I would have expected a lot more distortion than I'm hearing. Obviously, the better the algorithm used by the DSP, the better the quality, but all of the rack-mount pitch shifters I've ever used add a lot more distortion to the original sound when pitching by 1 octave - and the ES-1 can pitch 2 octaves, and still sounds clean.

I know that what you described is how the actual "pitch shifter" effect in the ES-1 works, I wasn't sure if that was also how the ES-1 tunes its samples.

Cheers, graham

gmeredith · Post by **gmeredith** » Thu May 08, 2008 3:02 am

It seems that from doing a search about sampling tech stuff that the "simple playback rate change" method is how they do it:

http://www.borg.com/~jglatt/tutr/synth.htm

http://www.ibiblio.org/emusic-l/back-is ... ssue18.txt

http://www.ias.uwe.ac.uk/~a-winfie/teach2003/st_pcm.htm

http://www.fortunecity.com/emachines/e11/86/synth6.html

I'm still looking at other articles, but it seems that that is what they are saying, other than the specific pitch shift/time stretch type sampler DSP stuff.

Regarding multiple samples at different speeds, they are just saying that each voice's data is summed with the others before going to the DACs to produce multiple notes and pitches.

Jury is still out, though. Is anyone an expert in this field?

Cheers, Graham

fac · Post by **fac** » Thu May 08, 2008 4:20 am

gmeredith wrote:It seems that from doing a search about sampling tech stuff that the "simple playback rate change" method is how they do it:

http://www.borg.com/~jglatt/tutr/synth.htm

Some of the early sample-based synths, such as the Ensoniq ESQ-1 I believe, did change the playback sample rate in order to perform pitch-shifting. However, this would require a DAC for each voice.

For example, each ESQ-1 voice reads the waveform from memory and plays it back through a DAC at a certain samplerate to achieve the desired pitch. The output of the DAC goes into the analog filter and so on.

Fully digital synths do not work this way. They do everything in the digital domain and there's only a DAC at the final output (well, two DAC's if it's a stereo output). However, you cannot simply add various signals at different sample rates to perform mixing, so all the voices must be computed at the same sample rate - therefore, pitchshifting must be performed the way I described it earlier.
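To illustrate the point about mixing, here is a minimal sketch assuming each voice is simply a stored sample plus its own read step. The name `mix_voices` is hypothetical, and a truncating read is used instead of interpolation to keep it short. Every voice is computed on the same output clock, so the voices can be added frame by frame before a single DAC.

```python
def mix_voices(voices, num_out):
    """Sum several looped voices into one output stream at a common rate.

    voices: list of (sample_data, step) pairs, where step is the read
    increment per output frame (1.0 = original pitch, 2.0 = octave up).
    """
    out = [0.0] * num_out
    for data, step in voices:
        n = len(data)
        for t in range(num_out):
            # Each voice reads memory at its own speed, but writes
            # into the same output frame t as every other voice.
            out[t] += data[int(step * t) % n]
    return out

kick = [1.0, 2.0, 3.0, 4.0]
hat = [10.0, 20.0, 30.0, 40.0]
print(mix_voices([(kick, 1.0), (hat, 2.0)], 4))
```

If the two voices instead came out at different sample rates, there would be no shared frame index t to add them on, which is why a per-voice DAC (or a resampling stage) would be needed in that design.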

The references you included in your previous post seem to cover relatively old samplers (e.g., Fairlights) which may have used the variable playback rate technique, but trust me, most digital synths after the Roland D-50 use interpolation methods to perform pitch shifting at a fixed sample rate. I'm not a synth developer, but I do have some expertise in the DSP field.

If you really want to get serious about learning the techniques behind sampling, I strongly suggest you take a look at this book:

http://crca.ucsd.edu/~msp/techniques.htm

It's available for free in PDF format, and it was written by Miller Puckette, the developer of Max/MSP and PureData. In fact, all the examples in the book are in PureData, so I also suggest you download it (it's free) and follow the examples in the book. You can do some pretty interesting stuff with it. Puckette manages to explain stuff like sampling, pitchshifting and timestretching, filter design, and delay-based processing with the least amount of math involved. It's really a good book.

gmeredith · Post by **gmeredith** » Thu May 08, 2008 4:45 am

I was aware that the Roland stuff did it that way, I wasn't sure if the cheaper samplers did it that way. The reason being that as you increase the tuning pitch, the sample is shorter - like speeding up a record. The Roland type method keeps the sample playback length the same while changing the pitch, which obviously shows DSP stuff happening to create the different notes.

Cheers, graham

fac · Post by **fac** » Thu May 08, 2008 4:56 am

gmeredith wrote: The Roland type method keeps the sample playback length the same while changing the pitch, which obviously shows DSP stuff happening to create the different notes.

I'm not sure what you mean by "the Roland method". There are various techniques to perform pitch-shifting without changing the length of the sample, but that's beyond my knowledge. AFAIK, Roland did not implement those methods until they came out with the Variphrase technology, which was around 1999 or 2000. Their sample-based synths (JV's, XP's, XV's, etc) and samplers (MCx0x) probably use the usual techniques.

gmeredith · Post by **gmeredith** » Thu May 08, 2008 5:11 am

"the Roland method"

I mean what you referred to as the D50 interpolation method, sorry.

I think the distinction I am talking about here is that when some samplers produce their different notes the sample plays faster and shorter as you go up the keyboard, so that a vocal sound recorded as the sample will sound like Mickey Mouse the higher it gets; this is the type of sampler I am meaning in my original question.

These are the type that aren't multisampled - they just use the one sample and spread it over the entire keyboard. This is the type of sampler I'm referring to when I asked the original question. And does the ES-1 use this method to change its sample note? (not that the ES-1 spreads its sample across a keyboard, but you can tune a sample part and set its pitch like it transposes the sample at different notes).

Do these types of samplers just speed up the playback rate of the sample to produce different notes across the keyboard range, or do they use the other DSP type method you mentioned? The reason for me asking this question is that it will help me figure out what is going on in this sample trick method I have observed, with the sample quality improving. Sorry for being so painful with so many questions, I'm just thinking out loud!

Cheers, Graham

gmeredith · Post by **gmeredith** » Thu May 08, 2008 6:03 am

OK, after reading through the ES-1 manual, it seems that the pitch tuning is actually a DSP effect, rather than a sample "tuning" control, as I had originally thought. This is not stated in the manual, but the way it is written implies it.

So in this case, it would seem that the "apparent" increase in quality is due to DSP processing. Thanks for helping me get my head around this.

The method I mentioned about getting better quality out of a sampler will only be applicable to the type of sampler that DOES alter the sample playback rate to change its pitch. The ES-1 does NOT, it seems. Its apparent improvement is perhaps due to DSP shifting the high harmonics of the sound content.

I think the case may be solved. For anyone with an ES-1, try it out and see if you like the flavour of the sound it produces, as it seems to be producing higher frequencies through the DSP process than what was originally in the sample when you pitch shift it up.

Cheers, and thanks for your input.

Graham

fac · Post by **fac** » Thu May 08, 2008 1:32 pm

gmeredith wrote:
I think the distinction I am talking about here is that when some samplers produce their different notes the sample plays faster and shorter as you go up the keyboard, so that a vocal sound recorded as the sample will sound like Mickey Mouse the higher it gets; this is the type of sampler I am meaning in my original question.

These are the type that aren't multisampled - they just use the one sample and spread it over the entire keyboard. This is the type of sampler I'm referring to when I asked the original question.

Multi-samples can also be pitch-shifted, only they are usually transposed less because once you reach a certain note, you're better off using the next multisample.

For example, let's say you sample a piano at middle-C. With only one sample, if you want to play over all the keyboard, you would have to pitchshift that sample by a quite large factor, which will give you the chipmunk effect (plus aliasing and artifacts).

Now, let's say you sample each C note on the piano. Now, in order to play across the whole keyboard, you only need to transpose each sample at most 6 semitones, so it's less likely you'll get artifacts, but you still need to pitchshift.

Of course, if you have tons of RAM, like in a PC, you could sample each key on the piano, avoiding the need to pitchshift at all (unless you want bending or vibrato effects).
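The "at most 6 semitones" figure can be checked with a small sketch. It assumes MIDI note numbers and one sample on every C; `pick_multisample` and the chosen note range are illustrative, not taken from any particular sampler.

```python
def pick_multisample(note, roots):
    """Return (root, shift): the nearest sampled root note and the
    transposition in semitones needed to reach the requested note."""
    root = min(roots, key=lambda r: abs(note - r))
    return root, note - root

# Assumption: MIDI note numbers, with one sample recorded on every C.
c_roots = [24, 36, 48, 60, 72, 84, 96, 108]

for note in (60, 64, 67, 71):
    root, shift = pick_multisample(note, c_roots)
    print(f"note {note}: use sample at {root}, transpose {shift:+d} semitones")

# Largest transposition needed anywhere on an 88-key range (notes 21..108).
worst = max(abs(pick_multisample(n, c_roots)[1]) for n in range(21, 109))
print("largest shift needed:", worst)
```

With one sample per octave the worst case sits halfway between two roots, so the read step m never has to go much beyond 2^(6/12), about 1.41, in either direction.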

gmeredith wrote: And does the ES-1 use this method to change its sample note? (not that the ES-1 spreads its sample across a keyboard, but you can tune a sample part and set its pitch like it transposes the sample at different notes).

Do these types of samplers just speed up the playback rate of the sample to produce different notes across the keyboard range, or do they use the other DSP type method you mentioned? The reason for me asking this question is that it will help me figure out what is going on in this sample trick method I have observed, with the sample quality improving. Sorry for being so painful with so many questions, I'm just thinking out loud!

In a way, the method I described (which I can assure you is used by the ES-1 and most of the samplers out there) does change the rate at which the samples are read from memory, but it does not change the output sample rate, which is fixed at 32 kHz. It's not a DSP trick, it's simple interpolation, although I don't know what kind of interpolation the ES-1 might use (my guess is linear, since the quality is so-so).

Now, the trick you describe to "improve the sample quality", well, we would have to see first if it's really improving the sample quality, or if you just happen to like that sound better. To do this, one would have to determine the criteria by which to judge sample quality. For example, what do you mean when you say the transposed sample sounds better than the non-transposed sample? Does it sound more like the original sound, or does it sound even better than the original sound?

fac · Post by **fac** » Thu May 08, 2008 1:35 pm

gmeredith wrote:...as it seems to be producing higher frequencies through the DSP process than what was originally in the sample when you pitch shift it up.

That is probably aliasing you hear, and it's usually an indication of a not-so-good interpolation method.

Now, IMO, aliasing can sound pretty cool, and it's an effect I use a lot to add some grit to otherwise lifeless sounds, but it's basically distortion that is added to the sound, so I wouldn't say that in this case the quality of the sample improves.
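To put a number on where such aliasing lands, here is a small sketch. It is an illustration under the assumption that the sample is read at a higher step with no band-limiting, not a description of the ES-1's actual interpolator. `aliased_frequency` is a hypothetical helper that folds a frequency back into the range a 32 kHz output can represent.

```python
def aliased_frequency(f_hz, fs_hz):
    """Frequency at which a component at f_hz appears in a stream sampled at fs_hz."""
    f = f_hz % fs_hz                      # spectrum repeats every fs_hz
    return fs_hz - f if f > fs_hz / 2 else f   # upper half mirrors back down

FS = 32000.0   # fixed output rate
RATIO = 2.0    # read step for one octave up

for original in (4000.0, 9000.0, 12000.0, 15000.0):
    shifted = original * RATIO
    heard = aliased_frequency(shifted, FS)
    print(f"{original:.0f} Hz -> {shifted:.0f} Hz, heard at {heard:.0f} Hz")
```

Anything that stays below 16 kHz after shifting comes out where expected; anything pushed above 16 kHz reappears at a lower, generally inharmonic frequency, which is why it counts as added distortion rather than recovered high end.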

gmeredith · Post by **gmeredith** » Fri May 09, 2008 4:21 am

fac wrote: Does it sound more like the original sound, or does it sound even better than the original sound?

More like the original source sound. It sounds more like it than when you just sample the source sound at its normal pitch into the ES-1.

I have a theory about this. The source wav file is at 44k. The ES-1 is only capable of sampling at 32k, so if you sample it normally you're going to lose top end. But if you slow the source sound down to 32k and then sample it on the ES-1, you're going to lose minimal data, or at least, less. Now if it is then pitched up in the ES-1, the spectral frequencies, be it genuinely sped up, or harmonically created pitch-up or interpolated frequencies, will be higher in frequency. So the sound will be more like the original source 44k wave in content than a standard sample of the 44k wave on the ES-1 which is at 32k and lost some data.

I can't say for sure that's what happens, but it sounds like it, when you do an A/B comparison of the trick sample with a normally sampled version of the source sample in the ES-1, and also then compare them both to the original source file.

Do you have an ES-1? If you do, maybe you could do an A/B comparison and let me know if it seems that way to you. Be mindful that you'll need a fairly good set of speakers or headphones to do this, and choose a suitable sample with plenty of top end like a cymbal or chimes or something. I need someone else to confirm or deny this, so that I know that I'm not insane.

Cheers, Graham

fac · Post by **fac** » Fri May 09, 2008 4:40 am

gmeredith wrote: I have a theory about this. The source wav file is at 44k. The ES-1 is only capable of sampling at 32k, so if you sample it normally you're going to lose top end. But if you slow the source sound down to 32k and then sample it on the ES-1, you're going to lose minimal data, or at least, less. Now if it is then pitched up in the ES-1, the spectral frequencies, be it genuinely sped up, or harmonically created pitch-up or interpolated frequencies, will be higher in frequency. So the sound will be more like the original source 44k wave in content than a standard sample of the 44k wave on the ES-1 which is at 32k and lost some data.

Could you be more specific about what you're doing? I thought you were sampling with the ES-1 from its analog inputs, but now you say the source WAV is at 44 kHz. Are you loading the WAV file into the ES-1 via a SmartMedia card?

Also, resampling a 44 kHz WAV file to 32 kHz does not slow down the sample. Are you sure you're not loading a 44 kHz file?

gmeredith wrote: Do you have an ES-1? If you do, maybe you could do an A/B comparison and let me know if it seems that way to you. Be mindful that you'll need a fairly good set of speakers or headphones to do this, and choose a suitable sample with plenty of top end like a cymbal or chimes or something.

I do have an ES-1 (just bought it a few weeks ago), and also a decent set of speakers (Alesis M1's). I used to own a couple rack samplers (emu and akai) but I sold them. I'll try to do some tests this weekend.

gmeredith · Post by **gmeredith** » Fri May 09, 2008 5:45 am

fac wrote: I thought you were sampling with the ES-1 from its analog inputs

Correct. Sorry if that was confusing. I mentioned that the source was 44k simply to say that it was CD quality, as opposed to ES-1 quality. I recorded a Boss DR-660 into my computer at 44k sampling rate. I then ran the computer sound card's analog outputs into the recording input of the ES-1. I then used this original 44k sample on the computer as my reference sample in the whole procedure.

Although the same could apply if you loaded the wave file onto smart media into the ES-1, but then I wouldn't be able to tell if there were losses associated with the sampling process in the ES-1.

fac wrote: resampling a 44 kHz WAV file to 32 kHz

In this case, I didn't resample, I just tuned the playback pitch of the sample down in the computer program as I sent it to the ES-1's inputs. To eliminate any digital processing interfering in the procedure, though, an analog device with variable pitch can be used, such as a record player, if there is a danger that the tuning down process is actually not just slowing down the playback rate. I might test this later to see.

fac wrote: I do have an ES-1 (just bought it a few weeks ago), and also a decent set of speakers (Alesis M1's).

Please do! This is driving me nuts! Am I going crazy and hearing things?!!

cheers, and thanks for your patience and advice!

Graham
