Monday, December 29, 2014

chipspeech Diary, Part 2


Last summer I spent a good part of my late night research time on Texas Instruments Voice Synthesis Processor (VSP) LPC10 decoding chips. TI made LOTS of these from the late 70s to the late 80s, both for the OEM market and for its own use, under different names. They usually prefixed them TMC or CD for internal use, and TMS for OEM use, but their naming was not always consistent, which leads to lots of confusion.

The goal for chipspeech is to play back LPC10 streams coming from any Texas Instruments device/software and to do so in a sample-accurate manner.

We are very close to that goal. However, I found no fewer than 8 variants of VSPs, all of which had to be accounted for in my emulation core, and I'm convinced that there may be more.

A variant is a unique combination of the following features:

  • Pitch encoding format (5 or 6 bit indexes)
  • Energy ROM look-up table
  • Pitch ROM look-up table
  • K (lattice filter) ROM look-up table
  • Voiced chirp ROM look-up table
  • Variable frame size (TMS5220 rev A and up)

Those are the main characteristics, and they can quite significantly alter the tonal quality of the output voices. For instance, if you play a stream made for one variant on another variant, you could get anything from a slightly unusual nasal quality on certain words or phonemes, to a completely alien-like version of it, to typical circuit bending whooshes and 'blurps'.
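
To make the idea of "a unique combination of features" concrete, here is a minimal sketch of how a variant could be described as a value that compares equal only when every feature matches. The class name, field names and the tiny placeholder tables are all assumptions for illustration; they are not real ROM contents and not how the chipspeech core is actually organized.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class VspVariant:
    """Hypothetical descriptor: one field per feature in the list above."""
    pitch_bits: int                       # 5 or 6 bit pitch indexes
    energy_table: Tuple[int, ...]         # Energy ROM look-up table
    pitch_table: Tuple[int, ...]          # Pitch ROM look-up table
    k_table: Tuple[Tuple[int, ...], ...]  # K (lattice filter) ROM tables
    chirp_table: Tuple[int, ...]          # Voiced chirp ROM table
    variable_frame_size: bool             # TMS5220 rev A and up


# Placeholder values only, NOT real ROM dumps.
chip_a = VspVariant(
    pitch_bits=5,
    energy_table=(0, 1, 2),
    pitch_table=(0, 10, 20),
    k_table=((0, 1), (0, 2)),
    chirp_table=(0, 5, 0),
    variable_frame_size=False,
)

# Same chip family, but with a changed Energy table: a different variant.
chip_b = replace(chip_a, energy_table=(0, 2, 4))

same_variant = chip_a == chip_b  # dataclass equality compares every field
```

Two physical chips with different part numbers would count as the same variant under this scheme whenever their descriptors compare equal.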

We could have created even more subcategories to account for other observed behaviors, for instance what happens when an unvoiced (noise) frame is followed by a silence frame (Energy either fades or not). There are probably even more subtle differences, especially considering the influence of the 'shifts' in the various internal interpolation stages, which depend on the chip, but nothing that you could easily hear, I am willing to bet.

The following is a list of VSPs found in devices that we have acquired and dismantled (some of which are incredibly rare). This list is by NO means complete as we do not have infinite resources... nor storage space :).  And I do plan on revisiting this as new discoveries are made.

TMC0280/CD2801 were found in:
  • Language Translator/Language Tutor (same thing)
  • Speak & Math
  • Speak & Read
  • Speak & Spell (UK, that's why the US carts sound funny with it)

TMC0281/TMS5100  were found in:
  • Speak & Spell (US - Buttons)
  • Dictée Magique (French Speak & Spell)
  • Speak & Spell (Japanese - Buttons)
  • Century 'Dazzler' CVS arcade PCB.

TMC0281D/TMS5100A  (Energy table changed from the previous) were found in:
  • Speak & Spell (US - later membrane)
  • Speak & Spell (US - Compact)

TMS5110 (A/C) Found in:
  • Stern/Valadon Bagman PCB
  • Chrysler Electronic Voice Alert
  • Coleco Talking Teacher/K28 Talking Learning Computer (not the Votrax one)

CD2802 Found in:
  • Touch & Tell
  • Vocaid

CD2501E/TMS5200 Found in:
  • TI99/4A Speech Adapter

TMS5220 Found in:
  • IBM PCjr Speech Adapter

TMS5220(A/C) Found in:
  • Magical Wand Speaking Reader (A rev)

How I categorized the variants:


A) Made a full regression test suite of LPC10 streams, each testing a different edge case. I sent the streams to the VSPs, then recorded and compared the results.
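
As a rough illustration of what "compared the results" can mean when the target is sample accuracy, the sketch below reports the index of the first sample where two captures differ. The function name and the plain-list representation of a recording are assumptions; the actual test harness is not described in this post.

```python
def first_mismatch(reference, candidate):
    """Return the index of the first differing sample, or None if identical.

    Both arguments are sequences of integer samples (an assumed format).
    A length difference counts as a mismatch at the shorter length.
    """
    for index, (ref, cand) in enumerate(zip(reference, candidate)):
        if ref != cand:
            return index
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return None


hardware_capture = [0, 3, 7, 12, 7, 3]
emulator_output = [0, 3, 7, 11, 7, 3]
mismatch_at = first_mismatch(hardware_capture, emulator_output)
```

A result of None for every stream in the suite would be the sample-accurate outcome; any other value points at the exact sample to investigate.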



B) Dumped the internal ROM tables of the chips. When idle, VSPs serially output the current values for Energy, Pitch and the K's on their PROMOUT pin. We can thus read back the contents of the internal ROM tables by crafting custom LPC10 streams in which every possible Energy, Pitch and K index is set in succession. But we need a different stream for 5-bit and for 6-bit pitch variants, of course!

Logic analyser dumps of ROMCLOCK, T11 and PROMOUT running these streams were then analysed with custom code, then massaged in a text editor and compared against each other. (Thanks to Lord Nightmare for the explanation of the PROMOUT logic!)

Each TMS51XX/5-bit style chip was placed on a Stern Bagman PCB with a customized EPROM @ 9T, offset 0x1032, BIT6 "channel", replacing the French "Aye Aye Aye" (death sound) with the custom LPC stream. Since there are two form factors for those chips (DIP and SDIP), a ZIF adapter was used when needed:
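
The idea of a table-sweeping stream can be sketched as follows. The field widths used here (4-bit energy, 1-bit repeat flag, 5 or 6-bit pitch, and K1 to K10 at 5, 5, 4, 4, 4, 4, 4, 3, 3, 3 bits) follow the commonly documented TI LPC frame layout and should be treated as assumptions, as should the function names. The sketch only builds each frame as a string of bits; the byte packing and bit order that a particular chip or EPROM expects are deliberately left out.

```python
K_BITS = (5, 5, 4, 4, 4, 4, 4, 3, 3, 3)  # assumed widths of K1..K10


def pack_frame(energy, pitch, ks, pitch_bits):
    """Build one LPC frame as a '0'/'1' string (assumed layout)."""
    bits = format(energy, "04b")
    if energy in (0, 15):            # assumed: 0 = silence, 15 = stop
        return bits                  # such frames carry no other fields
    bits += "0"                      # repeat flag cleared
    bits += format(pitch, "0{}b".format(pitch_bits))
    count = 4 if pitch == 0 else 10  # assumed: unvoiced frames use K1..K4
    for value, width in zip(ks[:count], K_BITS):
        bits += format(value, "0{}b".format(width))
    return bits


def pitch_sweep(pitch_bits, energy=8):
    """One voiced frame per non-zero pitch index, K's held at index 0."""
    ks = [0] * 10
    return [pack_frame(energy, pitch, ks, pitch_bits)
            for pitch in range(1, 1 << pitch_bits)]


sweep_5bit = pitch_sweep(5)  # for the 5-bit pitch variants
sweep_6bit = pitch_sweep(6)  # for the 6-bit pitch variants
```

This also shows why two separate streams are needed: the same sweep comes out with a different frame length and a different number of frames depending on the pitch index width.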



Each TMS52XX/6-bit style chip was placed on a custom made protoboard, with custom FTDI programs sending the LPC10 stream using the 'Speak External' command.


Portions of this research were contributed to MAME/MESS and hopefully will be used for the introduction of the various Speak & Spell drivers that they are working on.

Thursday, September 18, 2014

chipspeech Diary, Part 1

In case you haven't noticed, we take hardware research and emulation very seriously here at Plogue.

-We never take any information for granted, whether it's from official datasheets, patents or third party research.

-We always double check and investigate what we do on hardware: creating custom test suites for each and every chip, sending values and capturing the results digitally.

-We then create models and iteratively refine them as we add new tests, often for virtually every possible edge case there is.

You can imagine that this is a long, VERY long process. We sadly never know when our products will ship, because simply knowing how an integrated circuit works inside and out does not magically make it a product! (stating the obvious)

We started this project 8 years ago, but it really started picking up speed in the last 3. Back then there wasn't a single 'speech chip' plugin out there (at least I think). There are now a handful of them, and STILL we are not ready to party with the rest of them.

We hope that this new blog series will help you wait a little bit longer for what we hope will be the one that sets the standard, like chipsounds did.

PART1:

It all began with gathering, or rather compulsively hoarding, nearly every vintage consumer device that talked. We set our starting point at 1975, the year the first ever speech synthesizer IC, the TSI Speech+, reached the market, and our end point at the dawn of the 90s, when essentially everything went boringly intelligible.

How many devices are we talking about?
Quite a lot actually.



Thursday, September 11, 2014

Plogue livenes

Plogue livenes is a Nintendo Entertainment System "homebrew" application that I've developed in order to improve the emulation of the RP2A03 for chipsounds 2.0, which is currently in development.

It allows you to change the values of the APU's memory mapped registers ($4000 to $4017) using nothing but the Nintendo d-pad.

A side effect is that it can also be used to generate live minimalistic 'music' on a NES by manually toggling a bit at a time, which is of course completely unintuitive!

Changing the pitch value for a specific channel on a musical scale implies changing multiple bits at once, something that is clearly impossible here.
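
To put a number on that, here is a small sketch that computes the pulse channel timer period for two neighbouring notes and counts how many bits differ between them. It uses the usual NTSC pulse formula, f = CPU / (16 * (period + 1)) with a CPU clock of about 1.789773 MHz; the function names are mine and this is not code from livenes itself.

```python
CPU_HZ_NTSC = 1789773  # NTSC 2A03 CPU clock in Hz


def pulse_period(freq_hz):
    """11-bit timer period giving the closest pulse channel frequency."""
    return round(CPU_HZ_NTSC / (16 * freq_hz)) - 1


def bits_changed(a, b):
    """Number of bit positions that differ between two register values."""
    return bin(a ^ b).count("1")


a4 = pulse_period(440.0)         # A4
a_sharp4 = pulse_period(466.16)  # A#4, one semitone up
toggles_needed = bits_changed(a4, a_sharp4)
```

Even a single semitone step needs more than one bit to flip, so with one toggle at a time the channel always passes through at least one unintended pitch on the way.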

As I like a challenge, I tried to see if I could make something remotely musical out of this incredible restriction set. The following piece was recorded live (not sequenced in any way) on a real NTSC NES:



Note:
A) The main DMC 'sample' that starts the piece is actually the application code and graphics being interpreted as Delta Modulation.
B) My NES is stereo-modded, so there is a slight touch of post mix and reverb, but that's it.

If you want to try it out for yourself you can download the latest .nes ROM here

Revision history:
1.1 Fixed the wrap around on the lower part of the screen
1.0 Initial version

How can you run this on a real NES and not just in an emulator?

1) Put it on a Powerpak
2) Make yourself a nice UNROM (Mapper 2) dev cartridge out of one of those carts (mirroring is irrelevant). I won't get into the details of that, but here's what mine now looks like:

Thursday, October 10, 2013

chipcrusher re-sampling vs frequency response

Quite a few users have complained about chipcrusher's peculiar 'dry' frequency response compared to what they get with other common decimator plugins. This post hopefully will explain a few things.

Let's say we bypass the bit reduction, distortion and post filtering, and concentrate only on the task of downsampling the plugin's input signal. Say the input is at 96kHz and chipcrusher's re-sampler is set to 44.1kHz, its internal maximum.

There are two important aspects to consider:

1) Typical results achievable using a vintage sampler are very different from 'your typical Bitcrusher VST'.

99% of bitcrusher/decimator plugins out there use the same tired algorithm that was posted more than 10 years ago on musicdsp.org. This method does NOT band limit the input signal prior to the downsampling, it just sample-and-holds using a counter... any sample!

This is not what classic samplers did. Any engineer with half a brain at least tried to filter the analog audio signal so it wouldn't contain harmonics over the Nyquist frequency of the target sample rate! If you skip this step, you will get extra aliasing all over the spectrum.
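
The difference can be sketched in a few lines. The first function is a counter-based sample-and-hold in the spirit of the algorithm described above (written from memory of the general idea, not copied from any particular plugin). The second is a crude moving-average low-pass, used here purely as a stand-in for a proper band-limiting filter; it is NOT the filter chipcrusher uses, and all names are assumptions.

```python
def sample_and_hold(samples, rate):
    """Naive decimator: rate = target_rate / source_rate, in (0, 1].

    A phase counter decides when to grab a new input sample; whatever
    sample happens to be there is held, with no band limiting at all.
    """
    out = []
    phase = 1.0 - rate  # so that the very first sample is captured
    held = 0.0
    for x in samples:
        phase += rate
        if phase >= 1.0:
            phase -= 1.0
            held = x
        out.append(held)
    return out


def moving_average(samples, width):
    """Very crude low-pass: average of the last `width` samples."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - width + 1):i + 1]
        out.append(sum(window) / len(window))
    return out


ramp = [0, 1, 2, 3, 4, 5]
naive = sample_and_hold(ramp, 0.5)
prefiltered = sample_and_hold(moving_average(ramp, 2), 0.5)
```

The second call shows the order of operations that matters: filter first, then decimate. A real pre-filter would be much steeper than a two-sample average.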

2) Not all lowpass filters are created equal.

All versions of chipcrusher prior to v1.005 (available soon) used a CPU friendly downsampling setting which, in retrospect, might not have suited everyone's taste since it was not steep enough for high frequency content.

You can see chipcrusher's default precision somewhere in this animation, made using 96kHz -> 44.1kHz with white noise as the source. All the other settings shown will be available too: we have added a new 'Precision' parameter to set the steepness/CPU use ratio you desire. BTW, the first picture in the lot is from a "do not pre filter" setting. We offer 6 such settings, from 6 point spline to truncation. They alias like crazy, but to each his own.

Saturday, June 29, 2013

Making arcade cabinet impulse responses.

Here are a few pictures we took while capturing impulse responses for chipcrusher in late 2011...
Sorry for the mess, our office is a perpetual hardware dismantling lair.

Thursday, June 6, 2013

GBA SP Speaker Impulse Response



Looks fun? This is what I did for each and every speaker impulse in chipcrusher's Post Processing section.

The goal is to capture not only the frequency response of the speaker itself, but also the effect of its casing and internal components: resonance, cancellations etc.
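
For context, a captured impulse response is normally applied to a signal by convolution, which is why a single short recording can stand in for the speaker, its casing and everything inside. The sketch below is a plain direct-form convolution with assumed names and toy values; how chipcrusher actually implements its Post Processing section is not described here.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h  # each input sample triggers a scaled IR
    return out


toy_ir = [0.5, 0.25]  # placeholder values, not a measured speaker
click = [1.0, 0.0, 0.0]
speaker_click = convolve(click, toy_ir)
```

Feeding a single click through the function simply reproduces the impulse response, which is the defining property the capture relies on.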

It's thus very important to properly close the unit (which can be complicated by the tight, confined space) with your soldered speaker leads dangling out, without changing the tonal balance of the unit.

A few carefully created test tones are then 'injected' through the leads and recorded with one or more microphones in a mostly anechoic space at a few inches from the device. 

Next the recordings are processed with custom software.

Once that's done, and we are sure the recorded IR data is valid, we need to do the inverse: reopen, unsolder, close and make sure it works. While the microphones are set up I usually also record native console sounds (games or test code) through that same setup, for later comparison.

Luckily no units were destroyed, and everything worked just like it did before.

(But you can imagine the stack of devices that I have in the office and in my basement)... there is a psychological condition for that, and also a TV show about it.... Rest assured I ONLY keep tech stuff. :)