SOUND COMPRESSION: Discussion and Resources
Saturday, April 01, 2000
This began in VIEW and Mail, and quickly grew. I've copied it here to keep it all together, and perhaps put in some continuity. Rather than change the continuity of View and Mail it was COPIED here, so much of this is duplicated. I will mark the "Brand New" parts.
Roberta has been recording the sounds for her program. The program is TLC, The Literacy Connection (see www.readingtlc.com for details; there's also some material about her program on this site). The DOS version required a tutor: a human of any age and education, as long as the tutor knows how to read, who reads aloud the material presented on screen, both the instructions and the lesson materials themselves. Since the program teaches structured phonics there's a fair amount of spoken text here. The Mac version uses the Mac Text To Speech (TTS) capability, so no tutor is required, and it has been used with kids as young as 4 and grownups as old as 60; we don't know of anyone who has been through the whole program who did not learn to read English, and by read I mean read any word in the language including "big words" like Constantinople and Timbuktu and for that matter antidisestablishmentarianism.

Some have said that the Mac TTS isn't good enough and is too mechanical. Meanwhile we haven't found a Windows TTS that sounds good enough, which is a major reason why we don't have a Windows version. So we got the notion: record all the lessons in .wav files, and put the program out on a CD. For the past day or two she's been down there using Sound Recorder to record each phrase. There are a lot. My notion was that she use the first few words of the text as the file name; that way we take the Mac program, and when the program hands the phrase off to the TTS program we intercept it, use the phrase to build the file name, and play the matching wave file. That should make conversion simple.

The first lesson plus the general instructions for the program take up 49 files and 47.5 megabytes of file space. There are 75 lessons, so we are talking about at least 3 gigabytes, which is 5 CDs. There has to be a better way. I don't know what the compression ratio of wav files is, but maybe ZIP will do something here.
That will mean that as each lesson loads it has to unzip itself, meaning that the user will need 50 megabytes of free disk space each time he uses a lesson, and that looks like a problem too. I am no expert on sound files and sound compression, but I'd be astonished if there weren't a better way to do this, including compression of the sound and on-the-fly decompression as the program calls for the sound. But we'll see. Suggestions welcome.

MORE on Sound: Had dinner with Alex, and he's making some changes in his mother's system, largely so we get better input quality. We'll use Sound Forge to edit when we're done. We will also investigate RealAudio, which works on both Mac and PC, and which claims to be able to get hours of voice on a single CD. We need to know what their licensing system is, and about distributing their player, since many of the school districts that want Reading, The Literacy Connection are likely to have old equipment. Thanks for all the help, and we'll look into all the suggestions made. I sure got a lot. I'll put up a sample of the letters in mail.
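The file-naming idea described above (use the first few words of each phrase as the file name, then intercept the string the program would have handed to TTS) could be sketched roughly as follows. This is only an illustration in Python, not the actual TLC code: the function names, the five-word limit, the underscore separator, and the fallback behavior are all assumptions.

```python
import re
from pathlib import Path
from typing import Optional

MAX_WORDS = 5  # assumption: "the first few words" is taken to mean five


def phrase_to_filename(phrase: str, max_words: int = MAX_WORDS) -> str:
    """Turn the text handed to TTS into a .wav file name (hypothetical scheme)."""
    # Split into words, keeping apostrophes so "don't" stays one word.
    words = re.findall(r"[A-Za-z0-9']+", phrase.lower())
    # Drop the apostrophes themselves; they are awkward in file names.
    words = [w.replace("'", "") for w in words]
    words = [w for w in words if w]
    stem = "_".join(words[:max_words]) or "untitled"
    return stem + ".wav"


def find_recording(phrase: str, sound_dir: Path) -> Optional[Path]:
    """Return the recorded file for a phrase, or None so the caller can fall back to TTS."""
    candidate = sound_dir / phrase_to_filename(phrase)
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    print(phrase_to_filename("Click on the letter A, then say its sound."))
```

One thing a scheme like this would have to watch for: two different phrases that begin with the same few words map to the same file name, so with several thousand recordings some collision check (or a longer prefix) would probably be needed.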
On Sound and thanks!
roger@pswtech.com

WAV files are not compressed. This is going to cause some grief for Roberta, depending on a few things. First, what resolution is she recording these WAV files in? 44.1 kHz, 16 bit? This is CD-quality audio and takes up a lotta space. It's also overkill for spoken word, since the human voice falls into a pretty small audio spectrum and doesn't need all that space. You can save some space by using 22 kHz and even more by dropping to 8-bit, and a lot more by dropping to 11.025 kHz. Alternatively you can record all the files at 44.1 kHz/16-bit and then use any of a number of audio editing/conversion programs to drop them down to the desired rate. I assume she is recording in mono? Stereo tracks will take up twice the space and again are not really necessary for spoken word.

Unfortunately all my audio program experience is on the Mac, but the two programs that I have heard good things about for recording and manipulation are Sound Forge and Cool Edit. Cool Edit is available as shareware, and is also available in a Pro version. http://www.syntrillium.com/cooledit/cool96.htm
You could also muck about with different audio file formats, but since WAV is supported natively in Windows and nothing else is, you'd have to have some sort of audio playback engine within the program to play back something like an AIFF or MPEG audio file. These files can be compressed mightily small, but do need something to read them. --
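To put rough numbers on the letter's point about sample rate, bit depth and mono versus stereo: uncompressed audio takes sample rate × bytes per sample × channels bytes per second. The sketch below (Python, with function and variable names chosen only for illustration) tabulates one minute of speech at the settings mentioned; it ignores the small WAV file header.

```python
def pcm_bytes(seconds: float, sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Size of uncompressed PCM audio data, ignoring the small WAV header."""
    bytes_per_sample = bits_per_sample // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


# Recording settings discussed in the letter (labels are illustrative).
SETTINGS = [
    ("44.1 kHz, 16-bit, stereo (CD quality)", 44100, 16, 2),
    ("44.1 kHz, 16-bit, mono", 44100, 16, 1),
    ("22.05 kHz, 16-bit, mono", 22050, 16, 1),
    ("22.05 kHz, 8-bit, mono", 22050, 8, 1),
    ("11.025 kHz, 8-bit, mono", 11025, 8, 1),
]

if __name__ == "__main__":
    for label, rate, bits, channels in SETTINGS:
        size = pcm_bytes(60, rate, bits, channels)
        print(f"{label:40s} {size / 1_000_000:6.2f} MB per minute")
```

By this arithmetic, CD-quality stereo runs to about 10.6 MB per minute, while 11.025 kHz 8-bit mono is about 0.66 MB per minute, a factor of 16 smaller before any real compression is applied.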
I was vaguely aware of all that, but thanks for putting it succinctly. We will use wav files because they are supported by nearly all Windows systems. But recording in 8 bit mono may tame them down a lot. Actually we'll continue to record in CD mode and use some kind of software to crunch them I think, and see how that sounds. THANKS!

Then this came in, and went into View:

On recording audio:
Robert Morgan [rmorgan@openface.ca]

I wasn't going to send you a message on the subject of recording audio because I felt sure you would be deluged. The answer to your problems is MP3. That's MPEG Audio Compression, Layer 3. I've been using it to put all my favourite audio CDs on to my hard drive for easy listening. A full, 65 minute audio CD compresses to about 50 megs, and that's compressed at equivalent-to-CD quality. You can compress more if you like. The nice part is that Microsoft is on the MP3 bandwagon, and if you've installed NetMeeting or NetShow (not sure which) then the MP3 decompressors were installed on your system already. This format has taken the Internet by storm since it makes it possible to send high quality audio over relatively slow modem links. I recommend you look at www.mpeg.org for a starting point. There are free and commercial encoders available; I bought a commercial encoder from Xing for $19.95 (www.xingtech.com/products/mp3encoder). You can continue to use Sound Forge to capture your raw audio, then use Xing to post-process the wav files. Amazing. I was calculating that I could put all 200 CDs I own on one 8 gig hard drive.

- Robert Morgan

Thank you. We will look into that. I haven't been following this: last I saw, MP3 required programs and even hardware that weren't universal. It certainly sounds as if an MP3 version is at least one right way to go, and perhaps the universal one. Do you know if MP3 works on a Mac? This program should work on both platforms; many schools use the current version now (see Roberta's reading page for details).
We have been considering RealAudio. I blush to say I don't know what technique they currently use for that. The player is apparently freely distributable, meaning we could include it with the program itself at no extra cost. I will look into MP3, and thank you for the details.

And more:

Divya Mahajan [divyam@mailandnews.com]

For Roberta's program, you could look into using the MP3 format too. It retains CD quality at a cool 10 to 1 compression. The drawbacks? It takes a while to encode. But you could easily leave it as an overnight task. MP3 players exist for all platforms, and there are a lot of freeware players around. For Windows, WinAMP is the undisputed champion MP3 player. Is it good? I've got 9 hours of CD quality music on a SINGLE CD. That clinches it for me. It should be the next generation for music distribution, but I guess music companies wouldn't like it at all. It isn't easy to copy 600MB (if you wanted to copy a CD to your disk), but 60MB is no big deal. So it gets radically easier to pirate and distribute music on the Net. (Search for MP3 download on a search engine. You'll get a lot of sites.) IMHO comparing RealAudio's quality to MP3 quality would be like comparing an old scratched up phonograph record to a CD.

Warm regards,
Divya Mahajan

Thanks. That answers two questions, quality and platform. Is there a low cost player that can be distributed with the program itself? For both PC and Mac? I'll find out, I guess. As to quality, we're only looking for "good enough" not concert hall quality, but of course you can't have too much fidelity if it's free. Thanks again.

===

<<Thanks. That answers two questions, quality and platform. Is there a low cost player that can be distributed with the program itself? For both PC and MAC? >>

Interesting how our explorations seem to run in tandem. A couple of days ago, I found something on a web site that required an MP3 decoder.
There was a link to a freeware decoder, but the link was broken, so I went over to Stroud's Consummate Winsock Application List site and searched for MP3. I found that a highly rated free Microsoft program (Windows Media Player) was available that allegedly would decode any number of standards, including MP3. I downloaded and installed it, tried to play the 2.5MB MP3 file I'd downloaded, and it blew up with a message that the format wasn't supported. Because the MS player explicitly lists MP3 as a supported format, I'm not sure what's going on. Perhaps the data file was corrupt. At any rate, Windows Media Player is a free download that looks like it may do what you need to do on the Win platform.

Bob
Robert Bruce Thompson

Sorry, forgot to put the URL in. http://www.microsoft.com/windows/mediaplayer/default.asp
In yesterday's message I didn't make it clear that you don't even need a player if you've installed NetShow/NetMeeting. The MP3 file can still be a WAV file playable natively by Windows. The MP3 decoding is installed as a codec; check in control panel... multimedia... devices... audio compression codecs and you'll see Fraunhofer IIS MPEG Layer-3. Winamp is the de facto player in the Win95 camp; Microsoft only released their codec this spring. Writing messages to you is self-rewarding when I see myself published on your site! Thanks!

- Robert Morgan

===
Michael Smith [emmenjay@zip.com.au] At 10:28 AM 3/10/98 -0700, you wrote:
Depends what you call universal. RealAudio is a proprietary format. You need RealAudio software to play it, and I don't think that's free (though you may know more than I do on this). MP3 is simply a standard. There are lots of packages to play it, and it's possible to create your own if none are suitable. You're not locked into a single vendor. RealNetworks are firmly in Microsoft's sights for "search and destroy" (or maybe that should be "acquire and wind down"). If they succeed (and they usually do) you may be left with an unsupported package. Anyway, that's my opinion. (For whatever it may be worth :-)

Michael

===
Jason Skomorowski [cjml@earthling.net]

The problem with things like RealAudio and Shockwave and such is that you have to pay to record things in those formats, and to listen to them you have to use a platform that the company making the standard has decided to support. MPEG (including MP3, its Layer 3 audio format) offers some of the best audio and/or video compression (1 minute of CD quality sound is about 1 MB, maybe less). Because it's a standard that is open for anyone to use, there are a plethora of freeware and shareware MPEG and MP3 players for a huge variety of platforms. Winamp (Win95) and X11amp (Linux and other Unixes) are the most widely used MP3 players and tend to work rather well. Their page is at:
K-Jofol is a more advanced MP3 player than Winamp and also happens to look niftier as well as support another highly compressed format. This format is called .VQF and I know nothing about it, but it may interest you that it exists. K-Jofol has a beta version available and they're working on a Linux/general Unix version. Their site is: http://www.aegis-corp.org/kjofol/
Of course, if you're writing software, you're going to want source code of something that will decode an MP3 and play it within your program. The following page has sources for a number of things, including Mpg123, the most widely used decoder, which is the base for a number of MP3 players and MP3-using software. They also have stuff there about MPEG-4 (MP4?) and AAC, some other highly compressed audio formats about which I know nothing.
http://freeflight.cockpit.be/mp3tech/sources.html

To make the MP3 in the first place, you need to already have the sound stored as a .WAV file or some such. Then you use an MP3 encoder to turn it into an MP3. http://www.dailymp3.com/noframe.html lists a bunch of encoders, players, decoders and other things that you may need.

When making MP3 files, you can set a number of things that affect the quality and size of the file: the bitrate (how much data is read per second while playing it), the sampling frequency, the sample size in bits, and whether it is stereo or mono. The usual setting for MP3s is 128 kbit/sec, 44.1 kHz, 16-bit, stereo. This combination produces a fairly small file while retaining near CD-quality sound. If that doesn't make it fit on 1 CD (typically, MP3s are a good bit less than a tenth the size of the wav file you're converting from), you can lower the quality a bit.

As for Macintosh stuff, a quick search reveals the following:
Macintosh MP3 page: http://www.engineering.ucsb.edu/~mpiatek/mp3/index.html
Macintosh MP3 Information:
Playing MP3 Files on the Macintosh
I found another page that has a bunch of links to even more MP3 stuff (including legal bits like licensing issues) while searching for Mac things: http://www.mpegtv.com/~tristan/MPEG/mp3.html
There ... what started out as me setting out to mention K-Jofol ended up being a rather involved message. Hopefully you find it of some use,
Thanks. I need to have stuff collected into one place, and I sure don't have time to do all the research I ought to be doing. It's great that you and others are doing it for me. Thanks! And do keep it up ===
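The figures quoted in these letters (roughly 1 MB per minute, files about a tenth the size of the WAV, many hours on one CD) can be checked with simple bitrate arithmetic. The sketch below is Python written only for illustration; it assumes a constant-bitrate file, ignores tags and padding, and takes 650 MB as the CD capacity, which is an assumption rather than something stated in the letters.

```python
CD_BITRATE_BPS = 44100 * 16 * 2  # uncompressed CD audio: 1,411,200 bits per second


def mp3_bytes(seconds: float, bitrate_kbps: int) -> int:
    """Approximate size of a constant-bitrate MP3 (ignores tags and padding)."""
    return int(seconds * bitrate_kbps * 1000 / 8)


def compression_ratio(bitrate_kbps: int) -> float:
    """How many times smaller the MP3 stream is than uncompressed CD audio."""
    return CD_BITRATE_BPS / (bitrate_kbps * 1000)


def hours_on_disc(disc_megabytes: float, bitrate_kbps: int) -> float:
    """Hours of audio that fit on a disc of the given size at the given bitrate."""
    disc_bits = disc_megabytes * 1_000_000 * 8
    return disc_bits / (bitrate_kbps * 1000) / 3600


if __name__ == "__main__":
    print(f"1 minute at 128 kbit/s: {mp3_bytes(60, 128):,} bytes")
    print(f"Ratio vs. CD audio:     {compression_ratio(128):.1f} to 1")
    print(f"Hours on a 650 MB CD:   {hours_on_disc(650, 128):.1f}")
```

At 128 kbit/s one minute comes to 960,000 bytes and the ratio against CD audio is about 11 to 1, which is in line with the "about 1 MB per minute" and "less than a tenth" figures above. Spoken word recorded in mono could presumably use a lower bitrate and fit proportionally more.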
Spencer K. Whetstone [spencer@dgandf.com]

Dr. Pournelle, I am surprised that QuickTime hasn't been mentioned. QuickTime 3 is cross-platform and free. I am no sound expert, but I performed the following experiment. On a Mac I imported a track from an audio CD to a QuickTime movie. That uncompressed file is 11,628,099 bytes. I used Movie Player to export a new QuickTime movie at 11.025 kHz using the Qualcomm PureVoice compressor with a 9 to 1 compression ratio, resulting in a 1,277,240 byte file. Using the 19 to 1 compression ratio created a 623,354 byte file. Seems to me that you could store your sounds in QuickTime and have both your Mac and PC front ends use the same source sounds. I am sure

Spencer K. Whetstone

Actually, QuickTime was the first thing we thought of. It just didn't get mentioned in the discussion, and should have been. This is all pretty new to me. In the old days I'd simply have asked the BYTE people in New England, and geniuses like Tom Thompson would tell me all about it, and you'd never see the struggles I go through to sound like I know what I'm doing. But eventually I get it all right; it's just that here the works are showing. We're looking into QuickTime and various MPEG methods; right now what we want is a good way to get good quality sound recorded; we can manipulate that later.

October 4, 1998

This is a good time to summarize where we are. We have upgraded Joizy, Roberta's Gateway 2000 system, with a new sound card. We've got lots of suggestions on what to do with the sounds once we have them recorded. Now we're looking into ways to record. Ideally we can use her laptop, or her computer, to make files directly. The goal is to have spoken text recorded under the same file name as the string the program handed to the text to speech converter in her original Mac program; that can be intercepted and the proper file called and the sound played, only this time it will be in Roberta's voice rather than in Agnes's voice. We only need "good enough" sound.
It will be a female human voice. We don't have a sound studio. Her office isn't all that noisy but there are fans and printers and the computer itself, so we may have to go to something else if we can't figure a way to filter the noise. The first step is a good powered microphone (one with a pre-amp) so that the signal is so large that the noise may be literally lost in the noise. It's not as if we are going after concert quality here. I am no sound engineer, and unfortunately the sound engineers I know suggest systems that would be wonderful if I had several thousand dollars to invest in this. Maybe one day, but for now what they call "professional quality" is more than I want to pay for. I'll settle for 'good enough amateur' once I find out how to do that. We can probably use a better recording program than the one that's built into Windows.

There will be several thousand sounds, and we would like to get all this plus the program itself (a couple of megabytes) on one CD, preferably a CD that can be used for either a Mac or a PC, although the first goal is for a Mac. I am now convinced that compression of our good enough recordings is possible, and indeed there's an embarrassment of riches to choose from. We'll make that choice when we have some sounds recorded.

Now back to the mail.

As far as a mic goes, you might consider using something similar to a receptionist's headset. They are designed to filter all sound except for the person speaking and are hands free as a bonus. You can buy one that plugs directly into your sound card; I believe they call them "internet headsets," mainly for use with internet phone apps. I wrote before about MP3s and I am a great believer in this format. But I just re-read the entire discussion and noticed that the software would have to run on "slow" machines. Depending on what your definition of slow is, MP3 might NOT be the way to go.
I own a small computer sales business and I enjoy the format so much that I tend to install an MP3 player on most of the new machines I sell (I know M$ has a player that will work, but I think the small guys have better players). Some customers of mine have tried to run MP3s on their older machines and weren't able to. A little research showed that the MINIMUM requirement was a 486/100 as far as I can tell. But in practice, the lowest configuration was an AMD 5x86/133 P75 mixing down to mono. If you think the program needs to be able to run on machines slower than that, I would go the way of QuickTime. Just in case you lost the link, www.mp3.com is the most comprehensive site on the subject.