
Machine for Random Mixing Voices like the Mailchimp Ad (and doing the Fisher-Yates Shuffle)

admin

For whatever neurocognitive reason ideas just linger on the periphery of my mind, fading in, fading out, but maybe just waiting for the right moment.

As typical, I start writing the backstory and the process and the development of something I made, then bunches of code tossed in, before you even see it. In the interest of my old rule of Start with the Damn Demo, this post is the path leading to this quasi experiment.

It Started With An Advertisement

The original Serial podcast was memorable in its own right, but another thing that did that mind floating thing was this advertisement for Mailchimp.

A story behind this 20 second segment gets at the understated power of audio:

It’s a sly takeoff on the aural aesthetic of public radio reportage mastered by shows like This American Life: Hearing—but not seeing—each person forces you to picture the scene. Subtle touches, like the woman at the end who says, “I use MailChimp,” at a distance from the microphone, adds a richness that’s only possible on radio.

How MailChimp’s irresistible “Serial” ad became the year’s biggest marketing win (Quartz)

If I ever taught a media class again, I would create an assignment to produce something like that. It would take the effort of soliciting (ideally) strangers with distinct voices on the street to recite the segment, and some thoughtful editing on piecing together the voices.

OEG Voices Intro

When I started the OEG Voices Podcast for Open Education Global a year or more ago (adjust for pandemic time distortion), the goal was to bring you, in conversational style, the voices, ideas, and personalities of open educators. The Mailchimp type track seemed appropriate for the introduction.

The version used up til now was made by having all of my colleagues at OEGlobal record a two paragraph bit of copy which I manually edited together, and the one track has been used in the first 26 episodes. It’s quite okay.

But if you have read much here, I like having elements of interesting (or at least aim for that) randomness in my work. So what I really wanted was an easy way to make the intro change, but more than that, have a way for anyone interested in doing so to record the content and send my way.

And, as I wish I had done all year long, I saw that I could ask guests on the podcast to record a version at the end of a session (if they did not run out screaming “let me out”).

I put out a call in our OEG Connect community with the text of the introduction, asking them to record it and send it to me any way possible (email a file, respond to the post, a thumb drive strapped to the leg of a carrier pigeon).

Hello and welcome to OEG Voices, a podcast bringing to you the voices and ideas of open educators from around the world. OEG Voices is produced by Open Education Global, a member-based, non-profit organization supporting the development and use of open education globally. Learn more about us at oeglobal.org.

There’s much to take in at a global level. We hope to bring you closer to how open education is working by hearing the stories of practitioners, told in their own voices. Each episode introduces you to a global open educator and we invite you to later engage in conversation with them in our OEG Connect community.

from Contribute Your Voice to OEG Voices Opening Segment (OEG Connect)

One of the best little web tools to suggest is Vocaroo, where you can record audio and share via URL without having to make an account. As usual, I find many people have never heard of it and get excited about using it for other purposes.

I always had in mind (one of those things again), beyond just collecting and using the audio, to build a web tool that could generate randomly shuffled mixes of the voices. I had thought this might be done all in browser script, maybe dynamically generating an .m3u playlist, which I thought I had seen could be converted to an mp3.

I ended up with maybe 8 different ones, and just to try it out, I mixed and shared a first version manually in Audacity (not the way I wanted to do in practice).

This was also a means to test the way I would split up the recordings, chopping each recording into what turned out to be 14 segments (I learned from my first versions to mix long and short, and not have too many cuts in mid-sentence). These are the segments:

  1. Hello and welcome to OEG Voices
  2. (OEG Voices repeated 3 times)
  3. a podcast bringing to you the voices and ideas of open educators from around the world
  4. OEG Voices is produced by Open Education Global
  5. a member-based, non-profit organization
  6. supporting the development and use of open education globally.
  7. Learn more about us at oeglobal.org.
  8. There’s much to take in at a global level.
  9. We hope to bring you closer to how open education is working
  10. by hearing the stories of practitioners,
  11. told in their own voices.
  12. Each episode introduces you to a global open educator
  13. and we invite you to later engage in conversation with them
  14. in our OEG Connect community.

Segment 2 tries to mimic the Mailchimp ad by having the “brand” name repeated in multiple voices.

Slicing and Naming the Audio Pieces

Splitting the recording is manual, but that is how craft works. There’s no bots in this shop.

Because you will get audio at various levels and quality, my steps are similar to those for an episode entry.

  1. Convert the audio (which you may get as mp3, m4a, or ogg) to .wav. I have been using a drag and drop converter called To Audio Converter Lite. You can find a gazillion apps for this, I just want one to use on desktop. The Lite version lets you do only one at a time, so I end up quitting and relaunching if I have a batch.
  2. Why bother? I need .wav or .aif to run the audio through the magic of The Levelator app. This runs a few filters (normalizer, limiter, magic goo) to even out audio that was recorded under different conditions and input levels. This is the OSX version of the original created by the Conversations Network, one of the grand old former podcast venues. I’ve used it faithfully since the mid 2000s.

I have a master Audacity file where I import a new track. I have a labels track so I can see which segment is which. It’s really a matter of finding the breakpoints, hitting command-I (or Edit -> Clip Boundaries -> Split), and then using the Time Shift tool to space them out.

Next, I select each segment and use File -> Export -> Selected Audio to create 14 mp3 files.

I gave some thought to a file structure, knowing a bit how I hoped to put them together. I have a directory structure of folders named segment-1, segment-2, … segment-14 that contain the different people’s recordings of the same segment. For each person, I use a single word “id” (their first name), so for me, I would have files stored like segment-1/alan-1.mp3, segment-2/alan-2.mp3, … segment-14/alan-14.mp3.
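That naming scheme maps onto a tiny helper. This is just a sketch to show the pattern; the function name segmentPath is made up for illustration and is not necessarily what lives in my index.html.

```javascript
// Hypothetical helper: build the path for one person's recording of one
// segment, following the segment-N/id-N.mp3 convention described above.
function segmentPath(id, segment) {
  return `segment-${segment}/${id}-${segment}.mp3`;
}

console.log(segmentPath("alan", 1));  // segment-1/alan-1.mp3
console.log(segmentPath("alan", 14)); // segment-14/alan-14.mp3
```

Keeping the segment number in both the folder and the file name is redundant, but it means a stray file is still identifiable if it ends up in the wrong place.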

Also, I knew ahead of time that the recordings would not always be one person reading all 14 segments. When we have multiple guests, I might ask them to record in an alternating pattern. Or if a segment was not clear, I needed to be able to skip it.

Building the Mixmaster

It probably has been like 15 years since I made an .m3u playlist, the beauty I thought was they are just text files. The problem I found was a lack of web-based players I could embed. The old ones depended on Flash (hah), and the venerable JWPlayer I used long ago seems to be commercial only. C’est la web.

I figured there had to be a way to do this all in JavaScript / jQuery, to play a series of separate audio files sequentially. I found the basics in a blog post by “pro9ram” that did it in a simple fashion: create a series of Audio objects in JavaScript, and add an event listener to each that triggers when its file ends. A global counter is bumped by one at the end of each segment. If the one that just played was the last, it is done; otherwise it sends a play command to the next audio object.
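Roughly, that pattern looks like the sketch below. It is my own paraphrase of the idea, not the code from that post. The names playSequence and createAudio are made up; createAudio is only there so the logic can be exercised outside a browser, and in a page it falls back to the built-in Audio constructor.

```javascript
// Sketch of sequential playback: one audio object per file, an "ended"
// listener on each, and a counter that advances to the next one.
// createAudio is a hypothetical hook; in a browser it defaults to new Audio(url).
function playSequence(urls, createAudio = (url) => new Audio(url)) {
  const tracks = urls.map((url) => createAudio(url));
  let current = 0; // index of the track now playing

  tracks.forEach((track) => {
    track.addEventListener("ended", () => {
      current += 1;
      // if that was the last one we are done, otherwise start the next
      if (current < tracks.length) {
        tracks[current].play();
      }
    });
  });

  // kick things off with the first track, if there is one
  if (tracks.length > 0) {
    tracks[0].play();
  }
  return tracks;
}
```

In a page this would be called with something like playSequence(["segment-1/alan-1.mp3", "segment-2/wilma-2.mp3"]), ideally from a click handler, since most browsers block audio that starts without a user gesture.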

I got a crude version working quickly with a manually defined list of files. At least that ensured it would work.

Next came designing a means to store the data, and some logic doing the random shuffling. And this was all before doing any design/pretty work.

I knew I needed some kind of JavaScript array structure to track 3 items- the “id” for a person (to build file name), their full name (for display), and an array of which segments I had recordings from them. Oh was I rusty on this subject, and a most hopeful find was from Free Code Camp- JavaScript Array of Objects Tutorial – How to Create, Update, and Loop Through Objects Using JS Array Methods.

As it turns out, I needed a standard array I could step through, where each element is a JavaScript object (more or less an associative array). As a sidelight, you can find yourself deep in corners of StackExchange where the debates over nomenclature flare up fiercely:

Chris says: “You should mention there is no such thing as associative arrays in javascript.” Marc counters: “Wrong. In javascript, all arrays are associative arrays.” Then Karl makes peace: “As far as I was aware, associative arrays and objects are basically the same thing in JavaScript. They're just different ways of referring to the same thing.”

I declare all the data directly by defining an array (when I get really good, this should be json, but that’s another day):

var people = [
	{"id" : "alan" , "name" : "Alan Levine", "segments" : [1,2,3,4,5]}, 
	{"id" : "wilma" , "name" : "Wilma Flinstone", "segments" : [1,2,3,4,5,6,7,8,9,10,11,12,13,14]}, 
	{"id" : "felix" , "name" : "Felix Dog", "segments" : [2]},
	{"id" : "ollie" , "name" : "Ollie Cat", "segments" : [2,4,6,8,10,12,14]}
];

The main part of building a random sequence is looping for each segment I have audio for (14 of them). For each segment, I can create a temp array of all the “people” objects that, for say segment m=6 have that number in their value for “segments”:

// keep only the people who have a recording for segment m
const inthemix = people.filter(person => person.segments.includes(m));

For this limited basic silly example you can scan visually that only Wilma and Ollie have audio registered for segment 6.
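To check that by running it, here is a self-contained sketch using the same sample data. The names samplePeople and voicesForSegment are made up for this example.

```javascript
// Same sample data as the people array above
const samplePeople = [
  { id: "alan", name: "Alan Levine", segments: [1, 2, 3, 4, 5] },
  { id: "wilma", name: "Wilma Flinstone", segments: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] },
  { id: "felix", name: "Felix Dog", segments: [2] },
  { id: "ollie", name: "Ollie Cat", segments: [2, 4, 6, 8, 10, 12, 14] }
];

// Hypothetical wrapper around the filter: everyone with a recording for segment m
function voicesForSegment(people, m) {
  return people.filter((person) => person.segments.includes(m));
}

const names = voicesForSegment(samplePeople, 6).map((person) => person.name);
console.log(names.join(", ")); // Wilma Flinstone, Ollie Cat
```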

I then want to do some routine to pick one of these people at random. For previous random generators (e.g. The EdTech Metaphor Generator) I used this function to return a random index for an array passed in. For an array of 10 items, one time it might return “8”, the next “3”, etc.

// general purpose function to return a random index to input array
function random(array) {
  return Math.floor(Math.random() * array.length)
}

Now if you want to go deep in StackExchange programmer flurry holes, follow the arguments about the random function being not so random. It never ends. But for my first tests with maybe 15 people items, it seemed to work okay.

I did come up yesterday with a weird situation when I added 2 more people data elements to the mix. No matter how many remixes I regenerated, one name just would not show up. After checking everything for typos, and setting some test breakpoints, I decided to look for maybe a better function to generate a random number.

Thus, this is where I learn from a Dev post, How to shuffle an array in JavaScript, a new dance step: the Fisher-Yates Shuffle. Yes, you can learn it was coined from a 1938 paper by… Ronald Fisher and Frank Yates. Here is a moment to appreciate the Ways of Wikipedia, which offers a step by step means to understand the algorithm via a pencil and paper process.

Regardless, the code example listed in the Dev post worked like a champ, passing in an array like the inthemix example above, the function shuffles it, and then returns the last element.

function getrandom( array ) {
	// Fisher-Yates Shuffle h/t https://dev.to/codebubb/how-to-shuffle-an-array-in-javascript-2ikj
	for (let i = array.length - 1; i > 0; i--) {
		const j = Math.floor(Math.random() * (i + 1));
		const temp = array[i];
		array[i] = array[j];
		array[j] = temp;
	}
	// return the last element of the shuffled array
	return array.pop();
}

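It worked like a beauty, and the randomness started looking more even. Earlier, with my simple random index function, the picks seemed to favor lower elements in the source array. To be fair, the shuffle leans on the same built in Math.random(), so the earlier glitch may well have been somewhere else in my code.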
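A rough way to test an impression like that, rather than trusting my ears, is to run a picker many times and tally what comes up. This is only a sketch: pickIndex mirrors the random() function shown earlier, and tallyPicks is a made up helper, not part of the mixer.

```javascript
// Same idea as the random() function earlier: a random index into an array
function pickIndex(array) {
  return Math.floor(Math.random() * array.length);
}

// Hypothetical test helper: pick `trials` times and count how often
// each index comes up
function tallyPicks(array, trials) {
  const counts = new Array(array.length).fill(0);
  for (let t = 0; t < trials; t++) {
    counts[pickIndex(array)] += 1;
  }
  return counts;
}

// counts differ from run to run, but should land near 2500 each
console.log(tallyPicks(["alan", "wilma", "felix", "ollie"], 10000));
```

If the counts come out roughly equal, the picker itself is probably not what is hiding a name, and the place to look is how the data gets filtered or named.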

I won’t bore the maybe one person left with detailing all the other code bits and iterations. It’s all in the mostly commented index.html file in github. I spent time adding in play/pause/remix functionality, and once I slapped on a bootstrap theme, creating some dynamic displays of the current voice and all the people to credit.

There is potential for bugs in there, but the functionality to generate and hear the random mix is working.

The second segment was tricky, as it needed a special case: getting 3 random voices from the selection rather than one.
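One way to handle that special case is to shuffle the candidates and take the first few, which guarantees no voice repeats. The sketch below is one possible approach, not a copy of what is in index.html; shuffled, pickVoices, and demoPeople are names invented for the example.

```javascript
// Fisher-Yates shuffle on a copy, so the caller's array is left alone
function shuffled(array) {
  const copy = array.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Hypothetical helper: pick up to `count` different people for one segment
function pickVoices(people, segment, count) {
  const candidates = people.filter((p) => p.segments.includes(segment));
  return shuffled(candidates).slice(0, count);
}

const demoPeople = [
  { id: "alan", segments: [1, 2, 3, 4, 5] },
  { id: "wilma", segments: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] },
  { id: "felix", segments: [2] },
  { id: "ollie", segments: [2, 4, 6, 8, 10, 12, 14] }
];

// segment 2 gets three different voices, every other segment gets one
console.log(pickVoices(demoPeople, 2, 3).map((p) => p.id));
console.log(pickVoices(demoPeople, 6, 1).map((p) => p.id));
```

If fewer than three people have recorded segment 2, slice simply returns whoever is available, so the mix gets shorter rather than breaking.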

Stitching Together Audio Files

This whole exercise is meant to provide a public view…er ear to generate random remixes of the 14 segments, but a key is I need a way to merge them together into a single intro track I can import into my Audacity template.

This is eminently doable with server side scripting, and I considered a php approach. I found a few free web apps that do it nicely like Audio Joiner and another from Clideo. But that’s a bit manual, I really suspected there would be a desktop app or command line judo I could do.

Because thar be unix running on my MacBookPro, one can appreciate the affordance of the humble cat command that can string together files of the same type (not only audio, I learned), like putting together three mp3 files into one mix:

cat somefile.mp3 anotherfile.mp3 yetathird.mp3 > mix.mp3

And while it did work, the final file had some funkiness to it (like its run time not being accurate). I had a hunch it was problematic. As it turns out, the magic is easy using the media workhorse, the open source FFmpeg. You have to respect something described like this:


FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created. It supports the most obscure ancient formats up to the cutting edge. No matter if they were designed by some standards committee, the community or a corporation.

http://www.ffmpeg.org/about.html

The magic I was seeking is in the documentation for concatenating media. And what works even better, and is my solution, is you can provide the file paths to all the source audio as a text file in simple format.

For my example above, this command works

ffmpeg -f concat -safe 0 -i voices.txt -c copy mix.mp3

If I created a voices.txt file with the structure:

# Comments like this are ignored! But can be useful
file 'somefile.mp3'
file 'anotherfile.mp3'
file 'yetathird.mp3'

All of this is easily generated in my mix routines. You can find it nestled in a toggle box at the bottom labeled “For the Mixer” (I am “the” mixer).
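Generating that text from an ordered list of file paths takes only a few lines. This is a sketch of the idea, not the code in the mixer; the function name concatList is invented here, and it assumes the paths contain no apostrophes, which would need escaping.

```javascript
// Hypothetical helper: turn an ordered list of file paths into the text
// of a voices.txt file in the format shown above.
// Assumes no path contains a single quote.
function concatList(paths) {
  const lines = paths.map((path) => `file '${path}'`);
  return ["# generated mix", ...lines].join("\n") + "\n";
}

console.log(concatList([
  "segment-1/alan-1.mp3",
  "segment-2/wilma-2.mp3",
  "segment-2/felix-2.mp3"
]));
```

Relative paths in the list are read relative to where voices.txt sits, so the file needs to be saved alongside the segment folders.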

This will not do much for anyone without the source audio files (well they are there in github) BUT… if anyone creates a fantastic mix, they can copy the text provided, or one step better, they can download a voices.txt file that has everything. I will do it myself to make a mix.

All I need to do is generate a new mix file for each episode.

That was a lot of work to make and more than that to blog to create a 45 second intro. But oh this is the stuff I enjoy.

Already in Play

On Thursday this week I ran a session at OERcamp (what a great unconference format) where I recorded a conversation with organizers Kristin Hirschmann and Jöran Muuß-Merholz plus a few conference participants. I will hopefully be publishing it next week as a new OEG Voices podcast.

Kristin and Jöran were willing and gracious enough to agree to each record one paragraph of the intro, and their voices are now in the “machine” available for remixing. I used the mixer to make an intro I will toss into the editor, and thus they both appear in it (sneak preview):

Do you want to be in the mix machine? Just send me an audio file recording of the intro (see above) or post a reply to my call for voices in OEG Connect.

Play with the mixer! Have fun.

I’m rather stoked to get this idea out of the brain float zone. And always always always, this process of figuring out these small tools always teaches me more things I use later.

This is still pretty much a tool specifically honed for my need, but quite possibly it could be made more generalized. Maybe SPLOT? There goes the idea puff again.


Feature Image:

Closeup of Music Studio Mixer Console Knobs flickr photo by wuestenigel shared under a Creative Commons (BY) license

oeg-voices-intro-v2.mp3 (489kB)

voicesmix.mp3 (557kB)

