Making a podcast


Goals & excuses

A lot of podcasts release an almost unedited episode every week, but my goal was different. I wanted to hold each episode to the same standard as a conference talk: a useful, self-contained, properly constructed document of something interesting and unique.

I have no prior experience with audio recording or editing, so I’m painfully aware that I’m muddling my way through tools and processes that I don’t understand. But I find it interesting that the technology is forgiving enough to make it possible for me to produce such a thing, even if I’m getting every step wrong.

[Image: an index card with a long todo list on it: “description”, “update transcript”, “pick title”, “section transcript”, “(names)”, “bounce MP3” etc.]

I’m also conscious that parts of this are more labour-intensive than necessary. Sometimes that’s because I’m being needlessly detail-oriented. Other times it’s because I’m doing a laborious task by hand: I know it can be automated, but I suspect that learning how would take even longer. This is frustrating but is also, I think, a normal part of the learning process for anything self-taught.

So none of this is advice, and this is not a tutorial; it’s just what I do when I’m trying my best at a hobby I know nothing about. If I receive any feedback from people who actually know what they’re doing, I’ll update this post to reflect it.


Starting up

I chose to invest a bit of money and time before recording an episode. Most of this wasn’t necessary — if you have access to a computer (or just a phone/tablet) you can make a podcast without any special equipment or preparation — but I wanted to make the quality as good as I could easily achieve, and that meant buying some extra stuff.

Here are all the jobs I can remember doing:

Get the hardware

Get the software

Register the domain

I registered whyarecomputers.com (£12/year) through Gandi.

Make a basic logo

I can’t draw, so I just took a photo of my computer. (Not literally “in the art room”, although it was at BERG, so that’s pretty close.) I ran the photo through an online image glitcher a few times to get several distorted frames, which I compiled into an animated logo with ImageMagick.

Make a basic site

I wrote static HTML and CSS by hand. I already had a nice webfont from Cloud.typography for codon.com, so I used that. And I already had a Linode VPS for codon.com, so I hosted the site there.

Make theme music

It seemed important to have a little bit of music to break up all the talking and provide continuity between episodes. I can’t write music, so I asked my talented friend Martin to do it.

Make a Twitter account

I created the @whyarecomputers account for announcing new episodes.


Recording the episode

Every episode begins as an unedited recording of a conversation. Getting to that point is the least time-consuming part of the process, but there is a bit of work involved:

Find a guest

This isn’t active work, but it takes forever because my technique is to do nothing and wait for someone good to fall into my lap. For example, almost three years elapsed between episode 1 and episode 2 while I waited for a suitable guest to materialise.

Once I’ve decided on a guest, I contact them and persuade them to come on the show.

Organise a venue if necessary

I try to arrange for us to record in person if that’s practical, because in-person conversations are more relaxed and natural than Skype calls. They’re also easier to edit because the low latency reduces the risk of people talking over each other. (The downside is spill: each microphone also picks up the other person’s voice.)

To record together we need a venue that’s comfortable, quiet and not too echoey. For episode 1 we used the now-defunct Makeshift Shedscraper, and episode 4 was recorded at my flat in Hackney.

Schedule the recording

I want the finished episode to be roughly an hour long. When scheduling the recording I build in lots of slack time for unrushed pre-show warm-up chat, technical difficulty resolution, digressions, breaks, and post-show wind-down chat. A total of two hours is ideal.

I live in London, so scheduling Skype calls with guests in North America can be difficult. The obvious solution is for me to be flexible; episode 2 was recorded from midnight to 2am London time.

Prepare the guest

The guest may have no idea what to expect, so it’s important to proactively tell them anything they need to know rather than wait for them to ask.

If the guest is remote, I ask them to record the audio from their microphone and give them instructions if necessary. I can’t control the quality of the microphone they use, but getting a clean recording from it can at least eliminate Skype-related audio problems.

I also ask remote guests to take a photo of themselves in their recording environment, because I can't take one myself. I’ll use this later to make a header image for the episode page.

Being recorded for an audience is intimidating. I reassure the guest by telling them that:

Prepare myself

This is the first time-consuming task. My primary contribution to the podcast is to have a rough idea of what is going to happen so that the guest can relax and not worry about it.

The more I prepare in advance, the less effort is required during recording. I want the conversation to flow naturally, but ideas chosen at leisure tend to be better than those which occur in the moment, so I like to have the overall structure in mind before we begin. A solid plan creates a more inviting space for spontaneity.

I read about what the guest has done. I already know a bit about them, but I run through their site, blog or Twitter feed and make sure I’m properly aware of everything they’ve said or made recently. I think about everything I’ve found. I take notes about what to ask them and what to say. I look for themes in their work and decide which ones to steer them towards. I look for topics that fit together to make a rough narrative.

Although I would rather have a conversation than an interview, I write straightforward questions as a fallback in case the conversation tails off naturally.

I also write what I’m going to say in the introduction. “Write” is a bit grand — the intro tends to be a combination of boilerplate and facetious nonsense — but it’s better for the guest’s confidence if I begin with something I’ve finalised beforehand rather than a half-hearted ad-lib.

Do the recording

As above, I record with Audacity on my computer. If we’re on Skype, I ask them to do the same (or use QuickTime Player if they prefer) while I record both ends of the call with Audio Hijack for sync and backup purposes.

If the guest joins me in person, I take a photo of them to use as a header image.

I split the recording time between general chat to make us both feel comfortable, and asking questions to get them to talk about specific topics. (Only the latter ends up in the episode.)


Making the MP3 and transcript

Once the recordings have been made, I have to turn them into a finished episode that I can release. This takes the most time by far.

Edit 1: noise removal

The goal of the first edit is to gather all the audio tracks and remove background noise from them.

If the guest recorded themselves, I ask them to send me their audio; if I recorded two microphones myself as a single stereo track, I use Audacity’s “Split Stereo to Mono” to split it into separate mono tracks first.

I open each track in Audacity, use the noise profiler to analyse a section that is just background noise, then use the noise reduction tool to remove that noise from the entire track. I export each track from Audacity as a mono WAV, and that’s the last time Audacity is involved.

(I expect Logic Pro X can do better noise reduction than Audacity, but I don’t know how.)

Edit 2: raw

The goal of this edit is to make a single audio file that I can transcribe.

I make a new Logic project and import each speaker’s audio as a separate track.

If the tracks were recorded separately, the first job is to sync them up. Unfortunately this is more difficult than just synchronising them at the beginning, because audio recorded simultaneously on different computers will run at slightly different speeds and gradually drift out of sync. The effect is tiny but becomes significant after an hour or two of cumulative drift.

To fix this, I import the reference audio from the Skype call and use it to manually line up the other tracks at the beginning of the recording. I then skip to the end and use Logic’s time stretch to make each track longer or shorter so that it matches the reference track. This is fiddly and imprecise — and slow, because it takes Logic a long time to re-render the stretched region before I can check it sounds good — but after a few tries I can usually reduce the drift to an acceptable level. Getting it right was especially challenging for episode 3, which had two remote guests. I delete the Skype track once the sync is fixed.

(Marco Arment has written a tool called Sidetrack that fixes drift automatically by splitting tracks at silence and moving them around to match the reference track, but he hasn’t released it yet.)
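For illustration only, here is a rough sketch of the arithmetic behind that stretch. It is not what Logic or Sidetrack do internally. It assumes the same two events (say, a clap near the start and a cough near the end) can be located by ear in both the reference recording and the local track; the function name and all the timestamps are made up.

```python
def stretch_factor(ref_start, ref_end, track_start, track_end):
    """Factor by which a track must be lengthened (>1) or shortened (<1)
    so that its two marker events end up the same distance apart as they
    are in the reference recording. All times are in seconds."""
    ref_span = ref_end - ref_start
    track_span = track_end - track_start
    if ref_span <= 0 or track_span <= 0:
        raise ValueError("end markers must come after start markers")
    return ref_span / track_span


# Hypothetical marker positions, found by ear in each recording.
ref_start, ref_end = 12.0, 7212.0        # reference (Skype) recording
track_start, track_end = 3.5, 7203.14    # guest's local recording

factor = stretch_factor(ref_start, ref_end, track_start, track_end)
drift_ms = ((ref_end - ref_start) - (track_end - track_start)) * 1000
print(f"stretch by a factor of {factor:.6f} to absorb {drift_ms:.0f} ms of drift")
```

A factor this close to 1 shows why the drift is inaudible at first and only becomes a problem towards the end of a long recording.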

Once the tracks are synced, I cut off any chat from the beginning and end that is unlikely to make it into the final episode. I pan each track to a slightly different place in the stereo image (usually me at -16 and the guest at +16), adjust the gain of each so that they roughly match, and add noise gates to filter out intermittent noises.

I finish by bouncing the whole project as a stereo WAV.
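The gain matching above is done by ear, but the idea can be sketched numerically: measure each track's RMS level in decibels and take the difference. This is only an illustration with made-up sample values and hypothetical function names, and a real measurement would want to look only at passages where the person is actually speaking.

```python
import math


def rms_dbfs(samples):
    """RMS level of float samples (full scale = 1.0), in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def matching_gain_db(reference, track):
    """Gain in dB to apply to `track` so that its RMS level equals `reference`'s."""
    return rms_dbfs(reference) - rms_dbfs(track)


# Made-up sample values standing in for two speakers' audio.
host = [0.5, -0.5, 0.5, -0.5]
guest = [0.25, -0.25, 0.25, -0.25]
print(f"raise the guest by {matching_gain_db(host, guest):.1f} dB")
```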

Transcribe

I transcribe the raw edit as a plain text file. Making a transcript takes a long time, but it’s important for two reasons:

I play the raw edit in VLC, pausing and rewinding it with my keyboard’s media keys while I type words into Vim. This is fairly frustrating because it’s easy to hit the wrong key and lose the playback position, but I don’t know a better tool for it.

At this stage I don’t waste any time on formatting or links, although I do note changes of speaker by starting a new line with their initials (e.g. “TS:”).
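Because every change of speaker starts a new line with initials and a colon, a transcript in this shape is easy to process later. A minimal sketch, assuming initials are two to four capital letters and that a line without initials continues the previous speaker's turn; “AB” and the sample text are invented.

```python
import re

# A line such as "TS: some words" starts a new turn.
SPEAKER_LINE = re.compile(r"^([A-Z]{2,4}):\s*(.*)$")


def parse_turns(text):
    """Split a plain-text transcript into (initials, speech) tuples."""
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = SPEAKER_LINE.match(line)
        if match:
            turns.append([match.group(1), match.group(2)])
        elif turns:
            # No initials: treat the line as a continuation of the current turn.
            turns[-1][1] = (turns[-1][1] + " " + line).strip()
        else:
            raise ValueError(f"text before the first speaker: {line!r}")
    return [tuple(turn) for turn in turns]


sample = """TS: Hello and welcome.
AB: Thanks for having me.
It's nice to be here.
"""
for initials, speech in parse_turns(sample):
    print(initials, "->", speech)
```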

Read and think

Until now I won’t have spent any time reflecting on the conversation. The recording itself passes in a bit of a blur, and when I’m transcribing I pay attention to individual words rather than the conversation’s content or large-scale structure.

So now I read the transcript properly. The main goal is to make all the editorial decisions about which parts of the recording I’m going to keep and what order they’re going to go in; a secondary goal is to collect other bits of information that will be useful later.

I think about topics and themes, and take notes about which sections to rearrange or remove entirely to improve pacing and focus. I often backtrack on those decisions when an apparently irrelevant digression turns out to become useful later on. Fortunately it’s much easier to spot structural issues in text than in audio.

While I’m at it, I note down interesting phrases that could serve as either the episode title or the quote I include in the announcement tweet.

[Image: a monitor with many Post-it notes attached: “seemingly disconnected things”, “pick ideas out of a bag”, “it seemed like an interesting project”, “huge amounts of curiosity” etc.]

I also take notes about sections that could become self-contained chapters within the episode, and make a list of outtakes that could be used as a funny cold open.

Edit 3: rough

The goal of this edit is to finish the speech parts of the episode. This is the main editing job, and is extremely time-consuming.

I play through the recording in Logic from beginning to end, editing as I go. I make large structural edits according to my notes from the transcript: removing and rearranging sections for pacing and focus, removing my prompts and comments as much as possible, and closing big gaps due to breaks or small talk.

[Image: Logic Pro X showing a large number of separate audio regions on a timeline.]

I also make a huge number of tiny manual edits, for which the Wacom tablet is pretty much essential:

The pointlessly meticulous process of making this rough edit takes days and is exhausting, so I usually have a break before I carry on.

Update transcript

Now the transcript is out of date, so I listen back to the rough edit and update the transcript as I go. Since this version of the transcript should be final, I also take the opportunity to add some HTML (mostly paragraphs and links), expand the speaker initials into names, and put section breaks where the chapter boundaries will be.

This listen-through always catches several problems with the audio or the original transcription or both, so I flip between Logic and Vim to fix things as I listen.
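Expanding initials into names and wrapping each turn in a paragraph is the kind of step that could be scripted. The sketch below is an assumption about how that might look, not the markup the site actually uses: the `<p>` and `<b>` structure, the function name, the “AB” initials and the placeholder names are all hypothetical, and links would still have to be added by hand afterwards.

```python
import html


def turns_to_html(turns, names):
    """Render (initials, speech) pairs as one HTML paragraph per turn,
    replacing each speaker's initials with their full name."""
    paragraphs = []
    for initials, speech in turns:
        name = names[initials]  # KeyError here means an unknown speaker
        paragraphs.append(
            f"<p><b>{html.escape(name)}:</b> {html.escape(speech)}</p>"
        )
    return "\n".join(paragraphs)


# Placeholder names and text, purely for illustration.
names = {"TS": "Host Name", "AB": "Guest Name"}
turns = [("TS", "Hello and welcome."), ("AB", "Fish & chips, please.")]
print(turns_to_html(turns, names))
```

Escaping the text first means a stray `&` or `<` in the transcript cannot break the page.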

Edit 4: final

Once the transcript matches the rough edit, I put the final pieces of polish on the audio. I add the cold open, the intro and outro music, and the musical breaks between the chapters.

I bounce the finished project as an MP3, and that’s the last time Logic is involved.

Add metadata

The audio is now finished, but the MP3 needs appropriate ID3 metadata so that it shows up correctly in podcast clients.

I pick a final title from the interesting phrases I noted while reading the transcript, write a description for the episode, and come up with descriptive titles for each of the chapters.

I import the MP3 into Podcast Chapters, add the chapter times and names, and export the file again.

Then I import it into iTunes, open the “Get Info” view, and add the rest of the metadata (cover art, title, description, artist, genre, track number). By using “Show in Finder” I find the underlying MP3 file and copy it out of my iTunes library into the Git repository for the whyarecomputers.com site.


Publishing the episode

All that remains is to actually release the episode to the world. Unsurprisingly this involves a few little jobs, mostly because for some reason I maintain the whyarecomputers.com site by hand rather than use a CMS.

Make the episode page

The new episode gets its own page on whyarecomputers.com. I copy the HTML and CSS from the previous episode’s page and remove all the content, leaving only the boilerplate and empty structure.

Embed the MP3 into the episode page

Although it’s not the ideal way to listen to a podcast, I want anyone who lands on the episode page to be able to start playing the audio immediately in their browser. I put a simple <audio> tag in the page and point it at the MP3, which is enough to make browsers show a simple inline audio player.

Add title and description to the episode page

I’ve already picked a title and written a description for the ID3 metadata, so I paste those into the episode page, adding HTML (i.e. links) if necessary.

Add title and description to the front page

I have to remember to add an entry on the site’s front page too, so that people can find the new episode from there.

Update navigation on the previous episode’s page

Each episode page has “previous” and “next” links, so when a new episode is released, I add a “next” link to the previous episode’s page.

Update the RSS feed

The site’s RSS feed is what podcast aggregators use to discover the new episode and notify subscribers about it, so it’s very important to get this right. I add a new entry containing the episode’s title, description, publication time and file size.
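To make the shape of such an entry concrete, here is a hedged sketch that builds one RSS `<item>` with Python's standard library. The feed itself is edited by hand, so this is only an illustration: the function name, the example URL and all the values are invented, and a real podcast feed usually carries extra iTunes-specific elements that are left out here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_item(title, description, published, mp3_url, mp3_bytes):
    """Build a minimal RSS 2.0 <item> element for one episode."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "description").text = description
    # RSS expects RFC 822 style dates, e.g. "Thu, 02 Jan 2020 03:04:05 +0000".
    ET.SubElement(item, "pubDate").text = format_datetime(published)
    ET.SubElement(item, "guid").text = mp3_url
    # The enclosure carries the MP3's URL, its size in bytes and its MIME type.
    ET.SubElement(item, "enclosure", url=mp3_url,
                  length=str(mp3_bytes), type="audio/mpeg")
    return item


# Invented values for illustration.
item = build_item(
    title="An example episode",
    description="A conversation about something interesting.",
    published=datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    mp3_url="https://example.com/episodes/example.mp3",
    mp3_bytes=58123456,
)
print(ET.tostring(item, encoding="unicode"))
```

In practice the file size would come from something like `os.path.getsize` on the finished MP3 rather than being typed in.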

Validate the RSS feed

To check I haven’t made any mistakes while hand-editing the RSS feed, I run it through the Feed Validator. This is stricter about correctness than any podcast aggregator I know, so I tend to ignore its more pedantic warnings, but it lets me see immediately whether I’ve made any serious errors.
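A quick local check can catch the crudest hand-editing mistakes before the feed goes anywhere near a validator. The sketch below is far less thorough than the Feed Validator and is not a substitute for it; the list of required elements is an assumption based on the fields mentioned above, and the sample feeds are invented.

```python
import xml.etree.ElementTree as ET

# Assumed minimum for each episode entry; not an official list.
REQUIRED = ("title", "description", "pubDate", "enclosure")


def feed_problems(feed_xml):
    """Return a list of human-readable problems found in an RSS feed string."""
    try:
        root = ET.fromstring(feed_xml)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]
    problems = []
    for number, item in enumerate(root.iter("item"), start=1):
        for tag in REQUIRED:
            if item.find(tag) is None:
                problems.append(f"item {number}: missing <{tag}>")
        enclosure = item.find("enclosure")
        if enclosure is not None and not enclosure.get("length", "").isdigit():
            problems.append(f"item {number}: enclosure length is not a whole number")
    return problems


good_feed = """<rss version="2.0"><channel><title>Example</title>
<item><title>Episode</title><description>Words</description>
<pubDate>Thu, 02 Jan 2020 03:04:05 +0000</pubDate>
<enclosure url="https://example.com/a.mp3" length="123" type="audio/mpeg"/></item>
</channel></rss>"""

bad_feed = """<rss version="2.0"><channel><title>Example</title>
<item><title>Episode</title><description>Words</description>
<enclosure url="https://example.com/a.mp3" length="big" type="audio/mpeg"/></item>
</channel></rss>"""

print(feed_problems(good_feed))
print(feed_problems(bad_feed))
```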

Add a header image to the episode page

I scale and crop the photo of the guest and add it to the top of the episode page.

Add the transcript to the episode page

Once everything else is ready, the transcript makes up the bulk of the content on the page.

Add chapter titles, links and metadata to the episode page

This is a general finishing-up pass. I add the chapter titles to the transcript and link to them from the table of contents. At the top of the page I add an MP3 download link, subscription links to the RSS feed and iTunes page, links to my and the guest’s Twitter accounts or home pages, a music credit and link, the publication time and recording date, and contact links for Twitter and email.

Publish site changes

The whyarecomputers.com site lives in a Git repository on my local machine, so publishing it is a simple git push web master.

Refresh the iTunes feed

For the very first episode, I used Apple’s Podcasts Connect to submit the podcast to the iTunes Store. I couldn’t do this any earlier because Apple requires an RSS feed containing at least one episode in order to create a listing for the podcast.

For each subsequent episode, I just log into Podcasts Connect and press the “Refresh Feed” button to tell Apple to check the RSS feed. (Doubtless this would happen automatically on some kind of schedule, but I like to proactively trigger it.)

I also habitually look at the Overcast directory page for the podcast to check that the new episode is showing up there. I don’t think this does anything, but it gives me confidence that everything is working.

Tweet about it

I pick another quote from the episode and tweet it from the @whyarecomputers account with a link to the episode page. I usually wait a day or two to retweet this announcement from my personal account to reduce the chances of people missing it.

The end

And that’s it! Until the next guest occurs to me…


Conclusion

Thanks for reading all this. I hope you enjoy the podcast — despite the above, I do really enjoy making it and feel privileged that people take the time to listen.

If you’d like to support this ridiculous project or encourage me to make more episodes, by far the most helpful thing would be to rate or review it on iTunes, because that’s how random people on the internet find podcasts. Or just tell your friends!