Not a biggie, but in case it helps anyone.
I wanted to download all episodes of the excellent “My Dad Wrote a Porno” podcast for posterity. I couldn’t find any existing way of doing this, so here’s what I ended up doing.
First I found the RSS feed. I noticed that it contains the URL of the actual audio file in enclosure tags.
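For reference, an enclosure entry in the feed looks roughly like this (the length and type attributes below are illustrative, not copied from the feed):

```xml
<enclosure url="https://media.acast.com/mydadwroteaporno/bestofbookfour/media.mp3" length="12345678" type="audio/mpeg"/>
```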
Cool, so I just need to read these for a start. I can do that via curl:

```shell
curl -s http://rss.acast.com/mydadwroteaporno | grep -o '<enclosure url="[^"]*'
```
This gives me all the links thus:
```
<enclosure url="https://media.acast.com/mydadwroteaporno/bestofbookfour/media.mp3
<enclosure url="https://media.acast.com/mydadwroteaporno/mydadwroteachristmasporno3/media.mp3
<enclosure url="https://media.acast.com/mydadwroteaporno/mydadwroteachristmasporno1/media.mp3
```
I was able to extract just the URL by extending the snippet with a second grep that keeps only what follows the opening double quote:
```shell
curl -s http://rss.acast.com/mydadwroteaporno | grep -o '<enclosure url="[^"]*' | grep -o '[^"]*$'
```
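To see what the two greps are doing without hitting the network, here is the same pipeline run against a single hand-written enclosure line (the length and type attributes are made up for the example):

```shell
# A hypothetical enclosure line, in the shape the feed produces
line='<enclosure url="https://media.acast.com/mydadwroteaporno/bestofbookfour/media.mp3" length="12345678" type="audio/mpeg"/>'

# The first grep keeps everything from <enclosure url=" up to (not including)
# the closing quote; the second keeps the trailing run of non-quote characters,
# i.e. just the URL itself.
echo "$line" | grep -o '<enclosure url="[^"]*' | grep -o '[^"]*$'
# prints https://media.acast.com/mydadwroteaporno/bestofbookfour/media.mp3
```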
Now all I needed to do was download each file, renaming “media.mp3” to the episode name taken from the URL path. The following did that:
```shell
for i in $(curl -s http://rss.acast.com/mydadwroteaporno | grep -o '<enclosure url="[^"]*' | grep -o '[^"]*$'); do
  url=$i
  outfile=$(echo "$i" | sed 's|https://media\.acast\.com/mydadwroteaporno/||' | sed 's|/media||')
  wget -q "$url" -O "$outfile"
done
```
I use sed to strip out the domain name and also drop the word “media”. What remains is the part of the path I am interested in.
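The filename transformation can be checked in isolation on one of the URLs from the feed:

```shell
url="https://media.acast.com/mydadwroteaporno/bestofbookfour/media.mp3"

# Strip the scheme/host/show prefix, then drop the "/media" path segment,
# leaving the episode name plus the .mp3 extension.
outfile=$(echo "$url" | sed 's|https://media\.acast\.com/mydadwroteaporno/||' | sed 's|/media||')
echo "$outfile"   # bestofbookfour.mp3
```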