Taming Your Music Collection: Filename To ID3

Extracting The ID3 Info
     With everything in place to determine which files need ID3 info, we can start on the functionality for extracting that data from the file structure. To do this, we'll use regular expressions, specifically Python's re module. You can learn all about regular expressions in Python at http://docs.python.org/library/re.html.

import sys, os, fnmatch, shutil, argparse, re
from mutagen.easyid3 import EasyID3
from mutagen.id3 import ID3, TIT2

def getTrackNumber(track):
    if track.find("/") > -1:
        return getTrackNumber(track[0:track.find("/")])
    elif track.isdigit():
        return track.zfill(2)

def getID3FromFilename(file, id3info):
    filename = os.path.basename(file)
    m = re.match(r"(?P<trackNumber>\d{2})\. (?P<artist>((?! -).)+) - (?P<title>[^\.]+)\.mp3", filename)

    # Skip files whose names don't follow the "NN. Artist - Title.mp3" pattern
    if m is None:
        return

    id3info["tracknumber"] = m.group('trackNumber')
    id3info["artist"] = m.group('artist')
    id3info["title"] = m.group('title')
    # The album name is taken from the file's parent directory
    id3info["album"] = file.split('/')[-2]

    print(id3info)

def move(output, file, dirMatch):
    # Check if there are ID3 tags to begin with; if not, mutagen will complain
    try:
        tag = ID3(file)
    except Exception:
        # No ID3 header yet: create one with a placeholder title and save it
        tag = ID3()
        tag.add(TIT2(encoding=3, text=["Title"]))
        tag.save(file)

    id3info = EasyID3(file)

    try:
        _trackNumber = getTrackNumber(id3info["tracknumber"][0])
        _artist = id3info["artist"][0]
        _title = id3info["title"][0]
        _album = id3info["album"][0]
    except KeyError:
        # A tag is missing: fall back to the filename when tagging in place
        if dirMatch:
            getID3FromFilename(file, id3info)
        # Nothing is saved yet in this step, so leave the file where it is
        return

    outputDir = output + "/" + _artist + "/" + _album + "/"
    outputFile = _trackNumber + ". " + _artist + " - " + _title + ".mp3"

    if not os.path.exists(outputDir):
        os.makedirs(outputDir)

    shutil.move(file, outputDir + outputFile)

def main():
    parser = argparse.ArgumentParser()

    parser.add_argument('-d', '--directory', nargs=1, required=True, help='')
    parser.add_argument('-o', '--output', nargs=1, required=True, help='')

    args = parser.parse_args()

    directory = os.path.abspath(args.directory[0])
    output = os.path.abspath(args.output[0])

    dirMatch = (directory == output)

    for root, subFolders, filenames in os.walk(directory):
        for filename in fnmatch.filter(filenames, '*.mp3'):
            move(output, os.path.join(root, filename), dirMatch)

if __name__ == "__main__":
    main()
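     The track-number helper in the listing is easy to try out on its own. ID3 track fields are often stored as "track/total" (for example "7/12"), which is why the helper cuts everything after the slash before zero-padding. Below is a minimal standalone sketch of the same idea; the name get_track_number and the explicit None for unparseable input are my own choices for illustration, not part of the script above.

```python
def get_track_number(track):
    """Return a zero-padded track number, or None if it cannot be parsed.

    Hypothetical standalone restatement of the getTrackNumber helper.
    """
    # Track fields are often stored as "track/total", e.g. "7/12":
    # keep only the part before the slash and try again.
    if "/" in track:
        return get_track_number(track.split("/", 1)[0])
    # Plain digits are padded to two characters: "3" becomes "03".
    if track.isdigit():
        return track.zfill(2)
    # Anything else (empty string, "A1", ...) is reported as unparseable.
    return None


if __name__ == "__main__":
    for raw in ("7/12", "3", "10", "A1"):
        print(repr(raw), "->", get_track_number(raw))
```

     With these inputs, "7/12" becomes "07", "3" becomes "03", "10" stays "10", and "A1" yields None, so a caller can decide what to do with a track number that can't be parsed.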

$ ./mp3-tagger.py -d /Music/ -o /Music/
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'10'], 'title': ['Arriving Abroad'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'06'], 'title': ['Jubilee'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'03'], 'title': ['Last Song Home'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'02'], 'title': ['October Night in B Minor'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'07'], 'title': ['Tea For Two'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'09'], 'title': ['Bliss & Disguise'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'04'], 'title': ['Downfall'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'01'], 'title': ['Untameable Fire'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'08'], 'title': ['Poem of Haste & Rest'], 'artist': [u'Pierre Langer']}
{'album': [u'Acoustic Guitar Vol 1'], 'tracknumber': [u'05'], 'title': ['A Little Chronology'], 'artist': [u'Pierre Langer']}
     Success! We were able to extract all of the necessary information from the file path using regular expressions to fill in the ID3 tags we wanted. Regular expressions can be a little tricky to master, so I'll explain in detail what my regular expression is doing. Hopefully that helps if your file structure and preferred filename pattern differ from mine. Let's get started!
  • [trackNumber]: A named capture group. [\d{2}]
    • Any digit, exactly 2 repetitions
  • \.
    • Literal .
    • Space
  • [artist]: A named capture group. [((?! -).)+]
    • [3]: A numbered capture group. [(?! -).], one or more repetitions
      • (?! -).
        • Negative lookahead: match only if the text that follows is not [ -]
          • -
            • Space
            • -
        • Any character
  • -
    • Space
    • -
    • Space
  • [title]: A named capture group. [[^\.]+]
    • Any character that is NOT in this class: [\.], one or more repetitions
  • \.mp3
    • Literal .
    • m
    • p
    • 3
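     To see the pieces of the breakdown working together, here is a small sketch that runs the same pattern against a few filenames without touching any real files. The parse_filename helper and the non-matching sample names ("Some Artist", "Mr. Blue") are made up for illustration; only the pattern itself comes from the script.

```python
import re

# Same pattern as in getID3FromFilename.
PATTERN = re.compile(
    r"(?P<trackNumber>\d{2})\. (?P<artist>((?! -).)+) - (?P<title>[^\.]+)\.mp3"
)


def parse_filename(filename):
    """Return the captured fields as a dict, or None if the name doesn't match.

    Hypothetical helper, used here only to exercise the pattern.
    """
    m = PATTERN.match(filename)
    if m is None:
        return None
    return {
        "tracknumber": m.group("trackNumber"),
        "artist": m.group("artist"),
        "title": m.group("title"),
    }


if __name__ == "__main__":
    samples = [
        "01. Pierre Langer - Untameable Fire.mp3",  # follows the pattern
        "Untameable Fire.mp3",                      # no track number or artist
        "03. Some Artist - Mr. Blue.mp3",           # title contains a period
    ]
    for name in samples:
        print(name, "->", parse_filename(name))
```

     Only the first name matches. The last one is worth noticing: because the title group is [^\.]+, a title that contains a period never reaches the ".mp3" extension, so the whole match fails. If your collection has titles like that, the title part of the pattern is the piece to adjust.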
I'll translate the regular expression into English for those who don't speak in regular expressions.

Regex                    English
(?P<trackNumber>\d{2})   Match exactly 2 digits at the beginning of the string and name the match 'trackNumber'
\.                       Match a literal period character followed by a space
(?P<artist>((?! -).)+)   Match any characters until reaching a space followed by a dash (-) and name the match 'artist'
-                        Match a space followed by a dash (-) followed by a space
(?P<title>[^\.]+)        Match any characters until reaching a literal period and name the match 'title'
\.mp3                    Match the extension ".mp3"
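     If you'd rather keep that translation next to the pattern itself, Python's re.VERBOSE flag lets you spread a regular expression over several lines and comment each piece. The sketch below is one way to write the same pattern in that style; the name VERBOSE_PATTERN is just for illustration. One assumption to be aware of: in verbose mode unescaped whitespace is ignored, so every literal space is written as [ ] here.

```python
import re

# Commented rewrite of the filename pattern. Whitespace is ignored in
# verbose mode, so each literal space is spelled as the class [ ].
VERBOSE_PATTERN = re.compile(
    r"""
    (?P<trackNumber>\d{2})     # exactly two digits, named trackNumber
    \.[ ]                      # a literal period followed by a space
    (?P<artist>((?![ ]-).)+)   # characters up to the next space-dash, named artist
    [ ]-[ ]                    # space, dash, space
    (?P<title>[^\.]+)          # characters up to the next period, named title
    \.mp3                      # the .mp3 extension
    """,
    re.VERBOSE,
)


if __name__ == "__main__":
    m = VERBOSE_PATTERN.match("09. Pierre Langer - Bliss & Disguise.mp3")
    # groupdict() returns only the named groups.
    print(m.groupdict() if m else None)
```

     The commented form should match exactly the same filenames as the one-line version; the trade-off is a few extra lines in exchange for a pattern that is much easier to adapt to a different naming scheme.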
     Since we've successfully tested the functionality for extracting the ID3 info, we can move on to our final step: Save the ID3 data to the file and enjoy seeing it pop up on our music player of choice, properly formatted and lookin' like a million bucks.