First pass parsing MIDI

Sample MIDI

So, what is this MIDI stuff anyway? I touched on it when I said specifications can be fun, and it’s about time I dove further into this rather old technology.

A communications protocol for musical instruments

At its core, the title says it all. It’s a serial protocol that can be driven by a UART at 31250 baud.

For a remote controller, such as a simple instrument, you’d probably need only NOTE_ON and NOTE_OFF. Each of these two commands takes two additional parameters, which determine which note to play and at what velocity.
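To make that concrete, here is a minimal sketch of how such a three-byte message could be put together. The helper name makeNoteMessage is made up for this post; the status values 0x80 and 0x90 and the 7-bit data bytes are the standard ones.

```cpp
#include <array>
#include <cstdint>

// Status nibbles for the two commands; the low nibble carries the channel (0-15).
constexpr std::uint8_t NOTE_OFF = 0x80;
constexpr std::uint8_t NOTE_ON  = 0x90;

// Hypothetical helper: build a three-byte message of status, note number, velocity.
// Data bytes are masked to 7 bits because their top bit must be clear.
std::array<std::uint8_t, 3> makeNoteMessage(std::uint8_t status, std::uint8_t channel,
                                            std::uint8_t note, std::uint8_t velocity) {
    return { static_cast<std::uint8_t>(status | (channel & 0x0F)),
             static_cast<std::uint8_t>(note & 0x7F),
             static_cast<std::uint8_t>(velocity & 0x7F) };
}
```

Calling makeNoteMessage(NOTE_ON, 0, 60, 100) gives the bytes 0x90 0x3C 0x64, which is middle C on the first channel. Those three bytes are what you would push out over the UART.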

If you want to make your own little MIDI controller, there are plenty of things you can google but I think “Arduino Midi Controller” will get you very far! Good luck, there’s a lot of fun stuff ahead.

A file format used to store sequential events for triggering

Believe me, I spent a couple of hours online looking at various attempts to explain the MIDI file format. While it really is simple, there seem to be no good explanations for it. So in order to understand it, I made a well-known test MIDI file. You can see this file in the topmost image, and I got it done with the help of three or four resources. Linked here for your reference:

Now, let’s have a look at my red marker handiwork:

Hand parsed MIDI

Not much in that image makes sense, I get that. Selfish as I am, though, this post is mainly intended for myself in one or two months’ time. As I’ll probably be dumber then than you are now, let me try to make sense of all this!

File format; MIDI 0

Header chunk

The header is always the first 14 bytes.

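[M T h d L1 L2 L3 L4 F1 F2 N1 N2 T1 T2]

  • Lx is length, always 0 0 0 6
  • Fx is format, one of 0, 1 or 2 (stored in two bytes, so 00 00, 00 01 or 00 02)
  • Nx is the number of tracks, always 00 01 for MIDI 0
  • Tx is time division, MIDI ticks per quarter note (when the top bit of T1 is clear).
    • division = (T1 << 8) | T2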
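Here is a rough sketch of reading those 14 bytes. It assumes the standard layout of "MThd", a 4-byte length, then three big-endian 16-bit fields (format, number of tracks, division). The names MidiHeader, readU16 and parseHeader are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct MidiHeader {              // hypothetical struct name
    std::uint16_t format;        // 0, 1 or 2
    std::uint16_t trackCount;    // 1 for a MIDI 0 file
    std::uint16_t division;      // ticks per quarter note when bit 15 is clear
};

// Read a big-endian 16-bit value from two consecutive bytes.
static std::uint16_t readU16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Returns false if the buffer is too short or is not a valid header chunk.
bool parseHeader(const std::uint8_t* data, std::size_t size, MidiHeader& out) {
    if (size < 14 || std::memcmp(data, "MThd", 4) != 0) return false;
    // Bytes 4..7 hold the chunk length, which is always 0 0 0 6 here.
    if (data[4] != 0 || data[5] != 0 || data[6] != 0 || data[7] != 6) return false;
    out.format     = readU16(data + 8);
    out.trackCount = readU16(data + 10);
    out.division   = readU16(data + 12);
    return true;
}
```

Feeding it a header that ends in 03 c0 yields a division of 960 ticks per quarter note.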

After the header comes the track chunk (for MIDI 0 there is always exactly 1 track).

The track chunk and its data

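[M T r k L1 L2 L3 L4 { variable_length, event }, … ]

  • Lx is the length of the track data in bytes, stored most significant byte first like the header length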

The variable length is a way to pack integers larger than 127 into several bytes. There are just a few things to note. Every byte except the least significant one must have its most significant bit set. The least significant byte, which comes last, must not have its most significant bit set. This means that a time of variable_length = 0x83 0x8c 0x08 will become:

time = ((0x83 & 0x7F) << 14) | ((0x8c & 0x7F) << 7) | 0x08 = 50696

We use a bitwise AND with 0x7F, or 0b01111111, to get rid of that most significant bit. Then we shift by 7 rather than 8 because there is no 8th bit of information in these variable length values.

(currentByte & 0x80) == 0x80 means there are more bytes to come: we mask away the highest bit, shift the value accumulated so far left by 7, add the new 7 bits and continue until we reach a byte which does not satisfy this test. That final byte contributes its 7 bits as well.

0 is a perfectly valid length as a variable length entry.
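The loop described above can be sketched like this. The function name readVariableLength and the use of a std::vector plus a position index are my own choices for illustration, not anything the format dictates.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode one variable-length quantity starting at `pos`.
// Advances `pos` past the bytes consumed and returns the decoded value.
std::uint32_t readVariableLength(const std::vector<std::uint8_t>& data, std::size_t& pos) {
    std::uint32_t value = 0;
    while (pos < data.size()) {
        const std::uint8_t current = data[pos++];
        value = (value << 7) | (current & 0x7F);   // keep only the low 7 bits
        if ((current & 0x80) == 0) break;          // top bit clear: this was the last byte
    }
    return value;
}
```

For the bytes 0x83 0x8c 0x08 this returns 50696 and leaves pos at 3. A single 0x00 byte returns 0 and consumes one byte.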

These events can be divided into MIDI events, meta events and sysex events. Let’s assume you’ve read one byte at the beginning of a track chunk data entry. You already know the length of the entire data string from the chunk length that follows MTrk.

I will sort them by their identifier, then tell you how many bytes of information they are made of.

  • MIDI event (0x80 to 0xEF)
    • Beginning with 0xC0, 0xD0 -> read one more byte of data
    • Beginning with 0x80, 0x90, 0xA0, 0xB0, 0xE0 -> read two more bytes of data
  • Meta event (0xFF)
    • Read the type byte, then parse the variable length -> read in length bytes
  • Sysex event (0xF0 || 0xF7)
    • Parse the variable length -> read in length bytes
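The list above could be turned into code roughly like this. The names EventKind, classify and midiDataBytes are hypothetical. The sketch only looks at the first byte of an event and does not deal with running status, where a MIDI event reuses the previous status byte.

```cpp
#include <cstdint>

enum class EventKind { Midi, Meta, Sysex, Unknown };   // hypothetical names

// Classify an event by its first (status) byte.
EventKind classify(std::uint8_t status) {
    if (status == 0xFF) return EventKind::Meta;
    if (status == 0xF0 || status == 0xF7) return EventKind::Sysex;
    if (status >= 0x80 && status <= 0xEF) return EventKind::Midi;
    return EventKind::Unknown;   // a data byte or something this sketch does not handle
}

// Number of data bytes that follow a MIDI event status byte (0x80 to 0xEF).
int midiDataBytes(std::uint8_t status) {
    const int kind = status & 0xF0;   // the high nibble selects the message type
    return (kind == 0xC0 || kind == 0xD0) ? 1 : 2;
}
```

For meta and sysex events the payload size is not fixed, so the caller would decode the variable length that follows instead of calling midiDataBytes.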

MIDI ticks and MIDI clock

Let’s say we have a MIDI file that’s made at 60 BPM and we get the following events:

00 ff 51 03 0f 42 40

  • This message has a delta tick == 00
  • We know it’s a meta event because of the ff
  • The next byte 51 indicates a set tempo event
  • It has a payload length of 03
  • 0x0f4240 = 1000000 microseconds per quarter note
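Turning those three payload bytes into a number is just a matter of shifting them together, most significant byte first. A small sketch, with decodeTempo being a name I made up:

```cpp
#include <cstdint>

// Combine the three payload bytes of a set tempo event (ff 51 03)
// into microseconds per quarter note, most significant byte first.
std::uint32_t decodeTempo(std::uint8_t b0, std::uint8_t b1, std::uint8_t b2) {
    return (static_cast<std::uint32_t>(b0) << 16) |
           (static_cast<std::uint32_t>(b1) << 8)  |
            static_cast<std::uint32_t>(b2);
}
```

decodeTempo(0x0f, 0x42, 0x40) gives 1000000.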

00 ff 58 04 04 02 18 04

  • This message has a delta tick == 00
  • We know it’s a meta event because of the ff
  • The next byte 58 indicates a time signature event
  • It has a payload length of 04
  • Its time signature numerator is 04
  • Its time signature denominator is 2 ^ 02 = 4
  • It has 0x18 = 24 MIDI clocks per metronome tick
  • It has 04 notated 1/32 notes per 24 MIDI clocks (one MIDI quarter note)
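The four payload bytes can be unpacked as in the sketch below. The struct and its field names are hypothetical. The only real work is expanding the denominator from its power-of-two form.

```cpp
#include <cstdint>

struct TimeSignature {                       // hypothetical struct and field names
    std::uint8_t  numerator;
    std::uint32_t denominator;               // already expanded from the power of two
    std::uint8_t  clocksPerMetronomeTick;
    std::uint8_t  thirtySecondsPerQuarter;
};

// Unpack the four payload bytes of a time signature event (ff 58 04).
TimeSignature decodeTimeSignature(const std::uint8_t payload[4]) {
    TimeSignature ts;
    ts.numerator               = payload[0];
    ts.denominator             = 1u << payload[1];   // 2 ^ payload[1]
    ts.clocksPerMetronomeTick  = payload[2];
    ts.thirtySecondsPerQuarter = payload[3];
    return ts;
}
```

With the payload 04 02 18 04 this gives a 4/4 time signature with 24 MIDI clocks per metronome tick.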

MIDI tick calculation

From the MThd we get the division: ticks per quarter note. For the 60 BPM example, you’d typically get 0x03c0, which is 960 decimal.

A MIDI tick occurs every MicrosecondsPerQuarterNote / TicksPerQuarterNote microseconds. For us, this would mean:

tickEveryMicroseconds = 1000000.0 / 960.0;

Approximately 1041.7 microseconds per MIDI tick.
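In code I would do this division in floating point so the fraction is not thrown away. A small sketch, with both function names being my own:

```cpp
#include <cstdint>

// Duration of one MIDI tick in microseconds.
double microsecondsPerTick(std::uint32_t microsecondsPerQuarterNote,
                           std::uint16_t ticksPerQuarterNote) {
    return static_cast<double>(microsecondsPerQuarterNote) / ticksPerQuarterNote;
}

// Convert a number of delta ticks into microseconds at the given tempo and division.
double ticksToMicroseconds(std::uint32_t ticks,
                           std::uint32_t microsecondsPerQuarterNote,
                           std::uint16_t ticksPerQuarterNote) {
    return static_cast<double>(ticks) * microsecondsPerQuarterNote / ticksPerQuarterNote;
}
```

With 1000000 microseconds per quarter note and a division of 960, a delta of 960 ticks comes out as one second, as you would expect at 60 BPM.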

BPM calculation

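The set tempo event counts quarter notes. If the beat is taken to be the note value named by the time signature denominator, the quarter-note tempo is scaled by denominator / 4:

BPM = (60000000.0f / (float)MicrosecondsPerQuarterNote) * ((float)TimeSignatureDenominator / 4.0f);
BPM = (60000000.0f / 1000000.0f) * (4.0f / 4.0f);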

BPM = 60
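For the 4/4 example the time signature factor is 1, so the BPM is simply the number of quarter notes per minute. A minimal sketch under that assumption, with quarterNotesPerMinute being a made-up name:

```cpp
#include <cstdint>

// Quarter notes per minute for a given set tempo value.
// Assumes the quarter note is the beat, as in the 4/4 example.
double quarterNotesPerMinute(std::uint32_t microsecondsPerQuarterNote) {
    return 60000000.0 / static_cast<double>(microsecondsPerQuarterNote);
}
```

quarterNotesPerMinute(1000000) gives 60, matching the result above.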


C++ code example

This should get you somewhere. If not I’ll try to keep my Gist updated:

Good luck and I’ll catch you later!
