Audio-to-midi conversion has been a staple of DAWs and music software for many years now. As far back as 2009, Melodyne by Celemony introduced its revolutionary polyphonic mode, Direct Note Access. This allowed users to extract pitch and timing information from mono and stereo audio recordings and convert it into midi notes.

An early version of Melodyne displaying converted midi notes

As someone who unfortunately lacks any notable piano-playing skills, I have always been drawn to these tools. In my experience, however, they are always slightly hit-and-miss. One time they will do a tremendous job, while on another occasion the result will sound as though there’s been some kind of audio road traffic accident. Even on a good extraction, there’s generally a bit of a tidy-up job to be done.

So for me, the hunt for the best audio-to-midi conversion tool has become a kind of audio equivalent to the search for the holy grail.

Practical uses of Audio-to-midi conversion.

Surprisingly, audio-to-midi conversion is a topic that receives relatively little attention in discussions. So are people sleeping on this technology? What possible uses could there be for this function?

In this era of sample-based production, it’s so easy to just throw a loop into your session and get to work. Alternatively, there are those who shun samples in favour of composing original melodies from scratch. Well, audio-to-midi conversion introduces the potential for an alternative approach to composition: one where you are guided by pre-recorded samples, melodic passages and chords without actually using the audio from the original source.

A common issue for producers and musicians alike is the infamous blank session syndrome. Most tracks start with an idea, be it melodic or rhythmic. Midi converted from audio could provide you with a great foundation to build your track upon. Just find some music that speaks to you and extract the midi notes from it. You could have fun re-arranging and re-purposing the extracted midi, as you will now have total control over it. With the extensive choice of virtual instruments at our disposal, the musical possibilities are endless.

And if you change the arrangement of the original samples or extracted audio enough then you could potentially avoid any copyright issues. The following video provides some creative examples of this with both drum and melodic audio-to-midi conversion being used.

Alternatively, you don’t have to lose the original sample altogether. Midi-from-audio extraction is an invaluable tool for layering instruments and drums alike. For those of us who aren’t natural ivory tinklers, it can often be tricky to identify chords and melodic sections. Audio-to-midi can come to the rescue and provide the perfect midi-based overlays for your sampled material.

Here are some common scenarios where this could be useful.

  • Layering drum one-shots onto drum loops
  • Analysing and replaying basslines
  • Identifying chords in order to add arpeggiated overlays and FX
  • Building orchestral sections over your existing melody/chords
  • Adding extra depth and width to instruments by matching to similar sounding virtual instruments
  • Creating whole new instruments by combining and stacking alternative instruments

Another offshoot of audio-to-midi conversion that’s becoming increasingly popular is voice-to-midi. Although most voice-to-midi software only provides monophonic conversion, it does so in real time! This makes these applications fantastically inspirational for composing. You can literally sing whatever musical ideas you have in your head and turn them instantly into, for instance, a French horn!

Image courtesy of Andrew Huang

Audio-to-midi converters aplenty

There is now a wide range of audio-to-midi conversion tools available. Most DAWs feature audio-to-midi conversion facilities. For example, Ableton Live offers 3 separate audio-to-midi converting options: harmony (polyphonic), melody (monophonic) and drums. The convert to drums option actually analyses drum audio and attempts to split out the component drum parts as well as the drum pattern. It does an admirable job on drums, but you will often have to go in and switch a few misjudged snares or kicks. As long as you use the hat pattern as a guide, wrong hits are easy to fix.

As well as Ableton Live, a number of other DAWs offer their own audio-to-midi converting tools.

There are also numerous audio tools that feature audio-to-midi conversion. The most well-known and the pioneer of this technology is the aforementioned Melodyne. Melodyne is an extremely powerful application with audio-to-midi representing only one small part of its capabilities. With such power comes a hefty price tag so this might not be for the casual user.

Luckily, there are even some good free audio-to-midi converters out there. Basic Pitch is a web-based application developed by Spotify. Simply drag some audio in, give it a moment to process and, before you know it, you have midi notes aplenty.

Samplab is another great application that provides Audio-to-midi conversion. Samplab is cloud-based and has some very powerful features. There is a free version that will only process 10 seconds of audio at a time or a monthly full-feature premium service at $9.99. Samplab can be used in either stand-alone mode or as a VST with your DAW of choice.

The following video by Iamamusicmogul demonstrates Samplab’s audio-to-midi features in practice.


Other notable software solutions that offer audio-to-midi conversion include newcomer NeuralNote, A2M, Guitar Tabs X, WIDI Recognition System (Widisoft), and AmazingMIDI (Araki Software).

Monophonic vs Polyphonic

Let’s look at the two main types of melodic audio-to-midi conversion: monophonic and polyphonic. Monophonic, as the name suggests, will only extract one note at a time. This makes it perfect for extracting basic melodic content that doesn’t involve chords or multiple instruments. For example, if you want to extract the midi for a bassline then this is going to be your preferred method. Generally, monophonic audio-to-midi extraction is fairly reliable and provides largely accurate results. This is, as with all audio-to-midi conversion, very much determined by the quality and suitability of the source audio.
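
For the curious, the core idea behind monophonic conversion can be sketched in a few lines of Python. This is only an illustration, not how any particular converter works internally. It assumes a pitch detector has already produced one fundamental frequency per note (the `detected_hz` values are made up) and simply maps each frequency to the nearest midi note using the standard relationship where A4 = 440 Hz = note 69. The function names are purely illustrative.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def hz_to_midi(freq_hz, a4_hz=440.0):
    """Return the nearest MIDI note number for a frequency in Hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # 12 semitones per octave, and MIDI note 69 is A4.
    return round(69 + 12 * math.log2(freq_hz / a4_hz))

def midi_to_name(note):
    """Name a MIDI note, using the convention where note 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"

# Hypothetical detected bassline: one fundamental frequency per note.
detected_hz = [55.0, 65.41, 73.42, 82.41]
bassline = [hz_to_midi(f) for f in detected_hz]
print([midi_to_name(n) for n in bassline])  # ['A1', 'C2', 'D2', 'E2']
```

The hard part in real software is the pitch detection itself, which is why clean, single-note source audio gives the most reliable results.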

Polyphonic midi extraction, on the other hand, is an altogether more complex and temperamental beast. The software really has to work hard in order to detect the notes when analysing and interpreting polyphonic material. The golden rule is the simpler the audio you provide, the better. Offering a solo instrument, e.g. a piano, will yield much better results than, for example, a jazz trio of piano, drums and bass. Things like reverb, phasing, and stereo field also present their own unique challenges. Polyphonic results are often variable and very rarely perfect. In the next section, I will explore some techniques that may help you achieve better results when extracting polyphonically.

The midi-extracting minefield: How to help get the best results.

The key to successful audio-to-midi conversion very much lies in the source audio. The first and possibly most important step is to make sure your audio is tuned to international standard pitch (440 Hz for the A above middle C as a reference note).
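
If you want to put a number on how far out a recording is, the offset from standard pitch is usually measured in cents (100 cents to a semitone). The following Python sketch is just an illustration under my own assumptions: you have already measured the recording’s A above middle C somehow (the 432 Hz figure is a made-up example), and the function names are hypothetical.

```python
import math

def cents_from_a440(reference_hz, a4_hz=440.0):
    """Offset of a measured A4 from standard pitch, in cents (100 cents = 1 semitone)."""
    return 1200 * math.log2(reference_hz / a4_hz)

def correction_ratio(reference_hz, a4_hz=440.0):
    """Pitch-shift ratio that would bring the measured reference up or down to a4_hz."""
    return a4_hz / reference_hz

measured_a4 = 432.0  # hypothetical: a recording tuned to A = 432 Hz
offset = cents_from_a440(measured_a4)
print(f"{offset:.1f} cents off, shift by a ratio of {correction_ratio(measured_a4):.4f}")
```

A recording that sits roughly a third of a semitone flat like this one can easily cause a converter to flip between neighbouring notes, so it is worth re-tuning in your DAW before converting.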

Next up, if you have a stereo file, convert it to mono. If instruments are panned in the stereo recording, make sure you select the channel that best reveals the melody you want to convert.
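
Your DAW or audio editor will do this for you, but the two options are easy to picture in code. Here is a minimal Python sketch, assuming the audio is simply a list of (left, right) sample pairs; the sample values and the `stereo_to_mono` name are made up for illustration.

```python
def stereo_to_mono(frames, mode="mix"):
    """Collapse (left, right) sample pairs to a single channel.

    mode="mix" averages both channels; "left" or "right" keeps just one,
    which can help when the part you want is panned to one side.
    """
    if mode == "mix":
        return [(left + right) / 2 for left, right in frames]
    if mode == "left":
        return [left for left, _ in frames]
    if mode == "right":
        return [right for _, right in frames]
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical samples: the melody is mostly in the left channel.
frames = [(0.5, 0.0), (-0.5, 0.25), (0.25, -0.25)]
print(stereo_to_mono(frames, "mix"))   # [0.25, -0.125, 0.0]
print(stereo_to_mono(frames, "left"))  # [0.5, -0.5, 0.25]
```

Notice how the last sample cancels out completely in the mix. That is one reason picking a single channel can sometimes give the converter a clearer signal than a straight mono sum.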

If your audio source is complex, with various musical elements, you can use stem technology to isolate the instrument you want to convert to midi. For example, you may want to isolate and capture the bassline of a musical passage. Simply separate the bass using a stem-separating tool such as LALAL.AI or iZotope’s RX 7/10. You will then be able to provide the exact audio needed for analysis.

Screenshot of the fantastic cloud-based LALAL.AI stem-separating tool


One final trick that may help if your source audio has a lot of reverb is to use a reverb-reducing tool such as iZotope RX De-reverb. You can also try using transient shapers to reduce any reverb tails. Audio-to-midi software tends to struggle when long reverb tails overlap notes.

Audio-to-Midi shootout

OK, so we now know all about Audio-to-midi conversion and what it does. Time to put a selection of popular converters to the test!

For this shoot-out, I have used the same Piano audio sample and converted it to midi using 4 different conversion tools. The sample I have selected shouldn’t represent too much of a challenge as it is solo piano, perfectly in key and fairly dry. It does however contain numerous rich jazzy chords and melodic riffs and flourishes. If you’re digging this Piano sample it is by sample maker Da Fingaz and features on his excellent pack ‘Majestic Piano’.

So the 4 audio-to-midi converters I am using for this shoot-out are:

1. Ableton Live

2. Melodyne (Celemony)

3. Basic Pitch (Spotify)

4. Samplab

I have provided the results in the following video so you can hear for yourself the resulting Midi extractions.

Shoot out results

As you can hear, each application gave noticeably different results. Here are my conclusions:

Ableton Live
Our first contender certainly threw up the most duff notes. This could have a lot to do with the fact that Ableton generally provides a higher volume of notes than other converters. This can be a good thing as long as you’re willing to go in and do a bit of a cleanup job. A little cleanup tip in Ableton is to identify any keys that are not in the sample’s scale. If you see any imposter notes in there, simply select and clear that whole key’s content in one hit.
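
The same cleanup idea can be expressed outside of Ableton too. The Python sketch below is a rough illustration rather than anything Ableton does internally. It assumes the extracted notes are available as (pitch, start, duration) tuples and that you already know the sample is in a major key; the note data and function names are hypothetical.

```python
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets of a major scale

def scale_pitch_classes(root, steps=MAJOR_STEPS):
    """Pitch classes (0-11, where 0 = C) that belong to the scale built on `root`."""
    return {(root + step) % 12 for step in steps}

def split_by_scale(notes, root, steps=MAJOR_STEPS):
    """Separate (pitch, start, duration) notes into in-scale and out-of-scale lists."""
    allowed = scale_pitch_classes(root, steps)
    in_scale = [n for n in notes if n[0] % 12 in allowed]
    out_of_scale = [n for n in notes if n[0] % 12 not in allowed]
    return in_scale, out_of_scale

# Hypothetical extraction in C major containing two stray notes (C#4 and G#4).
notes = [(60, 0.0, 0.5), (61, 0.5, 0.1), (64, 1.0, 0.5), (67, 1.5, 0.5), (68, 2.0, 0.1)]
kept, suspects = split_by_scale(notes, root=0)
print("suspect pitches:", sorted({n[0] for n in suspects}))  # suspect pitches: [61, 68]
```

I have deliberately returned the out-of-scale notes instead of deleting them. Jazzy material often contains perfectly legitimate notes from outside the scale, so it is safer to audition the suspects before clearing them.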

Melodyne

To my ears, Melodyne has probably produced the most usable result straight off the bat. There is only one duff note in there and the result feels relatively natural and humanized. There are certainly still areas of the conversion that would benefit from some additional tweaking, but on the whole it’s done a very respectable job.

Basic Pitch
I’ve only recently discovered Basic Pitch and, for a totally free application, this really is an incredible tool. In this test it gave Melodyne a real run for its money. If anything, it outperformed Melodyne, as I didn’t detect any out-of-key notes. And the great thing with Basic Pitch is you have editable parameters that control the note capture. So in theory, with a little more tweaking, it could do an even better job. I can’t really say anything negative about Basic Pitch. Bravo!

Samplab
Because the standard version of Samplab only allows for 10 seconds of audio editing at a time I actually did 3 separate extractions, roughly re-assembling them for this test.

That said, Samplab probably produced the most disappointing result. As you can hear (and see), compared to the other contenders Samplab has not extracted as many notes, with many of the chords only represented by 1 or 2 keys. I would much prefer the higher yet slightly less reliable note count of Ableton. At least you can then remove unwanted notes. The result also sounds a bit clunkier than the other converters. The key velocities are higher, resulting in a slightly ham-fisted feel to the performance.

The one area where Samplab does have an advantage over competing converters is its ability to separate stems pre-midi conversion. This is a fantastic feature that could really help when trying to extract midi from more musically complex audio.
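
As a side note, the chunked workflow I used for Samplab (separate extractions of 10 seconds each, re-assembled afterwards) boils down to shifting each chunk’s notes by the right offset. The Python sketch below shows that bookkeeping. It is an illustration only and has nothing to do with Samplab’s own software: the note format of (pitch, start, duration) tuples, the example notes and the function names are all assumptions of mine.

```python
CHUNK_SECONDS = 10.0  # the 10-second limit mentioned above

def chunk_bounds(total_seconds, chunk_seconds=CHUNK_SECONDS):
    """(start, end) times of consecutive chunks covering the whole clip."""
    bounds = []
    start = 0.0
    while start < total_seconds:
        bounds.append((start, min(start + chunk_seconds, total_seconds)))
        start += chunk_seconds
    return bounds

def reassemble(chunks, chunk_seconds=CHUNK_SECONDS):
    """Shift each chunk's (pitch, start, duration) notes onto one shared timeline."""
    merged = []
    for index, notes in enumerate(chunks):
        offset = index * chunk_seconds
        merged.extend((pitch, start + offset, dur) for pitch, start, dur in notes)
    return sorted(merged, key=lambda n: n[1])

print(chunk_bounds(25.0))  # [(0.0, 10.0), (10.0, 20.0), (20.0, 25.0)]

# Hypothetical notes from three separate extractions, timed from each chunk's start.
chunks = [[(60, 0.0, 1.0)], [(64, 2.5, 1.0)], [(67, 0.5, 2.0)]]
print(reassemble(chunks))  # [(60, 0.0, 1.0), (64, 12.5, 1.0), (67, 20.5, 2.0)]
```

One thing this does not solve is notes that sustain across a chunk boundary. They will be cut in two, so it pays to place your cuts in gaps between phrases where possible.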

Conclusion

While this test has provided valuable insights, it should only be regarded as a rough guide. Audio-to-midi converters all react in different ways to different source materials due to their varying algorithms. The piano used in this test, although not super basic, is a well-recorded solo instrument. Therefore, programs such as Samplab may do much better when faced with more challenging source material.

I think the key takeaway here is that if you have a number of audio-to-midi converters at your disposal, you should try out different ones for yourself. As evidenced in this comparison, they will all throw up different results. By generating different extractions you would then be able to, in effect, cherry-pick the best bits. For example, in this test, although Samplab probably fared the worst, it did generate the nicest final chord!!! Until they create the perfect audio-to-midi extractor, this is probably going to be your most effective way to get the most from this technology.
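
I did my cherry-picking by ear, but if you fancy a rough automated starting point, one option is a simple vote: keep a note only when at least two converters agree on it. The Python sketch below illustrates that idea. To be clear, this is my own assumption about how you might combine results, not a feature of any of the tools tested. The converter names, note data, tolerance and function name are all hypothetical.

```python
def consensus_notes(extractions, min_votes=2, tolerance=0.05):
    """Keep notes that at least `min_votes` converters agree on.

    `extractions` maps a converter name to a list of (pitch, onset_seconds).
    Two notes "agree" when the pitch matches and the onsets differ by no
    more than `tolerance` seconds.
    """
    def matches(note, candidates):
        pitch, onset = note
        return any(p == pitch and abs(o - onset) <= tolerance for p, o in candidates)

    kept = []
    for notes in extractions.values():
        for note in notes:
            # Count converters (including this one) that contain a matching note.
            votes = sum(matches(note, other) for other in extractions.values())
            if votes >= min_votes and not matches(note, kept):
                kept.append(note)
    return sorted(kept, key=lambda n: (n[1], n[0]))

# Hypothetical results for one C major chord from three converters.
extractions = {
    "converter_a": [(60, 0.00), (64, 0.00), (61, 0.50)],
    "converter_b": [(60, 0.02), (64, 0.01), (67, 0.00)],
    "converter_c": [(60, 0.01), (67, 0.03)],
}
print(consensus_notes(extractions))  # [(60, 0.0), (64, 0.0), (67, 0.0)]
```

Here the stray C# reported by only one converter is dropped, while the G that one converter missed is recovered from the other two. A vote like this will never replace your ears, but it can cut down the tidy-up job.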

And talking of perfect audio-to-midi conversion, could this one day be possible? I am actually surprised that with the current integration of AI technologies in many areas of audio processing they have not got nearer to this goal. I think that at some point AI should be able to analyse the extracted midi results and apply humanistic natural playing styles to these. How incredible would it be to feed your converter with some great source music, extract the midi information and present it in the piano playing style of say, Herbie Hancock!!!

Well, that’s the holy grail for me at least. In the meantime I guess I should probably just sign up for some piano lessons and study more music theory.

