AI / ML and Music
Updated: Jun 12, 2019
Technology is moving at a fast pace and adding a great deal of capability to the typical home recording studio. This technology is affordable, easy to use and available to the masses. Every part of our society is being changed by automation, machine learning and artificial intelligence. Why would music production be any different? Nearly every home and professional studio is equipped with a Digital Audio Workstation (DAW), which has replaced analog tape decks and mixing consoles. DAW environments provide a high degree of automation and have both simplified the recording process and reduced the cost of recording sessions.
First you might ask, “What is machine learning?” Computers have always been programmed to handle variable data input, but a traditional program always executes the same routines and functions. In machine learning, the program’s behavior changes based on the data it receives. The program itself generates this change, building up a history or memory of events that can be kept or discarded as new data arrives. Machine learning is the first step toward artificial intelligence and is loosely comparable to human cognitive and analytical skills. It is the critical process needed before accurate decision-making can take place, and it is in practice in DAWs today.
Developers apply machine learning in DAWs by looking at the data that has been recorded and building metadata and declarations about the recorded content. This starts with the audio file for the given track. The DAW reads the file headers to determine format, sample rate, bit depth, etc. It then analyzes the content to determine the instrument and any relevant facts about it, such as: this is a kick drum, with bleed from the entire drum kit; the kick drum pedal squeaks; the front head is not tuned correctly and rings; and even the characteristics of the microphone / preamp chain.
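As a rough illustration of the header-reading step described above, here is a minimal Python sketch that pulls basic format facts out of a WAV file using the standard library `wave` module. It is not how any particular DAW does it; the function names `describe_wav` and `make_silent_wav` are made up for this example, and the silent test file exists only so the sketch runs without a real recording.

```python
import io
import wave


def describe_wav(source):
    """Return basic format facts from a WAV header without analyzing the audio."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def make_silent_wav(seconds=1, rate=44100, sample_width=2, channels=1):
    """Build a short silent WAV in memory (hypothetical stand-in for a recorded track)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (seconds * rate * sample_width * channels))
    buffer.seek(0)  # rewind so the reader starts at the header
    return buffer


if __name__ == "__main__":
    print(describe_wav(make_silent_wav()))
```

Everything after this step, such as working out that the track is a kick drum with a squeaky pedal, requires analyzing the audio itself and is well beyond what a header can tell you.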
Let’s focus on the audio bleed from the rest of the kit for now. The usual way to eliminate bleed is with a noise gate, a technique that dates back to the good old analog days. Noise gates are great and usually get the job done, but they generally leave artifacts in the output. Many of these artifacts became desirable and intended. Once analog audio is converted to digital, every function is math-based and can be represented as an algorithm. But we like our plugins to look like the analog rack units of old, with all the same dials and gauges. So imagine setting up a poorly tuned drum kit in a garage with concrete floors and mic’ing it up with nothing but SM57s into a $300 multichannel audio interface. Now imagine being able to choose, with a point and click, what characteristics you want your microphones to have and what preamps you want to use. This can be done by modeling these characteristics mathematically, comparing the characteristics of the actual input with the expected input, and generating the new output.
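To make the "every function is math" point concrete, here is a minimal Python sketch of the noise gate mentioned above. It is a deliberately simple hard gate driven by a peak envelope, not the algorithm of any real plugin; the function name `noise_gate` and the `threshold` and `release` parameters are assumptions chosen for illustration.

```python
def noise_gate(samples, threshold=0.2, release=0.9):
    """Mute samples whenever a simple peak envelope is below the threshold.

    samples   -- floats in the range -1.0..1.0
    threshold -- envelope level below which the gate closes
    release   -- per-sample envelope decay (0..1); higher keeps the gate open longer
    """
    envelope = 0.0
    gated = []
    for sample in samples:
        # Peak follower: jump up instantly on a loud sample, decay gradually after it.
        envelope = max(abs(sample), envelope * release)
        # Hard gate: pass the sample untouched or replace it with silence.
        gated.append(sample if envelope >= threshold else 0.0)
    return gated


if __name__ == "__main__":
    # Quiet bleed, then a loud kick hit, then the tail fading back into bleed.
    track = [0.05, -0.04, 0.9, -0.5, 0.1, 0.05, 0.03, 0.02]
    print(noise_gate(track, threshold=0.2, release=0.5))
```

The abrupt switch between "pass" and "silence" is one source of the artifacts described above: the tail of the hit is cut off the moment the envelope drops under the threshold.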
Now, if you apply this same methodology across all of the issues identified in the original recorded content, the potential for resolving recording issues is virtually limitless. There’s debate over the loss of integrity from the original recording, but the result is no different from what many studios produce today by turning drum strikes into triggers and replacing the audio with samples, usually after the recording has been processed with Beat Detective and pulled into time. With modeling, at least some of the characteristics of the original drums remain. Modeling has been around a long time, and the technology gets better every year. This will continue to the point where a mic’d up 4x12 cabinet will be perceived as nothing more than an ambitious exercise in nostalgia.
Vocal processing that retains the integrity of the vocalist’s performance has long been abandoned. Pitch correction and thickening seem to be part of every recording today, not to mention live performances. This, too, is enabled by machine learning, which allows vocals to be manipulated to a far greater degree.
Technological advancement is changing every field. Advanced software is available at affordable prices, while labor is, and always will be, expensive. For example, who needs a photography studio when anyone with a smartphone can produce professional-quality photographs without investing thousands of dollars in equipment? This holds true for music as well; any smartphone can be turned into a full recording studio, complete with the same automation technology available at multimillion-dollar studios.
This capability is both good and bad. The good side is that it puts tools in the hands of creative people, and that provides more music to society. The downside is that it floods the marketplace with lower-cost products, which cuts into the prices professional studios can charge and thus the living those professionals can make. There is also the standing debate on quality, which is more of an emotional argument than one based on facts. Sure, there are some bad recordings out there, but there are some masterpieces as well, just as there were 30 years ago.
Impact on the Music Industry:
Artists in every field have always been at a disadvantage when faced with the business side of their craft. In many cases, art galleries take art on consignment from artists and keep 50% of the proceeds from every sale. The gallery claims this practice exists to cover the costs of the retail space, labor, marketing and utilities, plus a hefty profit. Musicians have it even worse; they usually receive less than 10% of the proceeds from sales of their products. If they are putting on live performances, they may get paid $150 to $500 to play at a bar that pockets thousands from alcohol sales alone. The industry culture is one of banks and lawyers taking advantage of artists. Technology has impacted the middleman the most. As the cost of producing art has decreased, middlemen such as record labels are being pushed out of business. Their irrelevance benefits artists by putting more money in their pockets and removing the gatekeepers. Musicians no longer need to get “signed” by a record label and sell their souls in exchange for studio time, production, advertising and distribution. They can now order equipment and software for less than $1,000 and produce their own music, souls intact.
Economics of Recording Today:
The cost of building and running a traditional recording studio is astronomical and is often unsustainable considering the shrinking margins for music sales. Traditionally, a studio is in a space that is acoustically tuned and provides absolute audio isolation room-to-room. Special ventilation is required to minimize noise from HVAC units. Expensive power conditioning units are installed to provide clean power to the audio equipment. Simply preparing the space for a traditional studio would be a multimillion-dollar investment. Add to that the cost of microphones, preamps, mixing consoles, converters, timing sources, computers and software, and the endeavor can easily run to millions of dollars more. All of this equipment requires maintenance and a trained staff of professionals to operate. This is why studio time is so expensive.
The traditional studio is now obsolete thanks to technological advancements. The digital audio workstation has become a very powerful tool in which a simple plugin can fix most recording issues. Background noise is eliminated. Tone is warmed up with a preamp plugin, which eliminates the need for expensive preamps and microphones. Machine learning technology is enhancing these plugins, allowing the average user to get amazing results from them.
Technology Has Already Changed the Industry:
Today, sound becomes digital almost as soon as its energy enters a microphone, at the point where it reaches the analog-to-digital converters. This means every later stage of the lifecycle can be digital, all the way to the listener’s earbuds. Social media and digital distribution services have replaced the A&R firms of old. Many artists today have been discovered by online listeners, which the legacy industry refers to as a “viral fluke.” But the truth of it is that younger generations do not find their music the same way anymore. Terrestrial radio has been replaced by YouTube and Facebook, where stars are born and die without any notice from the traditional industry. A band getting 1 million views is better off than one selling 100 thousand records, as long as it is fine with its money and fame coming from non-traditional sources.
During its early days, digital distribution was a mess. There were no laws to protect artists, and therefore organizations like Spotify and Pandora were legally depriving musicians of royalties. Thankfully, there are now laws on the books to protect artists and ensure they receive their cut of advertising and subscription money from these distributors. Musicians now have it better than ever with regard to owning the rights to their material. The cut digital distributors take today is a fraction of what the record labels of old took. We live in the “There is an app for that” era, wherein many middleman functions have been, or are about to be, automated. This concept extends into every aspect of business, and we as a society will never revert to the way things were. We never have. This is technological progress, and you must either accept it or get left behind.
My Advice to People in the Industry:
Producers, engineers, technicians, artists, and even manufacturers do not have an exemption from technological advancement. Change is difficult for many to accept and is generally resisted. This remains true across every industry. I still know people who thought email would be a fad and were certain we would revert to the office memorandum, distributed through interoffice mail. If you resist technological advancement, you will soon be nothing more than a grumpy old person who yells at the neighborhood kids to get off of your grass. Instead of mourning the way things were, look to what is next and adapt. Software developers usually don’t know music and still need the assistance of music professionals to build their products. If you are a music professional, my advice is to start putting claims on your intellectual property and working with the technology companies to license your craft. So brush up on speaking Klingon and befriend some computer nerds.
Soon, we will have music production software that utilizes artificial intelligence to a higher degree than it does today. This will make an average home recording product better than anything a high-end studio could produce. Instead of looking to go out and hire an engineer or a producer, you will be able to use a certain AI plugin or chain of AI plugins to do the same work. Think of the power of being able to say, “I want to have Jim Scott and/or Trina Shoemaker running the desk for my project.” You will place generic microphones in front of instruments and then select the characteristics of the room, microphone or preamp, all with plugins. Your AI engine will do the rest: compression, time-based effects, etc. Once your mix is done, you can give it a listen and afterward fire Jim and replace him with Ted. And you can do this with no contract escape clauses, arbitration or signing bonuses - just a “drag and drop.”
If you are currently working in this field, your best bet will be to join in the technology revolution and at least cash in on the potential distribution of your AI plugin. You will need help protecting the algorithms that capture your methodology and technique. You will then have a career that consists of upgrades and software life-cycle management. You can see proof of this formula in hardware manufacturers for audio equipment. They are building and licensing VST plugin bundles for DAW environments. They know that someday nobody will be buying expensive hardware, and therefore they had better lay claim to the physical characteristics their product imparts to an audio signal. Machine learning recognizes the input signal and the steps needed to reshape it so the output matches that of the expensive device. If you are not buying the device anymore, how else will these companies make money?
So the knowledge base needed to produce music will decrease to the point that the physics involved are known only to the software developers. The discussions about how much compression to apply to a snare drum or what ratio to use will now occur in a conference room rather than the studio control room. Currently, even song composition is being automated, which is not surprising, as most songs are composed using extremely simple algorithms. Imagine that - a world in which we listen to nothing but computer-generated tones for entertainment. What would happen to live music? Would we buy tickets to go see a Mac Pro on tour with a Dell laptop? Maybe we already are?
Music is a form of human expression and is meant to be just that. If you take away the human completely, it will lose human appeal. There is something about the legitimacy of human suffering in music that makes it relatable. Most humans cannot relate to the problems or emotions of a computer, so there is a line where technology will stop intruding on the arts and only provide tools to make the artist’s craft easier. But identifying where that line is, and making certain you are on the right side of it, is the hard part.
Economics drives technology and, for the most part, drives everything in life. Thinking that you as an individual have the ability to stop technology is foolish, especially if cost savings and/or profits are at play. Automation provides us with convenience, and we humans love convenience. We are also willing to sacrifice parts of what makes us human for convenience. If you have a device in your house that listens 24/7 for you to spout out commands, then you fall into this category. We sacrifice our privacy for the ability to tell this device to play some Led Zeppelin on demand. We have robots vacuuming our floors and cameras at our front door, all to simplify our lives. Let’s face it, life is full of inconveniences, to the point where some people would choose to never get out of bed. The reality is that some of these inconveniences are part of what makes life enjoyable and worth living. So one day we may live our lives in an artificial glass womb, floating in a synthetic amniotic fluid with probes and tethers attached to us, all of our conscious thoughts artificially provided by virtual reality. If human nature evolves enough to accept this form of “living” and the economics make it viable, it will eventually be our fate. We have a world culture today based on consumption and convenience, so we really are not that far off from this now!
Music and art are the event horizon when it comes to how much we let technology infringe on humanity. If we choose to accept computer-generated tones as music, we should start getting fitted for our glass wombs immediately. If we reject this form of entertainment, we may only have to worry about our DAWs becoming “self-aware” and trying to destroy all forms of carbon-based life, or maybe just our moms.