Exploring the Harmony of Deep Learning and Traditional Persian Music: A Journey of Modelling and Music Improvisation | by Behrad Gharedaghloo | Dec, 2023


A robot symbolizing the AI model built and calibrated in this project, playing its own generated traditional Persian music on the Oud (image generated by Bing Image Creator)

In the vast expanse of music improvisation, where classical music and jazz in particular have been explored [1][2][3][4], my curiosity took me towards the enchanting world of traditional Persian music. Being an amateur musician with some experience in Persian music and its instruments, coupled with a background in engineering, science, machine learning, and deep learning, I embarked on a hobby project that united my two passions: music and mathematics. The mission: to build a deep learning model capable of navigating the complex and nuanced realm of traditional Persian musical styles.

From the project’s outset, I recognized challenges that were absent from prior studies. The abundance of microtones [5] in traditional Persian music, variations in modes, intricate motif structures, and substantial ornamentation (present in all traditional Eastern music, Persian included) each posed their own difficulties. Additionally, differences in dataset size, especially in digitized form, underscored how distinctive these challenges were. This necessitated the development of a dedicated tool capable of incorporating these nuances while constructing an inventive solution.

The genesis of this musical odyssey lay in the accurate collection and digitization of the “Radif” music [6], sourced carefully from the “Shour” [7] chapter of the Radif book by Dariush Talai [8]. This process, which took tens of hours spread across three months, was a testament to the dedication required to build the model. After digitizing the notes, I defined dictionaries to encode and decode them, and then transformed the notes into one-hot vector sequences to feed into the neural network model.
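As a rough illustration of the encoding step described above, here is a minimal sketch in plain Python. The note tokens (such as "Ak4" standing in for a microtonal pitch) and the function names are hypothetical; the project's actual symbols and dictionaries are not shown in this article. In practice the one-hot lists would be converted to arrays before being passed to Keras.

```python
# Hypothetical note tokens; the real encoding used in the project is not shown.
# "Ak4" is a stand-in for a microtonal pitch that has no Western equivalent.
melody = ["G4", "Ak4", "Bb4", "C5", "Bb4", "Ak4", "G4"]


def build_vocab(notes):
    """Map each distinct note token to an integer id, and each id back to its token."""
    symbols = sorted(set(notes))
    note_to_id = {symbol: index for index, symbol in enumerate(symbols)}
    id_to_note = {index: symbol for symbol, index in note_to_id.items()}
    return note_to_id, id_to_note


def one_hot(notes, note_to_id):
    """Turn a note sequence into a list of one-hot vectors (one vector per note)."""
    size = len(note_to_id)
    return [
        [1 if position == note_to_id[note] else 0 for position in range(size)]
        for note in notes
    ]


def decode(vectors, id_to_note):
    """Recover note tokens by taking the position of the largest entry in each vector."""
    return [id_to_note[vector.index(max(vector))] for vector in vectors]


note_to_id, id_to_note = build_vocab(melody)
encoded = one_hot(melody, note_to_id)
round_trip = decode(encoded, id_to_note)
```

Treating a microtonal pitch as just another token is what lets this kind of encoding sidestep the twelve-tone assumption: the vocabulary grows, but nothing else changes.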

The model was implemented in Python using Keras, with Long Short-Term Memory (LSTM) layers that allow it to capture the intricate nuances of traditional Persian music. The overall code structure was taken from the Keras.io example on character-level text generation [9], and the final model architecture from an article by Sigurður Skúli published in Towards Data Science [10].

The journey faced its first challenge when the initial models failed to fit the input music sequences, revealing a significant underfitting issue. This setback necessitated a deep dive into the model’s architecture and a restructuring that ultimately yielded a more robust model and more realistic generated music. Two factors significantly improved model accuracy: the number of layers and the number of input notes used to generate the next note. In its latest form, the model predicted each note correctly from its preceding notes more than 95% of the time.
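To make the "number of input notes" factor concrete, the sketch below shows one common way to cut a note sequence into fixed-length input windows paired with the note that follows each window, plus a simple next-note accuracy measure. The function names and the window length of 3 are assumptions for illustration; the article does not state the window length actually used.

```python
def make_windows(note_ids, window):
    """Pair every run of `window` consecutive notes with the note that follows it."""
    inputs, targets = [], []
    for start in range(len(note_ids) - window):
        inputs.append(note_ids[start:start + window])
        targets.append(note_ids[start + window])
    return inputs, targets


def next_note_accuracy(predicted, targets):
    """Fraction of positions where the predicted next note equals the true next note."""
    hits = sum(1 for guess, truth in zip(predicted, targets) if guess == truth)
    return hits / len(targets)


# Integer note ids for a short, made-up phrase.
note_ids = [3, 0, 1, 2, 1, 0, 3]
inputs, targets = make_windows(note_ids, window=3)

# Made-up predictions, used only to show how the accuracy figure is computed.
example_predictions = [2, 1, 0, 0]
accuracy = next_note_accuracy(example_predictions, targets)
```

A longer window gives the model more context for each prediction but produces fewer training pairs from the same material, which matters when the digitized dataset is small.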

In addition to the model calibration challenges, fine-tuning the temperature during the improvisation step became an art in itself — finding the delicate balance where the model neither looped into repetitive and predictable patterns nor veered into chaotic randomness.
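The effect of temperature can be sketched in a few lines. The code below rescales a next-note probability distribution by taking logarithms, dividing by the temperature, and renormalizing, which is the usual approach in character-level generation examples; whether the project used exactly this form is an assumption, and the function names are hypothetical.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution; equivalent to p ** (1 / T), renormalized."""
    # Clamp to a tiny positive value so log() is defined for zero probabilities.
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(logits)  # subtract the peak before exp() for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return [weight / total for weight in weights]


def sample_note(probs, temperature, rng=random):
    """Draw one note id from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]


model_output = [0.5, 0.3, 0.2]                 # made-up next-note probabilities
cold = apply_temperature(model_output, 0.2)    # sharper: favors the top note
hot = apply_temperature(model_output, 5.0)     # flatter: closer to uniform
next_id = sample_note(model_output, 1.0, random.Random(0))
```

Low temperatures push almost all probability onto the most likely note, which is where repetitive loops come from; high temperatures flatten the distribution towards uniform, which is where the randomness comes from.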

The task of producing music files introduced a novel challenge. The existing libraries and MIDI files are built on the assumption of “Equal Temperament”, which is the basis of Western music. However, as mentioned above, this assumption does not hold for the intricate scales of most Eastern music forms, including Arabic Maqam music and traditional Persian music. To surmount this obstacle, I built my own sound library that accounts for microtonality by sampling note sequences on my Oud. Using this library, the note sequences generated by the model (in improvising mode) were rendered into WAV files. Below are some examples of the generated music (and more are yet to come) on Soundcloud.
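The arithmetic behind the equal temperament limitation can be shown briefly. In twelve-tone equal temperament every semitone is 100 cents, and a pitch offset of c cents from a reference frequency f is f * 2 ** (c / 1200). The sketch below uses a 50-cent offset purely as an illustration of a pitch that falls between two semitones; actual Persian intervals vary and are not claimed to be exact quarter tones, and the function names are hypothetical. This is not the sound library described above, which relied on recorded Oud samples.

```python
def frequency(reference_hz, cents):
    """Frequency that lies `cents` above (or below, if negative) a reference pitch."""
    return reference_hz * 2 ** (cents / 1200)


def cents_off_grid(cents):
    """Distance in cents from the nearest equal-tempered semitone (a multiple of 100)."""
    return abs(cents - 100 * round(cents / 100))


A4 = 440.0
semitone_up = frequency(A4, 100)   # the next equal-tempered semitone above A4
between = frequency(A4, 50)        # illustrative microtonal pitch between the two
octave_up = frequency(A4, 1200)    # 1200 cents is exactly one octave
```

Any pitch whose distance from the semitone grid is well above zero cannot be represented by a plain twelve-tone note number, which is why recorded samples of the real instrument are an attractive alternative.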

The project delivered a model capable of generating and improvising traditional Persian music, with a specific focus on the Shour mode. Personally, this project served as a profound exploration, a journey that enriched my understanding of deep learning methods while unraveling new dimensions, challenges, and possibilities.

As I think about the course of future endeavors, refining the model comes to mind first. Enriching the calibration data with a wider array of music samples (e.g. from other modes and other Radif sources) is the next movement, though it demands significant human effort, which is the main blocker at this point. Collaboration with subject matter experts, especially those with a sharp ear for traditional Persian music, will be sought to refine the model’s ability to generate music that not only adheres to technical standards but also displays the authentic beauty of traditional Persian music. Another avenue is to incorporate notes from other genres such as Jazz and Blues into the input data, in order to assess the artistic merit of the fusion pieces that may emerge.

In the harmonious intersection of my passion for music and my experience in Python and deep learning, this project stands as a testament to the interesting new possibilities that arise when art and technology intertwine. I eagerly anticipate the evolving cadence of this deep learning model as it walks further into the beautiful realms of traditional Persian music, a journey where the echoes of the past blend with the algorithms of the present and future.

[1] Gillick, J., Tang, K., & Keller, R. M. (2010). Machine learning of jazz grammars. Computer Music Journal, 34(3), 56–66.

[2] Shunit Haviv Hakimi, Nadav Bhonker, and Ran El-Yaniv. BebopNet: Neural Models for Jazz Improvisations. Towards Data Science. Retrieved on December 21, 2023 from https://towardsdatascience.com/bebopnet-neural-models-for-jazz-improvisations-4a4d723d0b60.

[3] Kritsis, K., Kylafi, T., Kaliakatsos-Papakostas, M., Pikrakis, A., & Katsouros, V. (2021). On the adaptability of recurrent neural networks for real-time jazz improvisation accompaniment. Frontiers in Artificial Intelligence, 3, 508727.

[4] Jonathan C.T. Kuo. AI Classical Music Composer: Bi-LSTM & CNN-GAN. Analytics Vidhya. Retrieved on December 21, 2023 from https://medium.com/analytics-vidhya/ai-classical-music-composer-63d983ee5fc0.

[5] Microtones in Eastern music introduce subtle intervals between conventional Western tones, allowing intricate expressions not bound by the fixed intervals of equal temperament.

[6] Radif in Persian music is a traditional repertoire of melodic patterns, providing a framework for improvisation and composition, essential to classical performances.

[7] Shour, also written as Shoor and Shur, is a fundamental mode in Persian music, part of the traditional Radif repertoire.

[8] Talâi, Dariush, ed. Traditional Persian art music: the radif of Mirza Abdollah, 1999.

[9] François Chollet. Character-level text generation with LSTM. Retrieved on December 21, 2023 from https://keras.io/examples/generative/lstm_character_level_text_generation.

[10] Sigurður Skúli. How to Generate Music using a LSTM Neural Network in Keras. Towards Data Science. Retrieved on December 21, 2023 from https://towardsdatascience.com/how-to-generate-music-using-a-lstm-neural-network-in-keras-68786834d4c5


