Ongoing digitization efforts lead to vast amounts of music data, e.g., audio and video recordings, symbolically encoded scores, or graphical sheet music. Accessing this data in a convenient way requires flexible retrieval strategies. One access paradigm is known as “query by example,” where a short music excerpt in a specific representation is given as a query. The task is to automatically retrieve documents from a music database that are similar to the query in certain parts or aspects. This thesis addresses two different cross-version retrieval scenarios of Western classical music, where the aim is to find the audio recordings in the database that are based on the same musical work as the query. Depending on the respective scenario, one requires task-specific audio representations to compare the query and the database documents. In the first scenario, the query is a short audio snippet. The retrieval is based on audio shingles, which are short sequences of chroma features capturing properties of the harmonic and melodic content of the audio recordings. The comparison between the query and the recordings from the database is realized by a nearest-neighbor search of the audio shingles. Various approaches for learning such audio representations with deep neural networks are proposed, leading to improvements in the efficiency of the search and the quality of the retrieval results.

In this study we compare the use of different music representations for retrieving alternative performances of the same musical piece, a task commonly referred to as version identification. Given the audio signal of a song, we compute descriptors representing its melody, bass line, and harmonic progression using state-of-the-art algorithms. These descriptors are then employed to retrieve different versions of the same musical piece using a dynamic programming algorithm based on nonlinear time series analysis. First, we evaluate the accuracy obtained using individual descriptors, and then we examine whether performance can be improved by combining these music representations. Our results show that whilst harmony is the most reliable music representation for version identification, the melody and bass line representations also carry useful information for this task. Furthermore, we show that by combining these tonal representations we can increase version detection accuracy. We then propose a melody-based retrieval approach, and demonstrate how melody representations extracted from recordings of a cappella singing can be successfully used to retrieve the original song from a collection of polyphonic audio. Finally, we demonstrate how the proposed version identification method can be adapted for the task of query-by-humming. The current limitations of the proposed approach are discussed in the context of version identification and query-by-humming, and possible solutions and future research directions are proposed.
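To make the shingle-based retrieval described above concrete, the following is a minimal sketch of how chroma shingles and a nearest-neighbor search over them might look. The function names and parameters (`make_shingles`, `shingle_len`, `hop`, the cosine similarity measure) are illustrative assumptions, not the exact formulation used in the thesis:

```python
import numpy as np

def make_shingles(chroma, shingle_len=20, hop=5):
    """Stack consecutive 12-dim chroma frames into fixed-length shingles.

    chroma: array of shape (num_frames, 12).
    Returns an array of shape (num_shingles, 12 * shingle_len).
    """
    starts = [s for s in range(0, chroma.shape[0] - shingle_len + 1, hop)]
    shingles = np.stack([chroma[s:s + shingle_len].ravel() for s in starts])
    # L2-normalize so that dot products correspond to cosine similarity
    norms = np.linalg.norm(shingles, axis=1, keepdims=True)
    return shingles / np.maximum(norms, 1e-12)

def nearest_neighbors(query_shingles, db_shingles, k=5):
    """Brute-force cosine nearest-neighbor search over database shingles."""
    sims = query_shingles @ db_shingles.T           # (n_query, n_db) similarities
    top = np.argsort(-sims, axis=1)[:, :k]          # indices of the k best matches
    return top, np.take_along_axis(sims, top, axis=1)
```

A real system would replace the brute-force search with an approximate index for efficiency; the learned deep-network representations mentioned above would simply replace the raw stacked chroma vectors as the shingle embedding.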
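The dynamic-programming comparison based on nonlinear time series analysis can be illustrated with a simplified stand-in: a binary cross-recurrence plot between two descriptor sequences, scored with a Smith-Waterman-style local alignment. This is only a sketch of the general idea, not the authors' exact algorithm; the `threshold` and `gap_penalty` values are arbitrary assumptions:

```python
import numpy as np

def cross_recurrence(seq_a, seq_b, threshold=0.4):
    """Binary cross-recurrence plot: 1 where descriptor frames are close."""
    dists = np.linalg.norm(seq_a[:, None, :] - seq_b[None, :, :], axis=2)
    return (dists < threshold).astype(float)

def local_alignment_score(crp, gap_penalty=1.0):
    """Smith-Waterman-style DP over the recurrence plot.

    The maximum accumulated score reflects the longest well-aligned
    stretch shared by the two versions, tolerating gaps and tempo changes.
    """
    n, m = crp.shape
    dp = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1, j - 1] + (1.0 if crp[i - 1, j - 1] else -gap_penalty)
            dp[i, j] = max(0.0, match,
                           dp[i - 1, j] - gap_penalty,
                           dp[i, j - 1] - gap_penalty)
    return dp.max()
```

Two versions of the same piece produce long diagonal structures in the recurrence plot and hence a high alignment score, whereas unrelated pieces yield only scattered recurrences and a score near zero.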