Launched as a demo in July 2010, SoundOut Search is based on our latest program, CFL Search, which has been under development for the past six months. SoundOut Search takes a request describing the sort of music required, returns a list of tracks that might meet the requirement, and lets the user listen to them. If one of the tracks is of particular interest, the user can click a More button to see what else matches that track. This, of course, has to happen in a relatively short time frame.
The raw materials
Reviews
SoundOut Search uses data from Slicethepie, who collect tracks from unsigned bands and present them on their web site. Over the past three years they have recruited thousands of reviewers, ‘scouts’, who provide feedback to the artists. The scout reviews are written online after listening to at least a minute of the track, but without knowing who the artist is or what the track is called. Scouts can write as much as they like, but the average length of a review is 40 words. One distinctive feature of the reviews is that they are mostly written as a one-sided conversation; that is, they don’t necessarily read like written material but more like a transcribed conversation. Many have no punctuation or uppercase, for example, both of which are features of writing but absent from speech. Another distinctive feature is that they are pretty well guaranteed to be about different aspects of what the scout has just heard, rather than general discussions on life or personal preferences. And finally, the reviews are not published online, so none of the scouts know what others are saying. The aim is to give the artists objective feedback, and to identify the tracks that create the biggest buzz.
Briefs
Requests for music can come from a wide range of sources: advertisers looking for suitable music to accompany an ad, film-makers looking for backing music for scenes, general listeners looking for music to cheer them up on a rainy day… These tend to be written rather than conversational, and can include scene-setting, general background and a lot of other material not noticeably related to music at all.
Here are some we were given during the development phase. You can copy and paste these into the demo and see and hear what comes out.
A sultry jazz track with a sassy sexy female vocal and moody brooding feel
pop music with female vocals uplifting and anthemic
The track should be sentimental, soothing and haunting. Like a lullaby that makes me want to fall asleep. It must instill feelings of longing and sadness.
The requirement
To take a selection of reviews for each track in the collection and produce a way of summarising them for future searching.
To analyse the briefs, identify the music requirement and find the tracks which most closely match it.
The Solution - Semantic searching
Semantics is the branch of linguistics concerned with meaning, here the meaning of a group of words. What we aim to do is to identify first what a scout is reviewing about a track and second what they think about that element. For example, in “sounds awesomely sweet to the ears with the kind of rhythm the song”, the element is the rhythm and the opinion is “awesomely sweet”: rhythm = awesomely sweet. To do this we make use of morphology, the way words are formed, and structure, the way words are organised. We analyse the input in whatever messy state it is entered, and use the terms that the SoundOut team have decided their users will want to know about to provide a comprehensive picture of what the scouts are talking about and how they are talking about it.
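To make the idea of pairing an element with an opinion concrete, here is a minimal sketch in Python. It is not how CFL Search works internally; it is a toy stand-in built on assumptions. The term lists ASPECTS and DESCRIPTORS, the suffix-stripping in normalise, and the "attach each descriptor to the nearest aspect term" rule in extract are all invented for illustration.

```python
import re

# Hypothetical trigger terms and opinion words, stored in normalised form.
# A real deployment would use the terms chosen by the SoundOut team.
ASPECTS = {"rhythm", "vocal", "guitar", "lyric", "beat"}
DESCRIPTORS = {"sweet", "awesome", "moody", "sultry", "haunting", "catchy", "boring"}


def normalise(word):
    """Very crude morphology: strip one common suffix ('awesomely' -> 'awesome')."""
    for suffix in ("ly", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract(review):
    """Return {aspect: [descriptors]} for one review, however messy the input.

    Assumption: each descriptor belongs to the aspect term closest to it,
    measured in tokens. This is a stand-in for real structural analysis.
    """
    # Lowercase and keep only letter runs, so missing punctuation does not matter.
    tokens = [normalise(t) for t in re.findall(r"[a-z']+", review.lower())]
    aspect_positions = [(i, t) for i, t in enumerate(tokens) if t in ASPECTS]
    pairs = {}
    if not aspect_positions:
        return pairs
    for i, token in enumerate(tokens):
        if token in DESCRIPTORS:
            _, nearest = min(aspect_positions, key=lambda p: abs(p[0] - i))
            pairs.setdefault(nearest, []).append(token)
    return pairs


result = extract("sounds awesomely sweet to the ears with the kind of rhythm the song")
print(result)
```

Run on the review quoted above, the sketch pairs "rhythm" with the normalised opinion words "awesome" and "sweet". The nearest-term rule is deliberately simple and will mis-assign opinions in reviews that discuss several elements close together.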
We analyse and categorise the incoming briefs in a similar way and then match the two things together. Ranking is an important task when there is a lot of data. When a brief is used as input, tracks are ranked by how similar their summaries are to what was asked for. When the More button is pressed, the full summary of that track is used as input to the comparison system, and again the tracks which are most similar in the things described and the way they are described come to the top of the list.
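The matching and ranking step can be sketched in the same spirit. The example below assumes each track summary and each brief has already been reduced to counts of (element, opinion) pairs, and it uses cosine similarity as the measure of likeness. Both the data layout and the choice of cosine similarity are assumptions made for illustration; the track names and counts are made up, and the actual comparison used by CFL Search is not described here.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two Counters of (element, opinion) pairs."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(query, summaries, top_n=10, exclude=None):
    """Return track ids ordered from most to least similar to the query.

    'exclude' lets a More-style lookup skip the track used as the query.
    Tracks with no overlap at all (score 0) are dropped.
    """
    scored = [
        (cosine(query, summary), track_id)
        for track_id, summary in summaries.items()
        if track_id != exclude
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [track_id for score, track_id in scored[:top_n] if score > 0]


# Hypothetical track summaries built from scout reviews.
summaries = {
    "track_a": Counter({("vocal", "sultry"): 3, ("mood", "moody"): 2}),
    "track_b": Counter({("vocal", "uplifting"): 4, ("chorus", "anthemic"): 1}),
    "track_c": Counter({("vocal", "sultry"): 1, ("rhythm", "sweet"): 2}),
}

# A brief, analysed into the same representation.
brief = Counter({("vocal", "sultry"): 1, ("mood", "moody"): 1})
from_brief = rank(brief, summaries)

# The More button: the chosen track's full summary becomes the query.
more_like_a = rank(summaries["track_a"], summaries, exclude="track_a")
print(from_brief, more_like_a)
```

Using the same function for both a brief and a More request mirrors the description above: only the query changes, while the comparison and ordering stay the same.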
Further applications
CFL Search can be applied to other fields where there are multiple descriptions of the same thing; the travel industry is one such example, with reviews of the same holiday destinations, hotels and attractions. The linguistic element of the program remains the same, but the trigger words can be different. CFL Search has an API which has been designed to allow maximum user flexibility. It can interface with databases (SoundOut Search uses SQL Server, for example) and can return results in a variety of formats.