Publications

2017
Abstract: As an emerging sub-field of music information retrieval (MIR), music imagery information retrieval (MIIR) aims to retrieve information from brain activity recorded during music cognition – such as listening to or imagining music pieces. This is a highly inter-disciplinary endeavor that requires expertise in MIR as well as cognitive neuroscience and psychology. The OpenMIIR initiative strives to foster collaborations between these fields to advance the state of the art in MIIR. As a first step, electroencephalography (EEG) recordings of music perception and imagination have been made publicly available, enabling MIR researchers to easily test and adapt their existing approaches for music analysis like fingerprinting, beat tracking or tempo estimation on this new kind of data. This paper reports on first results of MIIR experiments using these OpenMIIR datasets and points out how these findings could drive new research in cognitive neuroscience.
BibTeX:
@article{stober2017frontiers,
  author = {Stober, Sebastian},
  title = {Towards {Studying} {Music} {Cognition} with {Information} {Retrieval} {Techniques}: {Lessons} {Learned} from the {OpenMIIR} {Initiative}},
  journal = {Frontiers in Psychology},
  year = {2017},
  volume = {8},
  note = {to appear},
  url = {http://journal.frontiersin.org/article/10.3389/fpsyg.2017.01255/abstract},
  doi = {10.3389/fpsyg.2017.01255}
}
Abstract: The increase in complexity of Artificial Neural Nets (ANNs) results in difficulties in understanding what they have learned and how they accomplish their goal. As their complexity approaches that of the human brain, neuroscientific techniques could facilitate their analysis. This paper investigates an adaptation of the Event-Related Potential (ERP) technique for analyzing ANNs, demonstrated for a speech recognizer. Our adaptation involves deriving a large number of recordings (trials) for the same word and averaging the resulting neuron activations. This allows for a systematic analysis of neuron activations to reveal their function in detecting specific letters. We compare these observations between an English and a German speech recognizer.
BibTeX:
@inproceedings{krug2017ccn,
  author = {Andreas Krug and Sebastian Stober},
  title = {Adaptation of the Event-Related Potential Technique for Analyzing Artificial Neural Nets},
  booktitle = {Conference on Cognitive Computational Neuroscience (CCN'17)},
  year = {2017}
}
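The ERP-style averaging described above reduces to a simple operation: collect many noisy activation recordings ("trials") for the same input word and average them so that the consistent activation pattern emerges from the noise. The following NumPy sketch is illustrative only; the array shapes and variable names are assumptions, not taken from the paper.

```python
import numpy as np

def erp_average(trials):
    """Average activations over many presentations ("trials") of the same word,
    analogous to averaging EEG epochs into an event-related potential.
    Hypothetical shape: trials is [n_trials, n_timesteps, n_neurons]."""
    return np.asarray(trials, dtype=float).mean(axis=0)

# toy demo: noisy trials sharing one underlying activation pattern
rng = np.random.default_rng(0)
pattern = np.sin(np.linspace(0.0, np.pi, 50))[:, None] * np.ones((50, 8))
trials = pattern + rng.normal(scale=1.0, size=(100, 50, 8))
erp = erp_average(trials)  # noise shrinks roughly by 1/sqrt(100)
```

Averaging 100 trials attenuates uncorrelated noise by a factor of about ten, which is what makes per-neuron response patterns visible at all.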
Abstract: End-to-end training of automated speech recognition (ASR) systems requires massive data and compute resources. We explore transfer learning based on model adaptation as an approach for training ASR models under constrained GPU memory, throughput and training data. We conduct several systematic experiments adapting a Wav2Letter convolutional neural network originally trained for English ASR to the German language. We show that this technique allows faster training on consumer-grade resources while requiring less training data in order to achieve the same accuracy, thereby lowering the cost of training ASR models in other languages. Model introspection revealed that small adaptations to the network’s weights were sufficient for good performance, especially for inner layers.
BibTeX:
@inproceedings{kunzeKKKJS2017acl,
  author = {Kunze, Julius and Kirsch, Louis and Kurenkov, Ilia and Krug, Andreas and Johannsmeier, Jens and Stober, Sebastian},
  title = {Transfer Learning for Speech Recognition on a Budget},
  booktitle = {2nd Workshop on Representation Learning for NLP at the Annual Meeting of the Association for Computational Linguistics (ACL'17)},
  year = {2017},
  url = {https://arxiv.org/abs/1706.00290}
}
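The adaptation idea — reuse pretrained lower-layer features and update only part of the network on the new language — can be illustrated with a minimal NumPy sketch. The two-layer model, sizes, targets, and learning rate below are all hypothetical; Wav2Letter itself is a much deeper convolutional network.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(16, 8))  # stands in for pretrained (English) lower layers; frozen
W2 = rng.normal(size=(8, 4))   # task head, adapted to the "new language" data

def forward(x):
    return np.maximum(x @ W1, 0.0) @ W2  # frozen ReLU features -> adapted linear head

x = rng.normal(size=(32, 16))       # stand-in training batch
target = rng.normal(size=(32, 4))   # stand-in regression targets
init_loss = float(((forward(x) - target) ** 2).mean())

lr = 1e-2
for _ in range(500):
    h = np.maximum(x @ W1, 0.0)
    err = h @ W2 - target
    W2 -= lr * h.T @ err / len(x)  # gradient step on the head only; W1 never changes

final_loss = float(((forward(x) - target) ** 2).mean())
```

Because only the head's parameters receive gradients, each step is cheap and little data is needed — the same budget argument the paper makes for adapting small parts of the network.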
Abstract: This paper introduces a pre-training technique for learning discriminative features from electroencephalography (EEG) recordings using deep neural networks. EEG data are generally only available in small quantities, they are high-dimensional with a poor signal-to-noise ratio, and there is considerable variability between individual subjects and recording sessions. Similarity-constraint encoders as introduced in this paper specifically address these challenges for feature learning. They learn features that allow distinguishing between classes by demanding that encodings of two trials from the same class are more similar to each other than to encoded trials from other classes. This tuple-based training approach is especially suitable for small datasets. The proposed technique is evaluated using the publicly available OpenMIIR dataset of EEG recordings taken while participants listened to and imagined music. For this dataset, a simple convolutional filter can be learned that significantly improves the signal-to-noise ratio while aggregating the 64 EEG channels into a single waveform.
BibTeX:
@inproceedings{stober2017icassp,
  author = {Sebastian Stober},
  title = {Learning Discriminative Features from Electroencephalography Recordings by Encoding Similarity Constraints},
  booktitle = {Proceedings of 42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'17)},
  year = {2017},
  pages = {6175--6179}
}
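The similarity constraint can be written as a triplet-style hinge loss: a trial's encoding must be closer to an encoding from the same class than to one from another class, by some margin. This sketch shows the constraint only, not the paper's full training pipeline; the margin value and encodings are made up.

```python
import numpy as np

def similarity_constraint_loss(enc_a, enc_b, enc_c, margin=1.0):
    """Hinge loss demanding that enc_a lies closer to enc_b (same class)
    than to enc_c (different class); zero when the constraint holds."""
    d_same = np.sum((enc_a - enc_b) ** 2)
    d_diff = np.sum((enc_a - enc_c) ** 2)
    return max(0.0, margin + d_same - d_diff)

a = np.array([0.0, 0.1])
b = np.array([0.1, 0.0])   # same class, nearby -> constraint satisfied
c = np.array([3.0, 3.0])   # other class, far away
loss_ok = similarity_constraint_loss(a, b, c)
loss_bad = similarity_constraint_loss(a, c, b)  # swapped roles violate the constraint
```

During training, gradients of this loss push same-class encodings together and different-class encodings apart, which is why the scheme works even when each class has only a handful of trials.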
Abstract: We compare Visual Berrypicking, an interactive approach allowing users to explore large and highly faceted information spaces using similarity-based two-dimensional maps, with traditional browsing techniques. For large datasets, current projection methods used to generate maplike overviews suffer from increased computational costs and a loss of accuracy resulting in inconsistent visualizations. We propose to interactively align inexpensive small maps, showing local neighborhoods only, which ideally creates the impression of panning a large map. For evaluation, we designed a web-based prototype for movie exploration and compared it to the web interface of The Movie Database (TMDb) in an online user study. Results suggest that users are able to effectively explore large movie collections by hopping from one neighborhood to the next.
BibTeX:
@inproceedings{low2017mmm,
  author = {Thomas Low and Christian Hentschel and Sebastian Stober and Harald Sack and Andreas N{\"u}rnberger},
  title = {Exploring Large Movie Collections: Comparing Visual Berrypicking and Traditional Browsing},
  booktitle = {Proceedings of the 23rd International Conference on MultiMedia Modeling (MMM'17)},
  year = {2017},
  pages = {198--208},
  doi = {10.1007/978-3-319-51814-5_17}
}
2016
BibTeX:
@inbook{stober2016mirbook-chapter,
  author = {Sebastian Stober},
  title = {Music Data Analysis: Foundations and Applications},
  editor = {Claus Weihs and Dietmar Jannach and Igor Vatolkin and G\"{u}nter Rudolph},
  chapter = {Similarity-based Organization of Music Collections},
  publisher = {CRC Press},
  year = {2016},
  isbn = {9781498719568},
  url = {https://www.crcpress.com/Music-Data-Analysis-Foundations-and-Applications/Weihs-Jannach-Vatolkin-Rudolph/p/book/9781498719568}
}
Abstract: This work introduces a pre-training technique for learning discriminative features from electroencephalography (EEG) recordings using deep artificial neural networks. EEG data are generally only available in small quantities, they are high-dimensional with a poor signal-to-noise ratio, and there is considerable variability between individual subjects and recording sessions. Similarity-constraint encoders as introduced here specifically address these challenges for feature learning. They learn features that allow distinguishing between classes by demanding that encodings of two trials from the same class are more similar to each other than to encoded trials from other classes. This tuple-based training approach is especially suitable for small datasets. The proposed technique is evaluated using the publicly available OpenMIIR dataset of EEG recordings taken while 9 subjects listened to 12 short music pieces. For this dataset, a simple convolutional filter can be learned that is stable across subjects and significantly improves the signal-to-noise ratio while aggregating the 64 EEG channels into a single waveform. With this filter, a neural network classifier can be trained that is simple enough to allow for interpretation of the learned parameters by domain experts and facilitate findings about the cognitive processes. Further, a cross-subject classification accuracy of 27% is obtained, with values above 40% for individual subjects.
BibTeX:
@inproceedings{srober2016bc,
  author = {Sebastian Stober},
  title = {Learning Discriminative Features from Electroencephalography Recordings by Encoding Similarity Constraints},
  booktitle = {Bernstein Conference 2016},
  year = {2016},
  doi = {10.12751/nncn.bc2016.0223}
}
Abstract: This paper addresses the question of how music information retrieval techniques originally developed to process audio recordings can be adapted for the analysis of corresponding brain activity data. In particular, we conducted a case study applying beat tracking techniques to extract the tempo from electroencephalography (EEG) recordings obtained from people listening to music stimuli. We point out similarities and differences in processing audio and EEG data and show to what extent the tempo can be successfully extracted from EEG signals. Furthermore, we demonstrate how tempo extraction from EEG signals can be stabilized by applying different fusion approaches to the mid-level tempogram features.
BibTeX:
@inproceedings{stober2016ismir,
  author = {Sebastian Stober and Thomas Pr\"{a}tzlich and Meinard M\"{u}ller},
  title = {Brain Beats: Tempo Extraction from EEG Data},
  booktitle = {17th International Society for Music Information Retrieval Conference (ISMIR'16)},
  year = {2016},
  url = {https://wp.nyu.edu/ismir2016/wp-content/uploads/sites/2294/2016/07/022_Paper.pdf}
}
2015
Abstract: We introduce and compare several strategies for learning discriminative features from electroencephalography (EEG) recordings using deep learning techniques. EEG data are generally only available in small quantities, they are high-dimensional with a poor signal-to-noise ratio, and there is considerable variability between individual subjects and recording sessions. Our proposed techniques specifically address these challenges for feature learning. Cross-trial encoding forces auto-encoders to focus on features that are stable across trials. Similarity-constraint encoders learn features that allow distinguishing between classes by demanding that two trials from the same class are more similar to each other than to trials from other classes. This tuple-based training approach is especially suitable for small datasets. Hydra-nets allow for separate processing pathways adapting to subsets of a dataset and thus combine the advantages of individual feature learning (better adaptation of early, low-level processing) with group model training (better generalization of higher-level processing in deeper layers). This way, models can, for instance, adapt to each subject individually to compensate for differences in spatial patterns due to anatomical differences or variance in electrode positions. The different techniques are evaluated using the publicly available OpenMIIR dataset of EEG recordings taken while participants listened to and imagined music.
BibTeX:
@article{stober2015arXiv:1511.04306,
  author = {Sebastian Stober and Avital Sternin and Adrian M. Owen and Jessica A. Grahn},
  title = {Deep Feature Learning for {EEG} Recordings},
  journal = {arXiv preprint arXiv:1511.04306},
  year = {2015},
  note = {submitted as conference paper for ICLR 2016},
  url = {http://arxiv.org/abs/1511.04306}
}
Abstract: The ISMIR Paper Explorer allows browsing all papers published at ISMIR using a map-based interface where similar papers are close together. The web-based user interface creates the impression of panning a large (global) map by aligning inexpensive small maps showing local neighborhoods. By directed hopping from one neighborhood to the next, the user is able to explore the whole ISMIR paper collection.
BibTeX:
@inproceedings{stober2015ismirlbd,
  author = {Sebastian Stober and Thomas Low and Christian Hentschel and Harald Sack and Andreas N\"{u}rnberger},
  title = {The {ISMIR} Paper Explorer: A Map-Based Interface for {MIR} Literature Research},
  booktitle = {16th International Society for Music Information Retrieval Conference (ISMIR'15) - Late Breaking \& Demo Papers},
  year = {2015},
  url = {http://ismir2015.uma.es/LBD/LBD41.pdf}
}
Abstract: Music imagery information retrieval (MIIR) systems may one day be able to recognize a song from only our thoughts. As a step towards such technology, we are presenting a public domain dataset of electroencephalography (EEG) recordings taken during music perception and imagination. We acquired this data during an ongoing study that so far comprises 10 subjects listening to and imagining 12 short music fragments – each 7-16s long – taken from well-known pieces. These stimuli were selected from different genres and systematically vary along musical dimensions such as meter, tempo and the presence of lyrics. This way, various retrieval scenarios can be addressed and the success of classifying based on specific dimensions can be tested. The dataset aims to enable music information retrieval researchers interested in these new MIIR challenges to easily test and adapt their existing approaches for music analysis like fingerprinting, beat tracking, or tempo estimation on EEG data.
BibTeX:
@inproceedings{stober2015ismir,
  author = {Sebastian Stober and Avital Sternin and Adrian M. Owen and Jessica A. Grahn},
  title = {Towards Music Imagery Information Retrieval: Introducing the OpenMIIR Dataset of {EEG} Recordings from Music Perception and Imagination},
  booktitle = {16th International Society for Music Information Retrieval Conference (ISMIR'15)},
  year = {2015},
  pages = {763--769},
  url = {http://ismir2015.uma.es/articles/224_Paper.pdf}
}
Abstract: The neural processes involved in the perception of music are also involved in imagination. This overlap can be exploited by techniques that attempt to classify the contents of imagination from neural signals, such as signals recorded by EEG. Successful EEG-based classification of what an individual is imagining could pave the way for novel communication technologies, such as brain-computer interfaces. Our study explored whether we could accurately classify perceived and imagined musical stimuli from EEG data. To determine what characteristics of music resulted in the most distinct, and therefore most classifiable, EEG activity, we systematically varied properties of the music. These properties included time signature (3/4 versus 4/4), lyrics (music with lyrics versus music without), tempo (slow versus fast), and instrumentation. Our primary goal was to reliably distinguish between groups of stimuli based on these properties. We recorded EEG with a 64-channel BioSemi system while participants heard or imagined the different musical stimuli. We hypothesized that we would be able to classify which piece was being heard, or being imagined, from the EEG data.
Using principal components analysis, we identified components common to both the perception and imagination conditions. Preliminary analyses show that the time courses of these components are unique to each stimulus and may be used for classification. To investigate other features of the EEG recordings that correlate with stimuli and thus enable accurate classification, we applied a machine learning approach, using deep learning techniques including sparse auto-encoders and convolutional neural networks. This approach has shown promising initial results: we were able to classify stimuli at above chance levels based on their time signature and to estimate the tempo of perceived and imagined music from EEG data. Our findings may ultimately lead to the development of a music-based brain-computer interface.
BibTeX:
@inproceedings{sternin2015smpc,
  author = {Avital Sternin and Sebastian Stober and Adrian M. Owen and Jessica A. Grahn},
  title = {Classifying Perception and Imagination of Music from EEG},
  booktitle = {Society for Music Perception \& Cognition Conference (SMPC'15)},
  year = {2015},
  note = {abstract/poster}
}
Abstract: Electroencephalography (EEG) recordings taken during the perception and the imagination of music contain enough information to estimate the tempo of a musical piece. Five participants listened to and imagined 12 short clips taken from familiar musical pieces — each 7s-16s long. Basic EEG preprocessing techniques were used to remove artifacts and a dynamic beat tracker was used to estimate average tempo. Autocorrelation curves were computed to investigate the periodicity seen in the average EEG waveforms, and the peaks from these curves were found to be proportional to stimulus measure length. As the tempo at which participants imagine may vary over time, we used an aggregation technique that allowed us to estimate an accurate tempo over the course of an entire trial. We propose future directions involving convolutional neural networks (CNNs) that will allow us to apply our results to build a brain-computer interface.
BibTeX:
@inproceedings{sternin2015bcmi,
  author = {Avital Sternin and Sebastian Stober and Jessica A. Grahn and Adrian M. Owen},
  title = {Tempo Estimation from the EEG Signal during Perception and Imagination of Music},
  booktitle = {1st International Workshop on Brain-Computer Music Interfacing / 11th International Symposium on Computer Music Multidisciplinary Research (BCMI/CMMR'15)},
  year = {2015}
}
Abstract: In this paper we describe a novel concept of a search history visualization that is primarily designed for children. We propose to visualize the search history as a treasure map: The treasure map shows a landscape of islands. Each island represents the context of a user query. We visualize visited and unvisited relevant results and bookmarked documents for an issued query on an island. We argue that the treasure map may offer several advantages over existing history mechanisms, such as context awareness, an appropriate metaphor for children, looping visualization, smaller cognitive load and higher efficiency in re-finding information. We discuss design decisions that are important to build such a map and interact with it, and present the first prototype of the map.
BibTeX:
@inproceedings{gossen2015treasuremap,
  author = {Tatiana Gossen and Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Treasure Map: Search History for Young Users},
  booktitle = {5th Workshop on Context-awareness in Retrieval and Recommendation (CaRR'15) in conjunction with the 37th European Conference on Information Retrieval (ECIR'15)},
  year = {2015}
}
2014
Abstract: Electroencephalography (EEG) recordings of rhythm perception might contain enough information to distinguish different rhythm types/genres or even identify the rhythms themselves. We apply convolutional neural networks (CNNs) to analyze and classify EEG data recorded within a rhythm perception study in Kigali, Rwanda which comprises 12 East African and 12 Western rhythmic stimuli – each presented in a loop for 32 seconds to 13 participants. We investigate the impact of the data representation and the pre-processing steps for this classification task and compare different network structures. Using CNNs, we are able to recognize individual rhythms from the EEG with a mean classification accuracy of 24.4% (chance level 4.17%) over all subjects by looking at less than three seconds from a single channel. Aggregating predictions for multiple channels, a mean accuracy of up to 50% can be achieved for individual subjects.
BibTeX:
@inproceedings{stober2014nips,
  author = {Sebastian Stober and Daniel J. Cameron and Jessica A. Grahn},
  title = {Using Convolutional Neural Networks to Recognize Rhythm Stimuli from Electroencephalography Recordings},
  booktitle = {Advances in Neural Information Processing Systems 27 (NIPS'14)},
  year = {2014},
  pages = {1449--1457},
  url = {http://papers.nips.cc/paper/5272-using-convolutional-neural-networks-to-recognize-rhythm-stimuli-from-electroencephalography-recordings}
}
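The channel-aggregation step can be as simple as averaging per-channel class probabilities before taking the argmax. The exact fusion rule used in the paper may differ, and the probabilities below are made up for illustration.

```python
import numpy as np

def aggregate_channel_predictions(probs):
    """Late fusion of per-channel class probabilities
    (shape [n_channels, n_classes]): average, then pick the best class."""
    return int(np.argmax(np.asarray(probs).mean(axis=0)))

# toy example: most channels weakly favor class 2, one channel is confidently wrong
probs = np.array([
    [0.2, 0.2, 0.4, 0.2],
    [0.1, 0.3, 0.4, 0.2],
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.1, 0.5, 0.2],
])
pred = aggregate_channel_predictions(probs)
```

Averaging lets a majority of weakly informative channels outvote a single confidently wrong one, which is why aggregation lifts the single-channel accuracy reported in the abstract.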
Abstract: Exploring image collections using similarity-based two-dimensional maps is an ongoing research area that faces two main challenges: with increasing collection size and similarity-metric complexity, projection accuracy rapidly degrades and computational costs prevent online map generation. We propose a prototype that creates the impression of panning a large (global) map by aligning inexpensive small maps showing local neighborhoods. By directed hopping from one neighborhood to the next, the user is able to explore the whole image collection. Additionally, the similarity metric can be adapted by weighting image features, and thus users benefit from a more informed navigation.
BibTeX:
@inproceedings{low2014nordichi,
  author = {Thomas Low and Christian Hentschel and Sebastian Stober and Harald Sack and Andreas N{\"u}rnberger},
  title = {Visual Berrypicking in Large Image Collections},
  booktitle = {Proceedings of the 8th Nordic Conference on Human-Computer Interaction: Fun, Fast, Foundational (NordiCHI'14)},
  year = {2014},
  pages = {1043--1046},
  url = {http://doi.acm.org/10.1145/2639189.2670271},
  doi = {10.1145/2639189.2670271}
}
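The "aligning inexpensive small maps" step amounts to fitting a rigid transform that carries a freshly computed local projection onto the current view, using the items both maps share as anchors — for instance via an orthogonal Procrustes fit, sketched below on plain 2D point sets. The prototype's actual alignment procedure may differ; all data here is synthetic.

```python
import numpy as np

def align(points, anchors_src, anchors_dst):
    """Rigidly map a local 2D layout onto the current view via the anchor
    points both maps share (least-squares rotation + translation)."""
    src = np.asarray(anchors_src, float)
    dst = np.asarray(anchors_dst, float)
    sc, dc = src.mean(axis=0), dst.mean(axis=0)
    u, _, vt = np.linalg.svd((src - sc).T @ (dst - dc))
    r = u @ vt
    if np.linalg.det(r) < 0:  # keep a proper rotation, no reflection
        u[:, -1] *= -1
        r = u @ vt
    return (np.asarray(points, float) - sc) @ r + dc

# toy check: the new local map is a rotated + shifted copy of the old view
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
old_view = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new_map = old_view @ R + np.array([2.0, -1.0])
aligned = align(new_map, new_map, old_view)  # recovers the old coordinates
```

Because each small map covers only a local neighborhood, this cheap fit is enough to keep shared items in place and create the panning illusion without recomputing a global projection.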
Abstract: Music imagery information retrieval (MIIR) systems may one day be able to recognize a song just as we think of it. As one step towards such technology, we investigate whether rhythms can be identified from an electroencephalography (EEG) recording taken directly after their auditory presentation. The EEG data has been collected during a rhythm perception study in Kigali, Rwanda and comprises 12 East African and 12 Western rhythmic stimuli presented to 13 participants. Each stimulus was presented as a loop for 32 seconds followed by a break of four seconds before the next one started. Using convolutional neural networks (CNNs), we are able to recognize individual rhythms with a mean accuracy of 22.9% over all subjects by just looking at the EEG recorded during the silence between the stimuli.
BibTeX:
@inproceedings{stober2014audiomostly,
  author = {Sebastian Stober and Daniel J. Cameron and Jessica A. Grahn},
  title = {Does the Beat go on? -- Identifying Rhythms from Brain Waves Recorded after Their Auditory Presentation},
  booktitle = {Proceedings of the 9th Audio Mostly: A Conference on Interaction With Sound (AM'14)},
  year = {2014},
  pages = {23:1--23:8},
  url = {http://doi.acm.org/10.1145/2636879.2636904},
  doi = {10.1145/2636879.2636904}
}
BibTeX:
@inproceedings{stober2014ucnc,
  author = {Sebastian Stober},
  title = {Using Deep Learning Techniques to Analyze and Classify {EEG} Recordings},
  booktitle = {Computational Neuroscience Workshop at Unconventional Computation and Natural Computation Conference (UCNC'14)},
  year = {2014},
  note = {abstract/poster}
}
Abstract: Electroencephalography (EEG) recordings of rhythm perception might contain enough information to distinguish different rhythm types/genres or even identify the rhythms themselves. In this paper, we present first classification results using deep learning techniques on EEG data recorded within a rhythm perception study in Kigali, Rwanda. We tested 13 adults, mean age 21, who performed three behavioral tasks using rhythmic tone sequences derived from either East African or Western music. For the EEG testing, 24 rhythms – half East African and half Western with identical tempo and based on a 2-bar 12/8 scheme – were each repeated for 32 seconds. During presentation, the participants’ brain waves were recorded via 14 EEG channels. We applied stacked denoising autoencoders and convolutional neural networks on the collected data to distinguish African and Western rhythms on a group and individual participant level. Furthermore, we investigated how far these techniques can be used to recognize the individual rhythms.
BibTeX:
@inproceedings{stober2014ismir,
  author = {Sebastian Stober and Daniel J. Cameron and Jessica A. Grahn},
  title = {Classifying {EEG} Recordings of Rhythm Perception},
  booktitle = {15th International Society for Music Information Retrieval Conference (ISMIR'14)},
  year = {2014},
  pages = {649--654},
  url = {http://www.terasoft.com.tw/conf/ismir2014/proceedings/T117_317_Paper.pdf}
}
Abstract: In this paper, we explore alternative ways to visualize search results for children. We propose a novel search result visualization using characters. The main idea is to represent each web document as a character where a character visually provides clues about the webpage’s content. We focused on children between six and twelve as a target user group. Following the user-centered development approach, we conducted a preliminary user study to determine how children would represent a webpage as a sketch based on a given template of a character. Using the study results, the first prototype of a search engine was developed. We evaluated the search interface on a touchpad and a touch table in a second user study and analyzed users’ satisfaction and preferences.
BibTeX:
@inproceedings{gossen2014idc,
  author = {Gossen, Tatiana and M\"{u}ller, Rene and Stober, Sebastian and N\"{u}rnberger, Andreas},
  title = {Search Result Visualization with Characters for Children},
  booktitle = {Proceedings of the 2014 Conference on Interaction Design and Children},
  address = {New York, NY, USA},
  publisher = {ACM},
  year = {2014},
  series = {IDC '14},
  pages = {125--134},
  isbn = {9781450322720},
  url = {http://doi.acm.org/10.1145/2593968.2593983},
  doi = {10.1145/2593968.2593983}
}
BibTeX:
@book{amr2012proceedings,
  title = {Adaptive Multimedia Retrieval: Semantics, Context, and Adaptation},
  editor = {Andreas N\"{u}rnberger and Sebastian Stober and Birger Larsen and Marcin Detyniecki},
  publisher = {Springer International Publishing},
  year = {2014},
  series = {LNCS},
  volume = {8382},
  url = {http://link.springer.com/book/10.1007%2F978-3-319-12093-5},
  doi = {10.1007/978-3-319-12093-5}
}
2013
BibTeX:
@proceedings{mit2013,
  title = {Tagungsband der Magdeburger-Informatik-Tage, 2. Doktorandentagung 2013, MIT 2013},
  editor = {Robert Buchholz and Georg Krempl and Claudia Krull and Eike Schallehn and Sebastian Stober and Frank Ortmeier and Sebastian Zug},
  publisher = {Magdeburg University},
  year = {2013},
  isbn = {9783940961969}
}
Abstract: Map-based visualizations — sometimes also called projections — are a popular means for exploring music collections. But how useful are they if the collection is not static but grows over time? Ideally, a map that a user is already familiar with should be altered as little as possible and only as much as necessary to reflect the changes of the underlying collection. This paper demonstrates to what extent existing approaches are able to incrementally integrate new songs into existing maps and discusses their technical limitations. To this end, Growing Self-Organizing Maps, (Landmark) Multidimensional Scaling, Stochastic Neighbor Embedding, and the Neighbor Retrieval Visualizer are considered. The different algorithms are experimentally compared based on objective quality measurements as well as in a user study with an interactive user interface. In the experiments, the well-known Beatles corpus comprising the 180 songs from the twelve official albums is used — adding one album at a time to the collection.
BibTeX:
@inproceedings{stober2013ismir,
  author = {Sebastian Stober and Thomas Low and Tatiana Gossen and Andreas N\"{u}rnberger},
  title = {Incremental Visualization of Growing Music Collections},
  booktitle = {14th International Conference on Music Information Retrieval (ISMIR'13)},
  year = {2013},
  pages = {433--438},
  url = {http://www.ppgia.pucpr.br/ismir2013/wp-content/uploads/2013/09/40_Paper.pdf}
}
Abstract: Children need advanced support during web search or related interactions with computer systems. Here, a voice-controlled search engine offers several benefits. Children who have difficulties in writing will not make spelling errors when using voice control. Voice control is a natural input method and is supposed to be easier for children to use than a keyboard or mouse. To integrate suitable voice control into search engines, it is necessary to understand children’s behavior. Therefore, we investigate children’s speech patterns and interaction tactics during a web search using a voice-controlled search engine. A user study in the form of a Wizard-of-Oz experiment was conducted, and we found that children are motivated to use voice-controlled search engines. However, voice control in combination with touch interactions should be possible as well. Furthermore, the analysis of the speech patterns suggests that it is possible to build a speech recognition program. The results of this study can serve as a foundation for developing voice-controlled search dialogues for young users.
BibTeX:
@inproceedings{gossen2013hcir,
  author = {Tatiana Gossen and Michael Kotzyba and Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Voice-Controlled Search User Interfaces for Young Users},
  booktitle = {7th annual Symposium on Human-Computer Interaction and Information Retrieval},
  address = {New York, NY, USA},
  year = {2013}
}
BibTeX:
@book{amr2011proceedings,
  title = {Adaptive Multimedia Retrieval. Large-Scale Multimedia Retrieval and Evaluation},
  editor = {Marcin Detyniecki and Ana Garc\'{\i}a-Serrano and Andreas N{\"u}rnberger and Sebastian Stober},
  address = {Berlin / Heidelberg},
  publisher = {Springer Verlag},
  year = {2013},
  series = {LNCS},
  volume = {7836},
  url = {http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-3-642-37424-1},
  doi = {10.1007/978-3-642-37425-8}
}
BibTeX:
@article{Anglade:2013:RDL:2492334.2492343,
  author = {Anglade, Am{\'e}lie and Humphrey, Eric and Schmidt, Erik and Stober, Sebastian and Sordo, Mohamed},
  title = {Demos and Late-Breaking Session of the Thirteenth International Society for Music Information Retrieval Conference (ISMIR 2012)},
  address = {Cambridge, MA, USA},
  month = jun,
  journal = {Computer Music Journal},
  publisher = {MIT Press},
  year = {2013},
  volume = {37},
  number = {2},
  pages = {91--93},
  url = {http://dx.doi.org/10.1162/COMJ_r_00171},
  doi = {10.1162/COMJ_r_00171}
}
Abstract: In this work, we investigate techniques for voice-controlled interaction with search engines for young users. Voice control has many advantages for children. For example, the user’s emotional state can be recognized from speech and used to support the search. In the following, we present the results of a Wizard-of-Oz experiment in which children operated a search system via voice commands. The results of this study provide a foundation for developing voice-controlled search dialogues for children.
BibTeX:
@inproceedings{gossen2013gi,
  author = {Tatiana Gossen and Michael Kotzyba and Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Sprachgesteuerte Benutzerschnittstellen zur Suche f\"{u}r junge Nutzer},
  booktitle = {43. Jahrestagung der Gesellschaft f\"{u}r Informatik},
  year = {2013}
}
Abstract: With the development of more and more sophisticated Music Information Retrieval (MIR) approaches, aspects of adaptivity are becoming an increasingly important research topic. Even though adaptive techniques have already found their way into MIR systems and contribute to robustness or user satisfaction, they are not always identified as such. This paper attempts a structured view on the last decade of MIR research from the perspective of adaptivity in order to increase awareness and promote the application and further development of adaptive techniques. To this end, different approaches from a wide range of application areas that share the common aspect of adaptivity are identified and systematically categorized.
BibTeX:
@article{stober2013mtap,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Adaptive Music Retrieval - A State of the Art},
  journal = {Multimedia Tools and Applications},
  year = {2013},
  volume = {65},
  number = {3},
  pages = {467--494},
  url = {http://link.springer.com/article/10.1007%2Fs11042-012-1042-z},
  doi = {10.1007/s11042-012-1042-z}
}
Abstract: The hubness phenomenon, as it was recently described, consists in the observation that for increasing dimensionality of a data set the distribution of the number of times a data point occurs among the k nearest neighbors of other data points becomes increasingly skewed to the right. As a consequence, so-called hubs emerge, that is, data points that appear in the lists of the k nearest neighbors of other data points much more often than others. In this paper we challenge the hypothesis that the hubness phenomenon is an effect of the dimensionality of the data set and provide evidence that it is rather a boundary effect or, more generally, an effect of a density gradient. As such, it may be seen as an artifact that results from the process in which the data is generated that is used to demonstrate this phenomenon. We report experiments showing that the hubness phenomenon need not occur in high-dimensional data and can be made to occur in low-dimensional data.
BibTeX:
@inproceedings{low2013hubness,
  author = {Thomas Low and Christian Borgelt and Sebastian Stober and Andreas N\"{u}rnberger},
  title = {The Hubness Phenomenon: Fact or Artifact?},
  editor = {Christian Borgelt and Maria \'{A}ngeles Gil and Jo\~{a}o M.C. Sousa and Michel Verleysen},
  booktitle = {Towards Advanced Data Analysis by Combining Soft Computing and Statistics},
  publisher = {Springer Berlin / Heidelberg},
  year = {2013},
  series = {Studies in Fuzziness and Soft Computing},
  volume = {285},
  pages = {267--278},
  doi = {10.1007/978-3-642-30278-7_21}
}
2012
Abstract: Most existing Music Information Retrieval (MIR) technologies require a user to use a query interface to search for a musical document. The mental image of the desired music is likely much richer than what the user is able to express through any query interface. This expressivity bottleneck could be circumvented if it was possible to directly read the music query from the user’s mind. To the authors’ knowledge, no such attempt has been made in the field of MIR so far. However, there have been recent advances in cognitive neuroscience that suggest such a system might be possible. Given these new insights, it seems promising to extend the focus of MIR by including music imagery – possibly forming a sub-discipline which could be called Music Imagery Information Retrieval (MIIR). As a first effort, there has been a dedicated session at the Late-Breaking & Demos event at the ISMIR 2012 conference. This paper aims to stimulate research in the field of MIIR by laying a roadmap for future work.
BibTeX:
@inproceedings{ismir2012miir,
  author = {Sebastian Stober and Jessica Thompson},
  title = {Music Imagery Information Retrieval: Bringing the Song on Your Mind back to Your Ears},
  booktitle = {13th International Conference on Music Information Retrieval (ISMIR'12) - Late-Breaking \& Demo Papers},
  year = {2012}
}
Abstract: In order to support individual user perspectives and different retrieval tasks, music similarity can no longer be considered as a static element of Music Information Retrieval (MIR) systems. Various approaches have been proposed recently that allow dynamic adaptation of music similarity measures. This paper provides a systematic comparison of algorithms for metric learning and higher-level facet distance weighting on the MagnaTagATune dataset. A cross-validation variant taking into account clip availability is presented. Applied to user-generated similarity data, its effect on adaptation performance is analyzed. Special attention is paid to the amount of training data necessary for making similarity predictions on unknown data, the number of model parameters and the amount of information available about the music itself.
BibTeX:
@inproceedings{ismir2012stober,
  author = {Daniel Wolff and Sebastian Stober and Andreas N\"urnberger and Tillman Weyde},
  title = {A Systematic Comparison of Music Similarity Adaptation Approaches},
  booktitle = {13th International Conference on Music Information Retrieval (ISMIR'12)},
  year = {2012},
  pages = {103--108}
}
Abstract: Music Information Retrieval (MIR) systems have to process multi-faceted information and at the same time deal with heterogeneous users. Especially when the task is to organize a music collection, the diverse perspectives of users, caused by their different levels of expertise, musical background and taste, pose a great challenge. This challenge is addressed here by proposing adaptive methods for several elements of MIR systems: Data-adaptive feature extraction techniques are described that aim to improve the quality and robustness of the information extracted from audio recordings. The classical genre classification problem is approached from a novel user-centric perspective – building on the idea of idiosyncratic genres, which better reflect a user’s personal listening habits. An adaptive visualization technique for the exploration and organization of music collections is developed that specifically addresses projection errors, a widespread and unavoidable problem of dimensionality reduction techniques. Furthermore, it is outlined how this technique can be employed to improve the interestingness of music recommendations and to enable novel gaze-based interaction techniques. Finally, a general approach to adaptive music similarity is presented, which serves as the core of a variety of adaptive MIR applications. The applicability of the described methods is demonstrated with several application prototypes.
BibTeX:
@inproceedings{gi-diss2011stober,
  author = {Sebastian Stober},
  title = {Adaptive Verfahren zur nutzerzentrierten Organisation von Musiksammlungen},
  editor = {Steffen H{\"o}lldobler and Abraham Bernstein and Klaus-Peter L{\"o}hr and Paul Molitor and Gustaf Neumann and R{\"u}diger Reischuk and Myra Spiliopoulou and Harald St{\"o}rrle and Dorothea Wagner},
  booktitle = {Ausgezeichnete Informatikdissertationen 2011},
  address = {Bonn},
  publisher = {Gesellschaft f{\"u}r Informatik},
  year = {2012},
  series = {Lecture Notes in Informatics (LNI)},
  volume = {D-12},
  pages = {211--220},
  isbn = {978-3-88579-416-5},
  note = {in German}
}
Abstract: Surprising a user with unexpected and fortunate recommendations is a key challenge for recommender systems. Motivated by the concept of bisociations, we propose ways to create an environment where such serendipitous recommendations become more likely. As application domain we focus on music recommendation using MusicGalaxy, an adaptive user-interface for exploring music collections. It leverages a non-linear multi-focus distortion technique that adaptively highlights related music tracks in a projection-based collection visualization depending on the current region of interest. While originally developed to alleviate the impact of inevitable projection errors, it can also adapt according to user-preferences. We discuss how using this technique beyond its original purpose can create distortions of the visualization that facilitate bisociative music discovery.
BibTeX:
@inproceedings{stober2012bison,
  author = {Sebastian Stober and Stefan Haun and Andreas N\"urnberger},
  title = {Bisociative Music Discovery and Recommendation},
  editor = {Michael R. Berthold},
  booktitle = {Bisociative Knowledge Discovery},
  publisher = {Springer Berlin / Heidelberg},
  year = {2012},
  series = {Lecture Notes in Computer Science},
  volume = {7250},
  pages = {472--483},
  isbn = {978-3-642-31829-0},
  doi = {10.1007/978-3-642-31830-6_33}
}
Abstract: Personalized and user-aware systems for retrieving multimedia items are becoming increasingly important as the amount of available multimedia data has been spiraling. A personalized system is one that incorporates information about the user into its data processing part (e.g., a particular user taste for a movie genre). A context-aware system, in contrast, takes into account dynamic aspects of the user context when processing the data (e.g., location and time where/when a user issues a query). Today’s user-adaptive systems often incorporate both aspects.
Particularly focusing on the music domain, this article gives an overview of different aspects we deem important to build personalized music retrieval systems. In this vein, we first give an overview of factors that influence the human perception of music. We then propose and discuss various requirements for a personalized, user-aware music retrieval system. Eventually, the state-of-the-art in building such systems is reviewed, taking in particular aspects of “similarity” and “serendipity” into account.
BibTeX:
@incollection{schedl2012user-aware,
  author = {Markus Schedl and Sebastian Stober and Emilia G{\'o}mez and Nicola Orio and Cynthia C.S. Liem},
  title = {User-Aware Music Retrieval and Recommendation},
  editor = {Meinard M{\"u}ller and Masataka Goto and Markus Schedl},
  booktitle = {Multimodal Music Processing},
  address = {Dagstuhl, Germany},
  publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  year = {2012},
  series = {Dagstuhl Follow-Ups},
  volume = {3},
  pages = {135--156},
  isbn = {978-3-939897-37-8},
  url = {http://drops.dagstuhl.de/opus/volltexte/2012/3470},
  doi = {10.4230/DFU.Vol3.11041.135}
}
2011
Abstract: Music Information Retrieval (MIR) systems have to deal with multi-faceted music information and very heterogeneous users. Especially when the task is to organize a music collection, the diverse perspectives of users caused by their different level of expertise, musical background or taste pose a great challenge. This challenge is addressed in this book by proposing adaptive methods for several elements of MIR systems: Data-adaptive feature extraction techniques are described that aim to increase the quality and robustness of the information extracted from audio recordings. The classical genre classification problem is approached from a novel user-centric perspective – promoting the idea of idiosyncratic genres that better reflect a user’s personal listening habits. An adaptive visualization technique for exploration and organization of music collections is elaborated that especially addresses the common and inevitable problem of projection errors introduced by dimensionality reduction approaches. Furthermore, it is outlined how this technique can be applied to facilitate serendipitous music discoveries in a recommendation scenario and to enable novel gaze-supported interaction techniques. Finally, a general approach for adaptive music similarity is presented which serves as the core of many adaptive MIR applications. Application prototypes demonstrate the usability of the described approaches.
BibTeX:
@phdthesis{stober2011thesis,
  author = {Sebastian Stober},
  title = {Adaptive Methods for User-Centered Organization of Music Collections},
  type = {Dissertation},
  address = {Magdeburg, Germany},
  month = {Nov},
  school = {Otto-von-Guericke-University},
  year = {2011},
  note = {published by Dr. Hut Verlag, ISBN 978-3-8439-0229-8},
  url = {http://www.dr.hut-verlag.de/978-3-8439-0229-8.html}
}
BibTeX:
@book{amr2010proceedings,
  title = {Adaptive Multimedia Retrieval. Context, Exploration and Fusion},
  editor = {Detyniecki, Marcin and Knees, Peter and N\"{u}rnberger, Andreas and Schedl, Markus and Stober, Sebastian},
  address = {Berlin / Heidelberg},
  publisher = {Springer Verlag},
  year = {2011},
  series = {LNCS},
  volume = {6817},
  url = {http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-3-642-27168-7},
  doi = {10.1007/978-3-642-27169-4}
}
Abstract: Surprising a user with unexpected and fortunate recommendations is a key challenge for recommender systems. Motivated by the concept of bisociations, we propose ways to create an environment where such serendipitous recommendations become more likely. As application domain we focus on music recommendation using MusicGalaxy, an adaptive user-interface for exploring music collections. It leverages a non-linear multi-focus distortion technique that adaptively highlights related music tracks in a projection-based collection visualization depending on the current region of interest. While originally developed to alleviate the impact of inevitable projection errors, it can also adapt according to user-preferences. We discuss how using this technique beyond its original purpose can create distortions of the visualization that facilitate bisociative music discovery.
BibTeX:
@inproceedings{audiomostly2011stober,
  author = {Sebastian Stober and Stefan Haun and Andreas N\"{u}rnberger},
  title = {Creating an Environment for Bisociative Music Discovery and Recommendation},
  booktitle = {Proceedings of Audio Mostly 2011 -- 6th Conference on Interaction with Sound -- Extended Abstracts},
  address = {Coimbra, Portugal},
  month = {Sep},
  year = {2011},
  pages = {1--6}
}
Abstract: Similarity plays an important role in many multimedia retrieval applications. However, it often has many facets and its perception is highly subjective — very much depending on a person’s background or retrieval goal. In previous work, we have developed various approaches for modeling and learning individual distance measures as a weighted linear combination of multiple facets in different application scenarios. Based on a generalized view of these approaches as an optimization problem guided by generic relative distance constraints, we describe ways to address the problem of constraint violations and finally compare the different approaches against each other. To this end, a comprehensive experiment using the Magnatagatune benchmark dataset is conducted.
BibTeX:
@inproceedings{amr2011stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {An Experimental Comparison of Similarity Adaptation Approaches},
  editor = {Marcin Detyniecki and Ana Garc\'{i}a-Serrano and Andreas N\"{u}rnberger and Sebastian Stober},
  booktitle = {Adaptive Multimedia Retrieval: Large Scale Multimedia Retrieval and Evaluation},
  address = {Berlin / Heidelberg},
  publisher = {Springer Verlag},
  year = {2011},
  series = {LNCS},
  volume = {7836},
  pages = {99--116},
  doi = {10.1007/978-3-642-37425-8_8}
}
Abstract: Music similarity plays an important role in many Music Information Retrieval applications. However, it has many facets and its perception is highly subjective – very much depending on a person’s background or task. This paper presents a generalized approach to modeling and learning individual distance measures for comparing music pieces based on multiple facets that can be weighted. The learning process is described as an optimization problem guided by generic distance constraints. Three application scenarios with different objectives exemplify how the proposed method can be employed in various contexts by deriving distance constraints either from domain-specific expert information or user actions in an interactive setting.
BibTeX:
@inproceedings{aes42i2011stober,
  author = {Sebastian Stober},
  title = {Adaptive Distance Measures for Exploration and Structuring of Music Collections},
  booktitle = {Proceedings of AES 42nd Conference on Semantic Audio},
  address = {Ilmenau, Germany},
  month = {Jul},
  year = {2011},
  pages = {275--284},
  url = {http://www.aes.org/e-lib/browse.cfm?elib=15952}
}
Abstract: Some popular algorithms used in Music Information Retrieval (MIR) such as Self-Organizing Maps (SOMs) require the objects they process to be represented as vectors, i.e. elements of a vector space. This is a rather severe restriction and if the data does not adhere to it, some means of vectorization is required. As a common practice, the full distance matrix is computed and each row of the matrix interpreted as an artificial feature vector. This paper empirically investigates the impact of this transformation. Further, an alternative approach for vectorization based on Multidimensional Scaling is proposed that is able to better preserve the actual distance relations of the objects which is essential for obtaining a good retrieval performance.
BibTeX:
@inproceedings{admire2011stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Analyzing the Impact of Data Vectorization on Distance Relations},
  booktitle = {Multimedia and Expo (ICME), 2011 IEEE International Conference on},
  address = {Barcelona, Spain},
  month = {Jul},
  year = {2011},
  pages = {1--6},
  note = {part of Proceedings of 3rd International Workshop on Advances in Music Information Research (AdMIRe'11)},
  doi = {10.1109/ICME.2011.6012134}
}
Abstract: While eye tracking is becoming more and more relevant as a promising input channel, diverse applications using gaze control in a more natural way are still rather limited. Though several researchers have indicated the particularly high potential of gaze-based interaction for pointing tasks, often gaze-only approaches are investigated. However, time-consuming dwell-time activations limit this potential. To overcome this, we present a gaze-supported fisheye lens in combination with (1) a keyboard and (2) a tilt-sensitive mobile multitouch device. In a user-centered design approach, we elicited how users would use the aforementioned input combinations. Based on the received feedback we designed a prototype system for the interaction with a remote display using gaze and a touch-and-tilt device. This eliminates gaze dwell-time activations and the well-known Midas Touch problem (unintentionally issuing an action via gaze). A formative user study testing our prototype provided further insights into how well the elaborated gaze-supported interaction techniques were experienced by users.
BibTeX:
@inproceedings{ngca2011stellmach,
  author = {Sophie Stellmach and Sebastian Stober and Raimund Dachselt and Andreas N\"{u}rnberger},
  title = {Designing Gaze-supported Multimodal Interactions for the Exploration of Large Image Collections},
  booktitle = {Proceedings of 1st International Conference on Novel Gaze-Controlled Applications (NGCA'11)},
  address = {Karlskrona, Sweden},
  month = {May},
  year = {2011},
  pages = {1--8},
  note = {Best Paper Award},
  doi = {10.1145/1983302.1983303}
}
Abstract: Sometimes users of a multimedia retrieval system are not able to explicitly state their information need. They rather want to browse a collection in order to get an overview and to discover interesting content. Exploratory retrieval tools support users in such search scenarios, where the retrieval goal cannot be stated explicitly as a query. In previous work, we have presented Adaptive SpringLens — an interactive visualization technique building upon popular neighborhood-preserving projections of multimedia collections. It uses a complex multi-focus fish-eye distortion of a projection to visualize neighborhood that is automatically adapted to the user’s current focus of interest. This paper investigates how far knowledge about the retrieval task collected during interaction can be used to adapt the underlying similarity measure that defines the neighborhoods.
BibTeX:
@inproceedings{amr2010stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Similarity Adaptation in an Exploratory Retrieval Scenario},
  editor = {Detyniecki, Marcin and Knees, Peter and N\"{u}rnberger, Andreas and Schedl, Markus and Stober, Sebastian},
  booktitle = {Adaptive Multimedia Retrieval. Context, Exploration, and Fusion},
  publisher = {Springer Berlin / Heidelberg},
  year = {2011},
  series = {Lecture Notes in Computer Science},
  volume = {6817},
  pages = {144--158},
  isbn = {978-3-642-27168-7},
  doi = {10.1007/978-3-642-27169-4_11}
}
Abstract: A common way to support exploratory music retrieval scenarios is to give an overview using a neighborhood-preserving projection of the collection onto two dimensions. However, neighborhood cannot always be preserved in the projection because of the inherent dimensionality reduction. Furthermore, there is usually more than one way to look at a music collection and therefore different projections might be required depending on the current task and the user’s interests. We describe an adaptive zoomable interface for exploration that addresses both problems: It makes use of a complex non-linear multi-focal zoom lens that exploits the distorted neighborhood relations introduced by the projection. We further introduce the concept of facet distances representing different aspects of music similarity. User-specific weightings of these aspects allow an adaptation according to the user’s way of exploring the collection. Following a user-centered design approach with focus on usability, a prototype system has been created by iteratively alternating between development and evaluation phases. The results of an extensive user study including gaze analysis using an eye-tracker prove that the proposed interface is helpful while at the same time being easy and intuitive to use.
BibTeX:
@inproceedings{cmmrext2010stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {MusicGalaxy: A Multi-focus Zoomable Interface for Multi-facet Exploration of Music Collections},
  editor = {Ystad, S{\o}lvi and Aramaki, Mitsuko and Kronland-Martinet, Richard and Jensen, Kristoffer},
  booktitle = {Exploring Music Contents},
  address = {Berlin / Heidelberg},
  publisher = {Springer Verlag},
  year = {2011},
  series = {LNCS},
  volume = {6684},
  pages = {273--302},
  note = {extended paper for post-proceedings of 7th International Symposium on Computer Music Modeling and Retrieval (CMMR'10)},
  doi = {10.1007/978-3-642-23126-1_18}
}
2010
Abstract: Sometimes users of a multimedia retrieval system are not able to explicitly state their information need. They rather want to browse a collection in order to get an overview and to discover interesting content. In previous work, we have presented a novel interface implementing a fish-eye-based approach for browsing high-dimensional multimedia data that has been projected onto display space. The impact of projection errors is alleviated by introducing an adaptive non-linear multi-focus zoom lens. This work describes the evaluation of our approach in a user study where participants are asked to solve an exploratory image retrieval task using the SpringLens interface. As a baseline, the usability of the interface is compared to a common pan-and-zoom-based interface. The results of a survey and the analysis of recorded screencasts and eye tracking data are presented.
BibTeX:
@inproceedings{nordichi2010stober,
  author = {Sebastian Stober and Christian Hentschel and Andreas N\"{u}rnberger},
  title = {Evaluation of Adaptive SpringLens - A Multi-focus Interface for Exploring Multimedia Collections},
  booktitle = {Proceedings of 6th Nordic Conference on Human-Computer Interaction (NordiCHI'10)},
  address = {Reykjavik, Iceland},
  month = {Oct},
  year = {2010},
  pages = {785--788},
  doi = {10.1145/1868914.1869029}
}
Abstract: Aspects of individualization have so far been only a minor issue of research in the field of Music Information Retrieval (MIR). Often, it is assumed that all users of a MIR system compare music in the same (objective) manner. In order to ease access to steadily growing music collections, MIR systems should however be able to adapt to their users: E.g., an adaptive structuring of a collection becomes intuitively understandable and user-adaptive genre labels more meaningful. In the first part of this talk, the general concept of adaptive systems is explained briefly. Afterwards, several approaches for incorporating adaptivity into MIR systems that are covered in the PhD project are pointed out. The second part of the talk focuses on MusicGalaxy — an adaptive visualization technique for the exploration of large music collections.
BibTeX:
@misc{dday2010stober,
  author = {Sebastian Stober},
  title = {Adaptive User-Centered Organization of Music Archives},
  month = {Jul},
  year = {2010},
  howpublished = {Talk at Doktorandentag, Faculty of Computer Science, Otto-von-Guericke-University Magdeburg}
}
Abstract: Sometimes users of a music retrieval system are not able to explicitly state what they are looking for. They rather want to browse a collection in order to get an overview and to discover interesting content. A common approach for browsing a collection relies on a similarity-preserving projection of objects (tracks, albums or artists) onto the (typically two-dimensional) display space. Inevitably, this implicates the use of dimension reduction techniques that cannot always preserve neighborhood and thus introduce distortions of the similarity space. MusicGalaxy is an interface for exploring large music collections (on the track level) using a galaxy metaphor that addresses the problem of distorted neighborhoods. Furthermore, the interface allows to adapt the underlying similarity measure to the user’s way of comparing tracks by weighting different facets of music similarity.
BibTeX:
@inproceedings{ismir2010stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {MusicGalaxy - An Adaptive User-Interface for Exploratory Music Retrieval},
  booktitle = {11th International Conference on Music Information Retrieval (ISMIR'10) - Late Breaking Demo Papers},
  address = {Utrecht, Netherlands},
  month = {Aug},
  year = {2010},
  url = {http://ismir2010.ismir.net/proceedings/late-breaking-demo-08.pdf}
}
Abstract: Visualization by projection or automatic structuring is one means to ease access to document collections, be it for exploration or organization. Of even greater help would be a presentation that adapts to the user’s individual way of structuring, which would be intuitively understandable. Meanwhile, several approaches have been proposed that try to support a user in this interactive organization and retrieval task. However, the evaluation of such approaches is still cumbersome and is usually done by expensive user studies. Therefore, we propose a framework for evaluation that simulates different kinds of structuring behavior of users, in order to evaluate the quality of the underlying adaptation algorithms.
BibTeX:
@inproceedings{simint2010stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Automatic Evaluation of User Adaptive Interfaces for Information Organization and Exploration},
  booktitle = {SIGIR Workshop on the Simulation of Interaction (SimInt'10)},
  address = {Geneva, Switzerland},
  month = {Jul},
  year = {2010},
  pages = {33--34},
  url = {http://www.mansci.uwaterloo.ca/~msmucker/publications/simint10proceedings.pdf}
}
Abstract: Sometimes users of a music retrieval system are not able to explicitly state what they are looking for. They rather want to browse a collection in order to get an overview and to discover interesting content. A common approach for browsing a collection relies on a similarity-preserving projection of objects (tracks, albums or artists) onto the (typically two-dimensional) display space. Inevitably, this implicates the use of dimension reduction techniques that cannot always preserve neighborhood and thus introduce distortions of the similarity space. This paper describes ongoing work on MusicGalaxy — an interactive user-interface based on an adaptive non-linear multi-focus zoom lens that alleviates the impact of projection distortions. Furthermore, the interface allows manipulation of the neighborhoods as well as the projection by weighting different facets of music similarity. This way the visualization can be adapted to the user’s way of exploring the collection. Apart from the current interface prototype, findings from early evaluations are presented.
BibTeX:
@inproceedings{smc2010stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {{MusicGalaxy} - An Adaptive User-Interface for Exploratory Music Retrieval},
  booktitle = {Proceedings of 7th Sound and Music Computing Conference (SMC'10)},
  address = {Barcelona, Spain},
  month = {Jul},
  year = {2010},
  pages = {382--389}
}
Abstract: Viele Ansätze zur Visualisierung einer Musiksammlung basieren auf Techniken, bei denen Objekte (Musikstücke, Alben oder Künstler) aus einem hochdimensionalen Merkmalsraum für die Darstellung in den 2- oder 3-dimensionalen Raum projiziert werden. Dabei kommt es zwangsläufig zu Verzerrungen der Abstände. Als Folge kann es vorkommen, dass benachbarte Objekte sich gar nicht so sehr ähneln, wie es die Darstellung vermuten lässt, oder weit von einander entfernte Objekte sehr ähnlich sind. In diesem Beitrag wird eine interaktive Visualisierung vorgestellt, die eine globale Sicht auf eine Musiksammlung ermöglicht und mit adaptiven Filterfunktionen und multifokalem Zoom die beschriebenen Verzerrungsprobleme gezielt adressiert.
BibTeX:
@inproceedings{daga2010stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Visualisierung von gro{\ss}en Musiksammlungen unter Ber\"{u}cksichtigung projektionsbedingter Verzerrungen},
  booktitle = {36. Jahrestagung f\"{u}r Akustik DAGA 2010, Berlin},
  address = {Berlin, Germany},
  month = {Mar},
  publisher = {German Acoustical Society (DEGA)},
  year = {2010},
  pages = {571--572},
  note = {in German}
}
Abstract: A common way to support exploratory music retrieval scenarios is to give an overview using a neighborhood-preserving projection of the collection onto two dimensions. However, neighborhood cannot always be preserved in the projection because of the dimensionality reduction. Furthermore, there is usually more than one way to look at a music collection and therefore different projections might be required depending on the current task and the user’s interests. We describe an adaptive zoomable interface for exploration that addresses both problems: It makes use of a complex non-linear multi-focal zoom lens that exploits the distorted neighborhood relations introduced by the projection. We further introduce the concept of facet distances representing different aspects of music similarity. Given user-specific weightings of these aspects, the system can adapt to the user’s way of exploring the collection by manipulation of the neighborhoods as well as the projection.
BibTeX:
@inproceedings{cmmr2010stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {A Multi-Focus Zoomable Interface for Multi-Facet Exploration of Music Collections},
  booktitle = {Proceedings of 7th International Symposium on Computer Music Modeling and Retrieval (CMMR'10)},
  address = {Malaga, Spain},
  month = {Jun},
  year = {2010},
  pages = {339--354}
}
Abstract: Sometimes it is not possible for a user to state a retrieval goal explicitly a priori. One common way to support such exploratory retrieval scenarios is to give an overview using a neighborhood-preserving projection of the collection onto two dimensions. However, neighborhood cannot always be preserved in the projection because of the dimensionality reduction. Further, there is usually more than one way to look at a collection of images — and diversity grows with the number of features that can be extracted. We describe an adaptive zoomable interface for exploration that addresses both problems: It makes use of a complex non-linear multi-focal zoom lens that exploits the distorted neighborhood relations introduced by the projection. We further introduce the concept of facet distances representing different aspects of image similarity. Given user-specific weightings of these aspects, the system can adapt to the user’s way of exploring the collection by manipulation of the neighborhoods as well as the projection.
BibTeX:
@inproceedings{wcci2010stober,
  author = {Sebastian Stober and Christian Hentschel and Andreas N\"{u}rnberger},
  title = {Multi-Facet Exploration of Image Collections with an Adaptive Multi-Focus Zoomable Interface},
  booktitle = {Proceedings of 2010 IEEE World Congress on Computational Intelligence (WCCI'10)},
  address = {Barcelona, Spain},
  month = {Jul},
  year = {2010},
  pages = {2780--2787},
  doi = {10.1109/IJCNN.2010.5596747}
}
Abstract: We present a prototype system for organization and exploration of music archives that adapts to the user’s way of structuring music collections. Initially, a growing self-organizing map is induced that clusters the music collection. The user can then change the location of songs on the map by simple drag-and-drop actions. Each movement of a song causes a change in the underlying similarity measure based on a quadratic optimization scheme. As a result, the location of other songs is modified as well. Experiments simulating user interaction with the system show that in this stepwise adaptation the similarity measure indeed converges to one that captures how the user compares songs. This ultimately leads to an individually adapted presentation that is intuitively understandable to the user and thus eases access to the database.
BibTeX:
@inproceedings{amr08stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Towards User-Adaptive Structuring and Organization of Music Collections},
  editor = {Marcin Detyniecki and Ulrich Leiner and Andreas N\"{u}rnberger},
  booktitle = {Adaptive Multimedia Retrieval. Identifying, Summarizing, and Recommending Image and Music. 6th International Workshop, AMR 2008, Berlin, Germany, June 26-27, 2008. Revised Selected Papers},
  address = {Heidelberg / Berlin},
  publisher = {Springer Verlag},
  year = {2010},
  series = {LNCS},
  volume = {5811},
  pages = {53--65},
  doi = {10.1007/978-3-642-14758-6_5}
}
2009
BibTeX:
@proceedings{lsas2009,
  title = {{Proceedings of the 3rd Workshop on Learning the Semantics of Audio Signals (LSAS)}},
  editor = {Stephan Baumann and Juan Jos\'{e} Burred and Andreas N\"{u}rnberger and Sebastian Stober},
  address = {Graz, Austria},
  month = {Dec},
  year = {2009},
  isbn = {978-3-940961-38-9},
  url = {http://lsas2009.dke-research.de/proceedings/lsas2009proceedings.pdf}
}
Abstract: In order to enrich music information retrieval applications with information about a user’s listening habits, it is possible to automatically record a large variety of information about the listening context. However, recording such information may violate the user’s privacy. This paper presents and discusses the results of a survey that has been conducted to assess the acceptance of listening context logging.
BibTeX:
@inproceedings{lsas2009stoberSteinbrecherNuernberger,
  author = {Sebastian Stober and Matthias Steinbrecher and Andreas N\"{u}rnberger},
  title = {A Survey on the Acceptance of Listening Context Logging for MIR Applications},
  editor = {Stephan Baumann and Juan Jos\'{e} Burred and Andreas N\"{u}rnberger and Sebastian Stober},
  booktitle = {Proceedings of the 3rd Workshop on Learning the Semantics of Audio Signals (LSAS)},
  address = {Graz, Austria},
  month = {Dec},
  year = {2009},
  pages = {45--57},
  url = {http://lsas2009.dke-research.de/proceedings/lsas2009stoberSteinbrecherNuernberger.pdf}
}
Abstract: In folk song research, appropriate similarity measures can be of great help, e.g. for classification of new tunes. Several measures have been developed so far. However, a particular musicological way of classifying songs is usually not directly reflected by just a single one of these measures. We show how a weighted linear combination of different basic similarity measures can be automatically adapted to a specific retrieval task by learning this metric based on a special type of constraints. Further, we describe how these constraints are derived from information provided by experts. In experiments on a folk song database, we show that the proposed approach outperforms the underlying basic similarity measures and study the effect of different levels of adaptation on the performance of the retrieval system.
BibTeX:
@inproceedings{ismir09stober,
  author = {Korinna Bade and J\"{o}rg Garbers and Sebastian Stober and Frans Wiering and Andreas N\"{u}rnberger},
  title = {Supporting Folk-Song Research by Automatic Metric Learning and Ranking},
  booktitle = {Proceedings of the 10th International Conference on Music Information Retrieval (ISMIR'09)},
  address = {Kobe, Japan},
  month = {Oct},
  year = {2009},
  pages = {741--746},
  url = {http://ismir2009.ismir.net//proceedings/OS9-3.pdf}
}
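The weighted-combination idea in the abstract above can be illustrated with a small sketch. This is a hypothetical, perceptron-style simplification for illustration only, not the metric-learning algorithm of the paper: weights of basic similarity measures are nudged whenever a relative constraint ("q should be closer to a than to b") is violated.

```python
# Hedged sketch: adapt weights of a linear combination of basic distance
# measures from relative constraints. All names are illustrative.

def combined_distance(weights, facet_distances):
    # facet_distances: per-facet distance values for one pair of songs
    return sum(w * d for w, d in zip(weights, facet_distances))

def learn_weights(constraints, n_facets, lr=0.1, epochs=50):
    """constraints: list of (d_qa, d_qb) tuples, where d_qa and d_qb are
    per-facet distance vectors for the pairs (q, a) and (q, b), and the
    constraint demands d(q, a) < d(q, b)."""
    w = [1.0 / n_facets] * n_facets
    for _ in range(epochs):
        for d_qa, d_qb in constraints:
            if combined_distance(w, d_qa) >= combined_distance(w, d_qb):
                # Shift weight toward facets that already rank a before b.
                w = [max(0.0, wi + lr * (db - da))
                     for wi, da, db in zip(w, d_qa, d_qb)]
                s = sum(w) or 1.0
                w = [wi / s for wi in w]  # keep weights normalized
    return w

# Two facets: facet 0 satisfies the constraint, facet 1 violates it.
constraints = [((0.2, 0.9), (0.8, 0.1))]
w = learn_weights(constraints, n_facets=2)
print(w)  # weight of facet 0 grows, weight of facet 1 shrinks
```

The paper itself learns the metric from expert-derived constraints via a dedicated optimization; the update rule above merely conveys the mechanism.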
Abstract: Keeping one’s personal music collections well organized can be a very tedious task. Fortunately, today, many popular music players (such as AmaroK or iTunes) have an integrated library function that can automatically rename and tag music files and sort them into subdirectories. However, their common approach to stick with some hierarchy of genre, artist name, and album title barely represents the way a user would structure his collection manually. When it comes to organizing a music collection according to a user-specific hierarchy, three things are required: First, the music files have to be described by appropriate features beyond simple meta-tags. This includes content-based analysis but also incorporation of external information sources such as the web. Second, knowledge about the user’s structuring preferences must be available. And third, and most importantly, methods for learning personalized hierarchies that can integrate this knowledge are needed. For this task, we propose a hierarchical constraint-based clustering approach that can weight the importance of different features according to the user-perceived similarity. A hierarchy based on this similarity measure reflects a user’s view on the collection.
BibTeX:
@inproceedings{daga09stober,
  author = {Korinna Bade and Andreas N\"{u}rnberger and Sebastian Stober},
  title = {Everything in its right place? Learning a user's view of a music collection},
  booktitle = {Proceedings of NAG/DAGA 2009, International Conference on Acoustics, Rotterdam},
  address = {Berlin, Germany},
  publisher = {German Acoustical Society (DEGA)},
  year = {2009},
  pages = {344--347}
}
Abstract: Automatic structuring is one means to ease access to large music collections — be it for organisation or exploration. The AUCOMA project (Adaptive User-Centered Organization of Music Archives) aims to find ways to make such a structuring intuitively understandable to a user through automatic adaptation. This article describes the motivation of the project, discusses related work in the field of music information retrieval and presents first project results.
BibTeX:
@article{ki09stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {User-Adaptive Music Information Retrieval},
  journal = {KI},
  year = {2009},
  volume = {23},
  number = {2},
  pages = {54--57},
  url = {http://www.kuenstliche-intelligenz.de/index.php?id=7778&tx_ki_pi1[showUid]=1800&cHash=5143a324cc}
}
2008
Abstract: This paper aims to motivate and demonstrate how widely available environmental data can be exploited to allow organization, structuring and exploration of music collections by personal listening contexts. We describe a logging plug-in for music players that automatically records data about the listening context and discuss possible extensions for more sophisticated context logging. Based on data collected in a small user experiment, we show how data mining techniques can be applied to reveal common usage patterns. Further, a prototype user interface based on elastic lists for browsing by listening context is presented.
BibTeX:
@inproceedings{lsas08stober,
  author = {Valentin Laube and Christian Moewes and Sebastian Stober},
  title = {Browsing Music by Usage Context},
  editor = {Juan J. Burred and Andreas N\"{u}rnberger and Geoffroy Peeters and Sebastian Stober},
  booktitle = {Proceedings of the 2nd Workshop on Learning the Semantics of Audio Signals (LSAS)},
  address = {Paris, France},
  month = {Jun},
  publisher = {IRCAM},
  year = {2008},
  pages = {19--29},
  url = {http://lsas2008.dke-research.de/proceedings/lsas2008_p19-29_LaubeMoewesStober.pdf}
}
BibTeX:
@proceedings{lsas2008,
  title = {{Proceedings of the 2nd Workshop on Learning the Semantics of Audio Signals (LSAS)}},
  editor = {Juan J. Burred and Andreas N\"{u}rnberger and Geoffroy Peeters and Sebastian Stober},
  address = {Paris, France},
  month = {Jun},
  publisher = {IRCAM},
  year = {2008},
  isbn = {978-3-9804874-7-4},
  url = {http://lsas2008.dke-research.de/proceedings/lsas2008_proceedings.pdf}
}
Abstract: The chord progression of a song is an important high-level feature which enables indexing as well as deeper analysis of musical recordings. Different approaches to chord recognition have been suggested in the past. Though their performance increased, still significant error rates seem to be unavoidable. One way to improve accuracy is to try to correct possible misclassifications. In this paper, we propose a post-processing method based on considerations of musical harmony, assuming that the pool of chords used in a song is limited and that strong oscillations of chords are uncommon. We show that exploiting (uncertain) knowledge about the chord-distribution in a chord’s neighbourhood can significantly improve chord detection accuracy by evaluating our proposed post-processing method for three baseline classifiers on two early Beatles albums.
BibTeX:
@inproceedings{cbmi08stober,
  author = {Johannes Reinhard and Sebastian Stober and Andreas N\"{u}rnberger},
  title = {Enhancing Chord Classification through Neighbourhood Histograms},
  booktitle = {Proceedings of the 6th International Workshop on Content-Based Multimedia Indexing (CBMI 2008)},
  address = {London, UK},
  year = {2008},
  pages = {33--40},
  url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4564924},
  doi = {10.1109/CBMI.2008.4564924}
}
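The post-processing idea described above can be sketched as follows. This is a deliberately simplified stand-in for the paper's neighbourhood histograms, under the same two assumptions the abstract states: the pool of chords used in a song is limited, and rapid chord oscillations are uncommon. Each frame's label is replaced by the most frequent label in its neighbourhood, restricted to a small chord pool estimated from the whole song.

```python
from collections import Counter

# Hedged sketch, not the paper's exact method: histogram-based smoothing
# of a frame-wise chord label sequence from a baseline classifier.

def smooth_chords(labels, radius=2, pool_size=2):
    # Restrict corrections to the most frequent chords of the song
    # (the assumed limited chord pool).
    pool = {c for c, _ in Counter(labels).most_common(pool_size)}
    out = []
    for i in range(len(labels)):
        window = labels[max(0, i - radius): i + radius + 1]
        hist = Counter(c for c in window if c in pool)
        out.append(hist.most_common(1)[0][0] if hist else labels[i])
    return out

# Toy output of a baseline classifier with two isolated misclassifications.
frames = ["C", "C", "F#", "C", "C", "G", "G", "G", "C#", "G"]
print(smooth_chords(frames))
# → ['C', 'C', 'C', 'C', 'C', 'G', 'G', 'G', 'G', 'G']
```

The spurious F# and C# frames are corrected because they fall outside the estimated chord pool and are outvoted by their neighbourhoods.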
Abstract: Automatic structuring can considerably ease access to music archives, in particular their exploration and organization. Even more helpful would be a presentation that adapts to the way the user structures music collections and is thus intuitively comprehensible to them. We present a prototype system that learns a personalized similarity measure from the user’s interaction with a music collection. First, a growing self-organizing map (SOM) is trained that groups similar music pieces independently of the user. The user can then change the position of music pieces on the map through simple drag-and-drop actions. Each movement triggers an automatic adaptation of the similarity measure underlying the map, whereby other pieces may change their position as well.
BibTeX:
@inproceedings{daga08stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {{AUCOMA - Adaptive Nutzerzentrierte Organisation von Musikarchiven}},
  editor = {Ute Jekosch and R\"{u}diger Hoffmann},
  booktitle = {Fortschritte der Akustik: Plenarvortr\"{a}ge und Fachbeitr\"{a}ge der 34. Deutschen Jahrestagung f\"{u}r Akustik DAGA 2008, Dresden},
  address = {Berlin, Germany},
  month = {Mar},
  publisher = {German Acoustical Society (DEGA)},
  year = {2008},
  pages = {547--548},
  note = {in German}
}
Abstract: Recent approaches in Automatic Image Annotation (AIA) try to combine the expressiveness of natural language queries with approaches to minimize the manual effort for image annotation. The main idea is to infer the annotations of unseen images using a small set of manually annotated training examples. However, typically these approaches suffer from low correlation between the globally assigned annotations and the local features used to obtain annotations automatically. In this paper we propose a framework to support image annotations based on a visual dictionary that is created automatically using a set of locally annotated training images. We designed a segmentation and annotation interface to allow for easy annotation of the training data. In order to provide a framework that is easily extendable and reusable we make broad use of the MPEG-7 standard.
BibTeX:
@inproceedings{amr07hentschel,
  author = {Christian Hentschel and Sebastian Stober and Andreas N\"{u}rnberger and Marcin Detyniecki},
  title = {Automatic Image Annotation Using a Visual Dictionary Based on Reliable Image Segmentation},
  editor = {Marcin Detyniecki and Andreas N\"{u}rnberger},
  booktitle = {Adaptive Multimedial Retrieval: Retrieval, User, and Semantics. 5th International Workshop, AMR 2007, Paris, France, July 5-6, 2007, Revised Selected Papers},
  address = {Heidelberg / Berlin},
  publisher = {Springer Verlag},
  year = {2008},
  series = {LNCS},
  volume = {4918},
  pages = {45--56},
  doi = {10.1007/978-3-540-79860-6_4}
}
Abstract: Automatic structuring is one means to ease access to document collections, be it for organization or for exploration. Of even greater help would be a presentation that adapts to the user’s way of structuring and thus is intuitively understandable. We extend an existing user-adaptive prototype system that is based on a growing self-organizing map and that learns a feature weighting scheme from a user’s interaction with the system, resulting in a personalized similarity measure. The proposed approach for adapting the feature weights targets certain problems of previously used heuristics. The revised adaptation method is based on quadratic optimization, and thus we are able to pose certain constraints on the derived weighting scheme. Moreover, it is thus guaranteed that an optimal weighting scheme is found if one exists. The proposed approach is evaluated by simulating user interaction with the system on two text datasets: an artificial dataset that is used to analyze the performance for different user types and a real-world dataset – a subset of the banksearch dataset – containing additional class information.
BibTeX:
@inproceedings{amr07stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {User Modelling for Interactive User-Adaptive Collection Structuring},
  editor = {Marcin Detyniecki and Andreas N\"{u}rnberger},
  booktitle = {Adaptive Multimedial Retrieval: Retrieval, User, and Semantics. 5th International Workshop, AMR 2007, Paris, France, July 5-6, 2007, Revised Selected Papers},
  address = {Heidelberg / Berlin},
  publisher = {Springer Verlag},
  year = {2008},
  series = {LNCS},
  volume = {4918},
  pages = {95--108},
  doi = {10.1007/978-3-540-79860-6_8}
}
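The quadratic-optimization step mentioned in the abstract can be written down schematically. The notation below is an assumption for illustration, not taken from the paper: $w$ are the feature weights, $w^{(0)}$ the previous weights, $d_k$ the basic feature distances, and each user action yields a constraint that a moved document $x$ must end up closer to its new cluster representative $c_{\mathrm{new}}$ than to its old one $c_{\mathrm{old}}$, by some margin $\varepsilon > 0$:

```latex
\min_{w}\; \lVert w - w^{(0)} \rVert^{2}
\quad \text{s.t.} \quad
\sum_{k} w_{k}\, d_{k}(x, c_{\mathrm{new}})
\;\le\;
\sum_{k} w_{k}\, d_{k}(x, c_{\mathrm{old}}) - \varepsilon,
\qquad w_{k} \ge 0 .
```

Because the objective is quadratic and the constraints are linear, a feasible problem has a unique optimum, which matches the abstract's guarantee that an optimal weighting scheme is found if one exists.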
2007
Abstract: Current work on Query-by-Singing/Humming (QBSH) focuses mainly on databases that contain MIDI files. Here, we present an approach that works on real audio recordings, which bring up additional challenges. To tackle the problem of extracting the melody of the lead vocals from recordings, we introduce a method inspired by the popular “karaoke effect”, exploiting information about the spatial arrangement of voices and instruments in the stereo mix. The extracted signal time series are aggregated into symbolic strings preserving the local approximated values of a feature and revealing higher-level context patterns. This allows distance measures for string pattern matching to be applied in the matching process. A series of experiments is conducted to assess the discrimination and robustness of this representation. They show that the proposed approach provides a viable baseline for further development and point out several possibilities for improvement.
BibTeX:
@inproceedings{ismir07qbsh,
  author = {Alexander Duda and Andreas N\"{u}rnberger and Sebastian Stober},
  title = {Towards Query by Singing/Humming on Audio Databases},
  editor = {Simon Dixon and David Bainbridge and Rainer Typke},
  booktitle = {Proceedings of the 8th International Conference on Music Information Retrieval, ISMIR 2007},
  address = {Vienna, Austria},
  month = {Sep},
  publisher = {\"{O}CG},
  year = {2007},
  pages = {331--334},
  url = {http://ismir2007.ismir.net/proceedings/ISMIR2007_p331_duda.pdf}
}
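The "karaoke effect" mentioned in the abstract rests on mid/side decomposition of a stereo signal. The toy example below is an assumption-laden sketch (plain sample lists, no real audio processing), not the paper's extraction method: a centre-panned source appears identically in both channels, so it cancels in the side signal (L - R)/2, while off-centre sources survive.

```python
# Hedged sketch of centre-channel cancellation in a stereo mix.

def mid(left, right):
    # Mid signal: emphasises centre-panned sources such as lead vocals.
    return [(l + r) / 2 for l, r in zip(left, right)]

def side(left, right):
    # Side signal: centre-panned sources cancel out here.
    return [(l - r) / 2 for l, r in zip(left, right)]

# Toy mix: "vocals" panned centre, "accompaniment" panned hard left.
vocals = [0.5, -0.5, 0.25, -0.25]
accomp = [0.1, 0.2, -0.1, -0.2]
left = [v + a for v, a in zip(vocals, accomp)]
right = vocals[:]  # accompaniment absent from the right channel

print(side(left, right))  # ≈ accompaniment / 2: the centred vocals cancel
print(mid(left, right))   # vocals plus half the accompaniment
```

Real mixes are far messier (reverb, non-hard panning), which is one reason the paper treats this only as a starting point for melody extraction.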
Abstract: Most of the currently existing image retrieval systems make use of either low-level features or semantic (textual) annotations. A combined usage during annotation and retrieval is rarely attempted. In this paper, we propose a standardized annotation framework that integrates semantic and feature based information about the content of images. The presented approach is based on the MPEG-7 standard with some minor extensions. The proposed annotation system SAFIRE (Semantic Annotation Framework for Image REtrieval) enables the combined use of low-level features and annotations that can be assigned to arbitrary hierarchically organized image segments. Besides the framework itself, we discuss query formalisms required for this unified retrieval approach.
BibTeX:
@inproceedings{amr06safire,
  author = {Christian Hentschel and Andreas N\"{u}rnberger and Ingo Schmitt and Sebastian Stober},
  title = {{SAFIRE: Towards Standardized Semantic Rich Image Annotation}},
  editor = {Marcin Detyniecki and Andreas N\"{u}rnberger and Eric Bruno and Stephane Marchand-Maillet},
  booktitle = {Adaptive Multimedia Retrieval: User, Context, and Feedback. 4th International Workshop, AMR 2006, Geneva, Switzerland, July, 27-28, 2006, Revised Selected Papers},
  address = {Berlin / Heidelberg},
  publisher = {Springer Verlag},
  year = {2007},
  series = {LNCS},
  volume = {4398},
  pages = {12--27},
  doi = {10.1007/978-3-540-71545-0_2}
}
2006
BibTeX:
@proceedings{lsas2006,
  title = {{Proceedings of the 1st Workshop on Learning the Semantics of Audio Signals (LSAS)}},
  editor = {Pedro Cano and Andreas N\"{u}rnberger and Sebastian Stober and George Tzanetakis},
  address = {Athens, Greece},
  month = {Dec},
  year = {2006},
  url = {http://irgroup.cs.uni-magdeburg.de/lsas2006/proceedings/LSAS06_Full.pdf}
}
Abstract: We have developed the system DAWN (direction anticipation in web navigation), which learns navigational patterns to help users navigate through the World Wide Web. In this paper, we present the prediction model and the algorithm for link recommendation of this system. Besides this main focus, we briefly outline the system architecture and further motivate the purpose of such a system and the approach taken. A first evaluation on real-world data gave promising results.
BibTeX:
@inproceedings{kes2006stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {{DAWN -- A System for Context-Based Link Recommendation in Web Navigation}},
  editor = {Bogdan Gabrys and Robert J. Howlett and Lakhmi C. Jain},
  booktitle = {Knowledge-Based Intelligent Information and Engineering Systems},
  address = {Berlin / Heidelberg},
  month = {Oct},
  publisher = {Springer Verlag},
  year = {2006},
  series = {LNAI},
  volume = {4251},
  pages = {763--770},
  isbn = {3-540-46535-9}
}
Abstract: In this paper, we present the system DAWN (direction anticipation in web navigation), which helps users to navigate through the World Wide Web. First, the purpose of such a system and the approach taken are motivated. We then point out relations to other approaches, describe the system and outline the underlying prediction model. Evaluation on real-world data gave promising results.
BibTeX:
@inproceedings{ah2006stober,
  author = {Sebastian Stober and Andreas N\"{u}rnberger},
  title = {{Context-Based Navigational Support in Hypermedia}},
  editor = {Barry Smyth and Helen Ashman and Vincent Wade},
  booktitle = {Proceedings of the International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems (AH 2006)},
  address = {Berlin / Heidelberg},
  month = {Jun},
  publisher = {Springer Verlag},
  year = {2006},
  series = {LNCS},
  volume = {4018},
  pages = {328--332},
  doi = {10.1007/11768012_43}
}
Abstract: Searching the Web and other local resources has become an everyday task for almost everybody. However, the currently available tools for searching still provide only very limited support with respect to categorization and visualization of search results as well as personalization. In this paper, we present a system for searching that can be used by an end user and also by researchers in order to develop and evaluate a variety of methods to support a user in searching. The CARSA system provides a very flexible architecture based on web services and XML. This includes the use of different search engines, categorization methods, visualization techniques, and user interfaces. The user has complete control over the features used. This system therefore provides a platform for evaluating the usefulness of different retrieval support methods and their combination.
BibTeX:
@inproceedings{amr05carsa,
  author = {Korinna Bade and Ernesto William {De Luca} and Andreas N\"{u}rnberger and Sebastian Stober},
  title = {{CARSA - An Architecture for the Development of Context Adaptive Retrieval Systems}},
  editor = {Keith van Rijsbergen and Andreas N\"{u}rnberger and Joemon M. Jose and Marcin Detyniecki},
  booktitle = {Adaptive Multimedia Retrieval: User, Context, and Feedback. 3rd International Workshop, AMR 2005, Glasgow, UK, July 28-29, 2005, Revised Selected Papers},
  address = {Berlin / Heidelberg},
  month = {Feb},
  publisher = {Springer Verlag},
  year = {2006},
  series = {LNCS},
  volume = {3877},
  pages = {91--101},
  doi = {10.1007/11670834_8}
}
2005
Abstract: In the constantly growing and changing sea of data that is the World Wide Web, users navigating from web page to web page are largely left to their own devices. This thesis presents an approach for predicting whether a user is likely to follow a given link. These predictions make it possible to highlight certain links and thereby support the user during navigation.
The implemented method can be deployed on the client side and is therefore not restricted in its application to particular areas of the World Wide Web. For prediction, a higher-order Markov model is learned from recorded browsing paths. A browsing path is decomposed into a sequence of contexts, where each context is represented as a document vector with TF/IDF weights and corresponds, for example, to the text of a web page or of a paragraph. The set of contexts is clustered, which abstracts browsing paths into navigation patterns and reduces the size of the model learned from them. To learn the model, an algorithm developed by Borges and Levene for server-side use was extended and transferred to the client side. Links are finally predicted by a specially developed method that, for a given browsing path, allows several similar navigation patterns in the model to be considered simultaneously. The whole method is parameterized. The influence of the various parameters and the quality of the predictions could, however, only be examined on a small data collection, so that only a basic impression of how the system works can be conveyed.
The system is embedded in a framework for client-side recording and analysis of user actions during browsing, which was also developed in the context of this thesis. This framework is a standalone and extensible system that can also be used for other work and easily extended according to the respective requirements.
BibTeX:
@mastersthesis{stober05diploma,
  author = {Sebastian Stober},
  title = {Kontextbasierte Navigationsunterst\"{u}tzung mit Markov-Modellen},
  type = {Diploma Thesis},
  address = {Magdeburg, Germany},
  month = {Dec},
  school = {Otto-von-Guericke-University},
  year = {2005},
  note = {in German}
}
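The prediction model described in the thesis abstract can be sketched in miniature. This is a strongly simplified, hypothetical version (a plain second-order Markov model over symbolic contexts, without the context clustering or the extended Borges-Levene learning algorithm the thesis actually uses):

```python
from collections import Counter, defaultdict

# Hedged sketch: learn transition counts over fixed-length context histories
# from recorded browsing paths, then rank the most likely next context.

def train(paths, order=2):
    model = defaultdict(Counter)
    for path in paths:
        for i in range(len(path) - order):
            state = tuple(path[i:i + order])
            model[state][path[i + order]] += 1
    return model

def predict(model, recent, order=2):
    # Look up the most recent contexts; return the most frequent successor.
    state = tuple(recent[-order:])
    counts = model.get(state)
    return counts.most_common(1)[0][0] if counts else None

# Toy browsing paths over abstracted page contexts.
paths = [["news", "sports", "football"],
         ["news", "sports", "football"],
         ["news", "sports", "tennis"]]
model = train(paths)
print(predict(model, ["news", "sports"]))  # → football
```

In the thesis, the states are clustered TF/IDF context vectors rather than symbols, and a path may match several similar navigation patterns at once; this sketch only shows the Markov backbone.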
Abstract: This report refers to work completed during my internship with the Mechatronics Research Group at the department of Mechanical and Manufacturing Engineering at the University of Melbourne, Australia from September 5th, 2003 until March 5th, 2004.
Recognition of three-dimensional objects in two-dimensional images is a key area of research in computer vision. One approach is to save multiple 2D views instead of a 3D object representation, thus reducing the problem to a 2D-to-2D matching problem. The Mechatronics Research Group is developing a novel system that focuses on artificial objects and further reduces the 2D views to symbolic descriptions. These descriptions are based on shape primitives: ellipses, rectangles and isosceles triangles. Evidence in support of a hypothesis for a certain object classification is collected through an active vision approach.
This work deals with the design and implementation of a data structure that is capable of holding such a symbolic representation, together with an algorithm for comparison and matching. The chosen symbolic representation of an object view is rotation-, scaling- and translation-invariant. For the comparison and matching of two object views, a branch & bound algorithm based on problem-specific heuristics is used. Furthermore, a GA-based generalization operator is proposed to reduce the number of object views in the system database.
Experiments show that the query performance scales linearly with the size of the database. For a database containing 10000 entries, a response time of less than a second is expected on an average system.
BibTeX:
@mastersthesis{stober05internship,
  author = {Sebastian Stober},
  title = {Design and Implementation of an Algorithm and Data Structure for Matching of Geometric Primitives in Visual Object Classification},
  type = {Internship Report},
  address = {Magdeburg, Germany},
  month = {Apr},
  school = {Otto-von-Guericke-University},
  year = {2005}
}