In our first attempt in this area, we explored the use of machine-learning techniques for identifying stories in segments of conversational speech, using the words recognized with commercial speech-recognition software [Gordon, 05b]. We followed a traditional text classification approach, in which a corpus of transcribed conversational speech was first hand-annotated (story / non-story) for use as training and testing data. By developing a clear definition of what counted as a story, our annotators were able to achieve reasonably high inter-rater agreement. Segments of training data were then encoded as high-dimensional feature vectors (word-level unigram and bigram frequency counts) and used to train a naïve Bayes binary classifier. To apply this classifier to test data, overlapping consecutive segments of test data were individually assigned to either the story or non-story class, with confidence values smoothed across segments using a simple mean-average smoothing function. Performance evaluations of our approach yielded low precision (39.6%) and low recall (25.3%), equal to random-chance performance on this task. However, we observed substantially higher performance when using transcribed test data (as opposed to the output of a speech recognition system), with reasonable precision (53.0%) and recall (62.9%) scores. We concluded that significant advances in open-domain continuous speech recognition would be required in order to construct a usable story-capture system for casual conversations.
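The pipeline described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not our actual system: the feature encoding (unigram and bigram counts), the naïve Bayes classifier, and the mean-average smoothing function follow the description in the text, but all names, the add-one smoothing of feature probabilities, and the window size are our own illustrative choices.

```python
import math
from collections import Counter

def featurize(tokens):
    # Word-level unigram and bigram frequency counts, as in the
    # feature encoding described above.
    feats = Counter(tokens)
    feats.update(zip(tokens, tokens[1:]))
    return feats

class StoryNaiveBayes:
    # Binary naive Bayes classifier (story vs. non-story) with
    # add-one (Laplace) smoothing -- an illustrative choice.
    def __init__(self):
        self.feat_counts = {"story": Counter(), "non-story": Counter()}
        self.doc_counts = Counter()

    def train(self, labeled_segments):
        for tokens, label in labeled_segments:
            self.feat_counts[label].update(featurize(tokens))
            self.doc_counts[label] += 1

    def confidence(self, tokens):
        # Log-odds that a segment belongs to the "story" class;
        # positive values favor "story".
        vocab = set(self.feat_counts["story"]) | set(self.feat_counts["non-story"])
        total_docs = sum(self.doc_counts.values())
        log_odds = 0.0
        for label, sign in (("story", 1.0), ("non-story", -1.0)):
            total = sum(self.feat_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)
            for feat, n in featurize(tokens).items():
                p = (self.feat_counts[label][feat] + 1) / (total + len(vocab))
                score += n * math.log(p)
            log_odds += sign * score
        return log_odds

def mean_smooth(scores, window=2):
    # Simple mean-average smoothing of confidence values across
    # overlapping consecutive segments.
    out = []
    for i in range(len(scores)):
        span = scores[max(0, i - window):i + window + 1]
        out.append(sum(span) / len(span))
    return out
```

In this sketch, a segment's final label would come from thresholding the smoothed log-odds at zero; the `window` parameter controls how many neighboring segments contribute to each smoothed value.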
Given the low performance of story capture from speech data, we decided to shift our focus to written electronic discourse, specifically weblogs. By randomly sampling weblog entries on the Internet, we found that between 14% and 17% of the text in weblog entries consisted of stories, i.e. narrative descriptions and interpretations of the author's past experiences [Gordon, 07]. To apply our existing story extraction technology to Internet weblog entries, we created a new hand-annotated (story / non-story) corpus of weblog entries for use as training and test data, and achieved reasonable performance levels (precision = 30.2%, recall = 80.9%, F-score = 0.414). By incorporating techniques for automatically detecting sentence boundaries in the test data, utilizing a contemporary support-vector machine learning algorithm, and using a Gaussian function to smooth the confidence values, we were able to significantly improve the overall performance of this approach (precision = 46.4%, recall = 60.6%, F-score = 0.509). Although these performance levels leave some room for improvement, we believe they are high enough to explore the integration of automated story capture technologies in productivity software applications used by organizations (including email, online forums, and general-purpose word processing).
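The Gaussian smoothing step mentioned above can be sketched as follows. This is an assumption-laden illustration, not our exact implementation: the kernel width `sigma`, the truncation radius, and the per-position renormalization at the sequence boundaries are all illustrative choices not specified in the text.

```python
import math

def gaussian_smooth(scores, sigma=2.0, radius=None):
    # Smooth per-sentence classifier confidence values by convolving
    # them with a Gaussian kernel. sigma and radius are illustrative
    # parameters, not values from the reported experiments.
    if radius is None:
        radius = int(3 * sigma)  # common truncation heuristic
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma))
              for k in range(-radius, radius + 1)]
    smoothed = []
    for i in range(len(scores)):
        num = den = 0.0
        for k in range(-radius, radius + 1):
            j = i + k
            if 0 <= j < len(scores):
                w = kernel[k + radius]
                num += w * scores[j]
                den += w  # renormalize near the sequence boundaries
        smoothed.append(num / den)
    return smoothed
```

Compared with the mean-average smoothing used in the speech experiments, the Gaussian kernel weights nearby sentences more heavily than distant ones, which preserves the rough location of story boundaries while still suppressing isolated misclassifications.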
Technologies for Story Retrieval
The second key challenge in automating the management of organizational stories is connecting people who have specific learning needs with the stories that are most relevant to these needs. This problem does not arise when a modest number of stories are collected through directed interviews as part of the development of specific training applications, where the learning objectives are known ahead of time. However, as the size of organizational story collections becomes very large, and where these stories may have relevance to unanticipated learning needs, it is critical to have some mechanism for automatically pairing available stories with new learning objectives. Previous work on this problem has focused on story indexing (e.g.