BI and the “Unstructured Data” Challenge
Information Extraction
An e-mail message is “semi-structured.”
Semi = half. What’s “structured” and what’s not? Are augmentation/tagging and entity extraction enough?
What categorization might you create from that example message?
If we extracted all the entities to a database, what could you do with them?
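As one possible answer to the question above, here is a minimal sketch of loading extracted entities into a database and querying them. Everything in it is an assumption made for illustration: the sample messages are invented, the `entity` table layout is hypothetical, and two naive regular expressions stand in for a real entity extractor.

```python
import re
import sqlite3

# Hypothetical message bodies, invented for illustration.
MESSAGES = {
    1: "Contact carol@example.com about the $1,200 invoice.",
    2: "carol@example.com approved $450 for travel.",
}

# Naive patterns standing in for a real entity extractor (an assumption).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "money": re.compile(r"\$\d[\d,]*(?:\.\d{2})?"),
}


def load_entities(conn, messages):
    """Store one row per entity found: (message id, entity kind, text)."""
    conn.execute(
        "CREATE TABLE entity (message_id INTEGER, kind TEXT, value TEXT)"
    )
    for message_id, text in messages.items():
        for kind, pattern in PATTERNS.items():
            for value in pattern.findall(text):
                conn.execute(
                    "INSERT INTO entity VALUES (?, ?, ?)",
                    (message_id, kind, value),
                )


conn = sqlite3.connect(":memory:")
load_entities(conn, MESSAGES)

# Example BI-style question: in how many messages is each address mentioned?
mentions = conn.execute(
    "SELECT value, COUNT(DISTINCT message_id) FROM entity "
    "WHERE kind = 'email' GROUP BY value"
).fetchall()
```

Once entities sit in rows like these, ordinary SQL can count them, group them, and join them to other tables, which is what makes them usable alongside conventional BI data.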
From semi-structured text, it’s especially easy to extract metadata.
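To illustrate why metadata is the easy part, here is a minimal sketch that reads the header fields of an e-mail message with Python's standard `email` package. The sample message and the `extract_metadata` helper are hypothetical, made up for this example. The headers come out as clean fields, while the body remains free text.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Hypothetical sample message, invented for illustration.
RAW_MESSAGE = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Subject: Q3 sales figures
Date: Mon, 02 Jun 2008 09:30:00 -0400

Hi Bob, revenue in the Northeast region rose last quarter.
"""


def extract_metadata(raw):
    """Return the structured header fields plus the unstructured body."""
    msg = message_from_string(raw)
    sender_name, sender_address = parseaddr(msg["From"])
    return {
        "sender_name": sender_name,
        "sender_address": sender_address,
        "recipient_address": parseaddr(msg["To"])[1],
        "subject": msg["Subject"],
        "sent": parsedate_to_datetime(msg["Date"]),
        # The body is the unstructured half: it is returned as plain text.
        "body": msg.get_payload().strip(),
    }


metadata = extract_metadata(RAW_MESSAGE)
```

Sender, recipient, subject, and date need no linguistic analysis at all. Getting anything comparable out of the body would need the tagging and entity extraction discussed earlier.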
There are many forms of semi-structured information...
©Alta Plana Corporation, 2008