There is a lot of information on the web, and a lot of talk about using this so-called "Big Data" to improve journalism.

That is exactly what we are trying to do with Points Mentioned. It's a tool for newspapers to automatically create interactive, explanatory content from their own big data.

High-level vision

To do this, we start by structuring the news around the named entities that appear in the stories: the "who, what, where, and when" are tagged and indexed into tables.

Technicalities

We create the big data from our clients' news stories by using natural language processing to identify proper names (also known as named entities) and storing information about them in a database. To be specific, we use three primary tables:

1.  a named entity table whose key is the named entity;

2.  a story table whose key is the URL of the story; and

3.  a table whose key is the combination of both named entity and URL, capturing two pieces of information.  First, all the stories a given entity appears in.  Second, all the named entities in a given story.
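
To make the three-table layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration only, not our production schema: the table names (entity, story, mention), the columns (kind, headline), and the helper functions are all assumptions made up for this example.

```python
import sqlite3


def create_schema(conn):
    """Create the three tables described above (all names are hypothetical)."""
    conn.executescript("""
        CREATE TABLE entity (
            name TEXT PRIMARY KEY,   -- key: the named entity itself
            kind TEXT                -- e.g. who / what / where / when
        );
        CREATE TABLE story (
            url      TEXT PRIMARY KEY,  -- key: the URL of the story
            headline TEXT
        );
        CREATE TABLE mention (
            entity_name TEXT NOT NULL REFERENCES entity(name),
            story_url   TEXT NOT NULL REFERENCES story(url),
            PRIMARY KEY (entity_name, story_url)  -- key: entity + URL
        );
        -- The primary key already serves lookups by entity;
        -- this index serves lookups by story.
        CREATE INDEX mention_by_story ON mention(story_url);
    """)


def index_story(conn, url, headline, entities):
    """Store one story and its (name, kind) entity pairs."""
    conn.execute(
        "INSERT OR IGNORE INTO story (url, headline) VALUES (?, ?)",
        (url, headline),
    )
    for name, kind in entities:
        conn.execute(
            "INSERT OR IGNORE INTO entity (name, kind) VALUES (?, ?)",
            (name, kind),
        )
        conn.execute(
            "INSERT OR IGNORE INTO mention (entity_name, story_url) VALUES (?, ?)",
            (name, url),
        )
    conn.commit()


def stories_for_entity(conn, name):
    """All the stories a given entity appears in."""
    rows = conn.execute(
        "SELECT story_url FROM mention WHERE entity_name = ? ORDER BY story_url",
        (name,),
    )
    return [row[0] for row in rows]


def entities_in_story(conn, url):
    """All the named entities in a given story."""
    rows = conn.execute(
        "SELECT entity_name FROM mention WHERE story_url = ? ORDER BY entity_name",
        (url,),
    )
    return [row[0] for row in rows]
```

In this sketch, the third table answers both questions from item 3: filtering it by entity lists the stories, and filtering it by URL lists the entities.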

By structuring the data in these tables, we can create fast, unique products that use data mining to aggregate relevant information and enrich other stories that mention these entities.  This provides background on the story's characters and highlights their connections to the community.
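
As a rough illustration of that kind of aggregation, the sketch below takes a list of (entity, URL) pairs and, for each entity in a story, collects the other stories that mention it and the entities it appears alongside there. The function names and the shape of the result are assumptions made for this example, and the in-memory dictionaries stand in for database tables; it is not a description of our actual product code.

```python
from collections import defaultdict


def build_indexes(mentions):
    """Build both lookup directions from (entity, url) pairs."""
    stories_by_entity = defaultdict(set)
    entities_by_story = defaultdict(set)
    for entity, url in mentions:
        stories_by_entity[entity].add(url)
        entities_by_story[url].add(entity)
    return stories_by_entity, entities_by_story


def background_for_story(url, stories_by_entity, entities_by_story):
    """For each entity in a story, gather related stories and entities."""
    background = {}
    for entity in sorted(entities_by_story.get(url, ())):
        # Other stories that mention the same entity.
        other_stories = sorted(stories_by_entity[entity] - {url})
        # Entities that co-occur with it in those other stories.
        connected = set()
        for other in other_stories:
            connected |= entities_by_story[other]
        connected.discard(entity)
        background[entity] = {
            "other_stories": other_stories,
            "connected_entities": sorted(connected),
        }
    return background


# Hypothetical sample data: two stories sharing one entity.
mentions = [
    ("Jane Doe", "https://example.com/council-vote"),
    ("City Hall", "https://example.com/council-vote"),
    ("Jane Doe", "https://example.com/school-budget"),
    ("School Board", "https://example.com/school-budget"),
]
by_entity, by_story = build_indexes(mentions)
background = background_for_story(
    "https://example.com/council-vote", by_entity, by_story
)
```

Here the entry for "Jane Doe" points to the school-budget story and to "School Board", which is the sort of connection a widget could surface next to the council-vote story.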

So, why join Points Mentioned?

Our publisher clients benefit from increased reader engagement and from more targeted advertising opportunities within our products.  Our next release of explanatory widgets will bring Points Mentioned to the next level by expanding the scope beyond location to include all named entities in the story.  Readers will be able to tap into a network of relationships between the individuals, organizations, locations, and events in your community. As they engage with content that interests them, you'll be able to offer advertisers a more targeted advertising platform informed by user data.