Document Mining with Overview: A Digital Tools Tutorial

About This Course

Overview is a free tool for journalists that automatically organizes a large set of documents by topic and displays them in an interactive visualization for exploration, tagging, and reporting. Journalists have already used it to report on FOIA document dumps, emails, leaks, archives, and social media data. In fact, it works on any set of documents that is mostly text. It integrates with DocumentCloud and can import your projects, or you can upload data directly in CSV form.

You can't read 10,000 pages on deadline, but Overview can help you rapidly figure out which pages are the important ones — even if you're not sure what you're looking for.

See more tools at Poynter's home for digital tools: Try This! — Tools for Journalism.

What Will I Learn?

  • What types of documents and reporting problems Overview can and cannot help with
  • How to load your documents into Overview
  • How to use Overview's visualization tools to understand the contents of your document set
  • How to use Overview's tagging and annotation features to keep track of what you discover
  • How to use Overview to find patterns across many documents

Who Should Take This Course?

Journalists, bloggers, editors, managers, producers and anyone who wants to sort quickly through hundreds, or hundreds of thousands, of documents.

Course Instructor

Jonathan Stray

Jonathan Stray leads the Overview Project for the Associated Press and teaches computational journalism at Columbia University. Overview is a Knight News Challenge-funded visualization system that helps investigative journalists make sense of very large document sets. He was formerly an interactive editor at the Associated Press, a freelance reporter in Hong Kong, and a senior computer scientist at Adobe Systems. He has contributed stories to The New York Times, Foreign Policy, Wired and China Daily. He has an MS in computer science from the University of Toronto and an MA in journalism from the University of Hong Kong.


The Knight Foundation

The John S. and James L. Knight Foundation advances journalism in the digital age and invests in the vitality of communities where the Knight brothers owned newspapers. Knight Foundation focuses on projects that promote community engagement and lead to transformational change.

Training Partner

American Press Institute

The American Press Institute conducts research and training, convenes thought leaders and creates tools to help chart a path for journalism in the 21st century. The organization was founded in 1946 with the mission to help the news industry fulfill the purpose of the First Amendment, sustaining a free press in the public interest. That mission continues today.

This $30 course is free thanks to the generous support of the John S. and James L. Knight Foundation.

Frequently Asked Questions

What web browser should I use?

The Open edX platform works best with current versions of Chrome, Firefox or Safari, or with Internet Explorer version 9 and above.

See our list of supported browsers for the most up-to-date information.