Tech and Tools

All the gizmos you're thinking of (and more)

[Image: Saws and chisels hang on a wall cast in blue light]

By now, you've probably recognized that DUG isn't a guide for people who want to be data journalists. There are plenty of those guides out there, and a lot of them are very good. Instead, this guide is for organizations interested in building out data journalism capacity. For these organizations, though, it's still a good idea to know the kinds of skills and tools you'll come across while recruiting talent -- and the kinds of things in which you might want folks to be proficient. (Professional development!) To that end, this section offers a brief overview of the tools that data journalists most commonly turn to.

Most of the software cited below is freely available, and most of what isn't can be obtained for free with a professional society membership. (You can get Tableau Desktop with an IRE membership, for example.)

In the third column, we've also highlighted our favorite tutorial for each piece of software. Inevitably, there are many others!

🤠 Data wrangling and analysis

| Name | Description | Tutorial |
| --- | --- | --- |
| Python / Jupyter | General-purpose programming language and interactive browser-based coding notebook | First Python Notebook |
| R / RStudio | Statistical programming language and desktop coding environment | R for Journalists |
| Google Sheets | It's Google Sheets. It's useful! | GIJN |
| Google Colab | Like Jupyter, but with free cloud GPUs | Welcome to Colaboratory |
| Beautiful Soup | Web-scraping library for Python | Datacamp |
| Selenium | Web-browser automation tool | Art of Testing |
| Workbench | No-code scraping and analysis platform | Intro to Data Journalism |
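If you've never seen what "scraping and analysis" looks like in practice, here is a minimal sketch of the kind of task these tools handle: pull a table out of a web page, then summarize it. It deliberately uses only Python's standard library (`html.parser` and `statistics`) rather than Beautiful Soup, so it runs anywhere; a library like Beautiful Soup would make the parsing step shorter. The HTML fragment, the county names, and the `TableParser` and `scrape_table` names are all made up for illustration.

```python
from html.parser import HTMLParser
from statistics import mean

# Hypothetical page fragment. A real project would download this from a website.
SAMPLE_HTML = """
<table>
  <tr><th>County</th><th>Inspections</th></tr>
  <tr><td>Adams</td><td>12</td></tr>
  <tr><td>Baker</td><td>30</td></tr>
  <tr><td>Clark</td><td>18</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(SAMPLE_HTML)
counts = [int(r["Inspections"]) for r in records]
busiest = max(records, key=lambda r: int(r["Inspections"]))
print(f"{len(records)} counties, mean {mean(counts):.1f}, most: {busiest['County']}")
```

The point for a hiring manager isn't the syntax. It's that "scraping" and "analysis" are usually two halves of the same short script, which is why so many data journalists learn them together.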

📊 Data visualization

| Name | Description | Tutorial |
| --- | --- | --- |
| ggplot2 | R plotting library | ggplot2 book |
| matplotlib | Python plotting library | Real Python |
| D3.js | JavaScript library often leveraged for interactive charts | Dashing D3 |
| Observable | Like Jupyter, but for JavaScript / D3 | Observable Tutorials |
| Tableau | Drag-and-drop visualization tool | Coursera |
| Adobe Illustrator | Professional finishing software | Udemy |
| Flourish | Easy no-code embeddable charts | Data Visualization with Flourish |
| Datawrapper | Easier no-code embeddable charts | Datawrapper Academy |
| RAWGraphs | Easiest no-code embeddable charts? Reasonable people will disagree! | Documentation for v2.0 forthcoming... |
| GreenSock | Web animation toolkit | GreenSock Learning Center |

📡 Infrastructure

| Name | Description | Tutorial |
| --- | --- | --- |
| HTML / CSS | Web markup and styling languages | Coursera |
| JavaScript | Web programming language | JavaScript for Journalists |
| React | JavaScript library for interface development | Awesome React |
| Vue | Another JS library for interface development | Vue Mastery |
| Rails | Ruby web app development framework | Getting Started with Rails |
| Django | Python web framework | Django Girls |
| Amazon Web Services | Cloud computing platform | Coursera |
| Node.js / npm | JavaScript runtime environment and its package manager | Node.js Tutorial for Beginners |

And: Darn-near everyone should know Git and/or GitHub! We're not trying to be pedantic here. We've just heard again and again -- and found to be true in our own experience -- that version control is the saving grace of data projects. We like this Git tutorial.

But software won't get you all of the way there. You also need organizational processes and pipelines! We're not going to spend too much time on this last bit, because the right pipeline will vary with the preferences of your team, but the point is: While every data project is different, most can be understood in terms of a common project-management framework. And lots of smart people out there in the world have tried to formalize these frameworks to make your life easier! We'll highlight the AP's DataKit as the crown jewel of this kind of thinking. But lots of other publications make their data templates and page generators freely available on GitHub! Consider:

Open-source for the win!
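To make the idea of a project template concrete, here is a minimal sketch of what such a tool does at its core: stamp out the same folder structure for every new project, so everyone on the team knows where raw data, analysis code, and finished output live. The folder names and the `scaffold_project` function are our own illustrative assumptions, not DataKit's actual layout or API, and real templates do considerably more.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: these folder names are illustrative and are not taken
# from DataKit or from any newsroom's actual template.
TEMPLATE = {
    "data/raw": "Source files exactly as received. Never edit these.",
    "data/processed": "Cleaned files produced by code in analysis/.",
    "analysis": "Notebooks and scripts, numbered in the order they run.",
    "output": "Charts and tables bound for publication.",
}


def scaffold_project(root, name):
    """Create a project folder containing every subfolder in TEMPLATE."""
    project = Path(root) / name
    for folder, purpose in TEMPLATE.items():
        path = project / folder
        path.mkdir(parents=True, exist_ok=True)
        # A short README records what belongs in each folder.
        (path / "README.md").write_text(purpose + "\n", encoding="utf-8")
    return project


# Usage: build a throwaway project in a temporary directory and list its folders.
with tempfile.TemporaryDirectory() as tmp:
    project = scaffold_project(tmp, "restaurant-inspections")
    created = sorted(
        p.relative_to(project).as_posix()
        for p in project.rglob("*")
        if p.is_dir()
    )
    print(created)
```

Keeping raw data in its own untouched folder is the design choice that matters most here: it means any result can be rebuilt from the original files by rerunning the analysis code.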
