The Definitive Guide to Doing Data Science for Good

by Tobias Pfaff | 8 min read

Are you a fully equipped (or aspiring) data scientist who wants to use your precious skills to solve the problems that really itch the world? Welcome to the club. The good news is that there are many ways for data scientists to do good. However, the path is not always beaten, and you might need to show some initiative. This article gives you some insight into how you can get involved: through group meetings and events, as a volunteer, or in paid positions.

Getting started — online data science competitions

A good place to start (without even having to leave your couch!) is an online data science competition. These competitions allow you to sharpen your skills and get familiar with different problem types before you actually get involved.
The home of data science competitions is certainly Kaggle. Watch out for competitions that tackle social problems. Examples are the diabetic retinopathy detection competition and the Africa soil property prediction challenge.
DrivenData is a rather new competition platform that focuses solely on social challenges. This makes it a perfect place to test your skills while doing good.
Occasionally, you will find other data science for good competitions. The IBM Big Data for Social Good Challenge was one of them (but beware: you are not free to choose your own tools there).

Another great way to get started is to replicate one of the projects in our #openimpact shortlist (magic ball icon = predictive analytics inside!)

Group meetings and events

A good opportunity to mingle with like-minded folks in person is attending (or starting) a meetup. The following table lists data science meetups around the world with a focus on social good:

| Name | Creation year | Members | Past events | Location |
|---|---|---|---|---|
| Data for Good - Data Scientists & Devs doing GOOD | 2012 | 662 | 13 | Toronto |
| DataKind NYC | 2012 | 2043 | 22 | New York |
| DataKind UK | 2013 | 1289 | 9 | London |
| Data for Good - Calgary | 2013 | 357 | 14 | Calgary |
| Data for Good Montreal - Data Scientists & Devs doing GOOD | 2013 | 140 | 1 | Montréal |
| DataKind Dublin | 2013 | 483 | 15 | Dublin |
| Brussels Data Science Meetup | 2014 | 1280 | 35 | Brussels |
| DataKind DC | 2014 | 617 | 5 | Washington |
| DataKind SG | 2014 | 717 | 9 | Singapore |
| DataKind Bangalore | 2014 | 502 | 7 | Bangalore |
| Data for Good | 2014 | 588 | 2 | Paris |

Source: Own compilation. Numbers are retrieved dynamically from meetup.com.

You should also keep your eyes and ears open for dedicated hackathons. One past example is the Thorn hackathon in San Francisco; another is the Bayes Impact hackathon, which happens annually (also in San Francisco).

If you want to attend a conference on the topic, I’d say Do Good Data in Chicago is the best bet, as it is the leading conference on data, research, and analytics for social sector professionals. Data on Purpose at Stanford is another option. Also watch out for dedicated themes or tracks at other conferences (like KDD 2014).

Volunteering

DataKind is a true pioneer in the field and does a phenomenal job of getting volunteers excited about harnessing the power of data science in the service of humanity. If you live close to one of the DataKind Chapters, you can attend their meetups and further engage in the following ways:

  1. Attend a DataDive:
    DataDives are weekend-long, marathon-style events where dozens of volunteers rally together to help 3-4 social change organizations do initial data analysis, exploration, and prototyping. These events are free for organizations, open to volunteers of all skill levels and take place around the world.
  2. Be among those selected for a DataCorps project:
    DataCorps is DataKind’s signature program that brings together teams of pro bono data scientists with social change organizations on long-term projects that use data science to transform their work and their sector. These projects last between one and six months and are structured so that volunteers can work in their spare time.

DataKind also hosts a neat “Data Do-Gooding Calendar”.

Another large volunteer group is Statistics Without Borders. You can join its 1000+ members and provide pro bono statistical consulting to organizations and government agencies. The focus is on developing nations that do not have the resources for statistical services.

Do you live in Brazil? Then you might want to check out Data4Good. This initiative is building a network of volunteers, produces content (mostly infographics) to educate people about the use of data for social good, and provides consulting services for social organizations (more about Data4Good in this blog post).

What if you are not so much into meetups, or if you are living on a remote farm and all you have is a cat, an internet connection and “The Elements of Statistical Learning”?

Well, one thing you can do is look for job descriptions for skilled data volunteers on LinkedIn. At the time of writing, however, I got 0 results for “volunteer data scientist” and 1 result for “volunteer data analyst”. If “volunteer data entry” is what you are looking for, on the other hand, there is plenty to do.

If LinkedIn doesn’t hook you up with an exciting problem, you should check out the Digital Humanitarian Network. They leverage digital networks for humanitarian response to crises or disasters. It took me a while to understand their “activation facilitation process”, but it’s a great idea (this diagram helps). You can volunteer through their member organizations, which provide data science and coding tasks of varying complexity (check out this diagram to see the members’ services).

Some people are even thinking about virtual marketplaces that match up non-profits, local governments or disaster responders with volunteer data scientists. In the same vein, we are currently thinking about how we can match up two parties on datalook.io: on the one hand, non-profit organizations or government agencies that see a project on DataLook and think it can be replicated to solve their own problem, but don’t have the necessary skills in-house; on the other hand, local or remote data scientists who would be interested in helping to realize the project. If you think this is a great idea or want to discuss this with us, please get in touch.

You see that there are quite a few opportunities for volunteering in the field. But what if you need some dough to pay the bills?

Paid jobs (temporary / part-time)

Such positions are usually organized as fellowships. The most prolific fellowship in the field is probably the Data Science for Social Good Fellowship at the University of Chicago. It was started in 2013 and is run as a 3-month summer program where fellows working in small teams partner with non-profits and local government authorities to tackle socially relevant problems using data science. The fellowship is sponsored by the Eric and Wendy Schmidt Foundation. [$11-16k, 12 weeks]

The program has a smaller sibling in Atlanta: Data Science for Social Good Atlanta. The summer internship program was launched in 2014. Students in the program work as paid interns on projects coming from the City of Atlanta and local non-profits. [$8k, 10 weeks] Recently, the University of Washington also launched a Data Science for Social Good Summer Program.

If you are a college student in the New York City area, you are eligible for the Microsoft Research Data Science Summer School. In the past, students taking part in the summer school have worked on NYC-related challenges. [$5k, 8 weeks]

Code for America fellows are usually web/app developers, but a few of the fellows are data scientists working on problems in different U.S. cities. [$50k, 11 months]

All these fellowships are run by organizations that partner with non-profits and the government. There are also non-profits that offer their own fellowship. An example is the Thorn Innovation Lab where data scientists help fight child sexual exploitation. [$100k, 1 year]

Apart from fellowships you might become what I call a “data angel”, a full-time data scientist working at a company that partners with a non-profit. You help the non-profit for a limited time while receiving your salary from your company. Some companies that offer such Corporate Social Responsibility programs are Pivotal, Teradata, Cloudera, Palantir, Splunk, Tableau, and Informatica.

If your company wants to establish such a CSR program in Germany, get in touch with us.

Paid jobs (permanent / full-time)

DataKind announced in 2014 it would create a full-time, in-house Data Science Team for Good in New York City. Their first data scientist was hired in early 2015 (see here) and you should check out DataKind’s careers page for upcoming positions. Sometimes, “Data for Good” job openings in general are also tweeted via @DataKind.

Bayes Impact is a Y Combinator-backed non-profit in San Francisco. They launched in 2014, and their approach is to take on a few large projects at a time rather than spreading their resources across many smaller projects. Their vision is to build operational data science solutions for large-scale problems that affect millions of people. Project partners are large NGOs and the federal government. Bayes Impact is always looking for big-hearted data scientists, data engineers and software engineers. You can apply here.

As non-profits are beginning to understand that data science can help them achieve their goals, a few of them have already created full-time positions for data scientists. An example is Crisis Text Line. You can also look for data science positions in certified B corporations like change.org.

The government sector, too, is slowly beginning to open positions for data scientists. On the local level, the team of the Mayor’s Office of Data Analytics in New York City has achieved impressive impact with its projects. On the federal level, the White House recently appointed data science veteran DJ Patil as U.S. Chief Data Scientist.

You might also want to look for jobs in for-profit companies whose mission is to use cutting-edge data science to solve pressing societal problems. Examples are Enlitic in San Francisco, which wants to revolutionize diagnostic healthcare with deep learning, and Edgeflip in Chicago, which wants to enable non-profits and issue-based groups to better reach their online communities using data science. You should also have a look at consultancies like Real Impact Analytics (Brussels), SocialCops (New Delhi) or Civis Analytics (Chicago) that have interesting social good projects in their portfolios. These are just examples, and there are many more out there.

And then there is of course the vast field of academia and science with opportunities to apply your data science skills to the greater good. Fields that produce enormous amounts of data like astronomy have a huge demand for data scientists. Check out an article by Jake Vanderplas for elaborate thoughts about data science in academia.

Off the beaten path

From my German perspective, it seems like the vast majority of opportunities to apply your skills for social good are in the U.S. I became interested in the field in 2013, and I didn’t find an organization in my city that allowed me to use my skills for social good. Instead of giving up, I tried to convince a federal authority to use predictive analytics for prioritizing food security inspections. That didn’t work, and then I founded DataLook. Through DataLook, I’m now in touch with a lot of people in Germany and abroad who share my interests. It’s a long road, and we are still looking for non-profits and government agencies as project partners to realize projects (get in touch!). However, I hope that this article helps some of you connect with existing initiatives, or start your own and leave the beaten path in order to do what you want to do: use data science to tackle real problems.

Let me know in the comments if I missed any relevant meetups, fellowships, etc. And follow us on Twitter.


Tobias Pfaff
Tobias is the founder of DataLook.

This article has benefited from comments by Katharine Bierce, Christian Bracher, Quentin Dumont, Daniel Kirsch, Eric Liu and Miriam Young.


Edit history:

  • Changed change.org to certified B corporation (not a non-profit) [comment from Sirius]
  • Added DataKind SF Bay Area meetup to table [comment from Miriam Young]
  • Added link to virtual marketplace for disaster responders [comment from Roxanne More]
  • Added the Do Good Data conference [comment from Andrew Means]
  • Added Statistics without Borders [comment from dolllar]
  • Added DSSG program at University of Washington
  • Added Data on Purpose

  • Sirus

    Nice article, but “non-profit” has a strict definition and shouldn’t be used sloppily. Change.org (and the others for all I know) is for profit.

    • Thanks for the hint! Will change that.

  • dolllar

    Include Statistics Without Borders in your consideration.

    • Great hint! Missed that and is now added.

  • 3kv

    Meetup Group called Africa Open Data that seems to always get actual leaders into their events in Africa and elsewhere: http://www.meetup.com/Africa-Open-Data/

  • http://hackoregon.com Has regular meetups for several different data science projects for social good

    • Fantastic! The table above draws data from the Meetup API. Which meetup is it specifically?

      • It’s not a meetup.com its just a weekly meetup 😉 See http://hackoregon.org. Several meetup.com orgs (Portland Python User Group, PDX Data Engineers, Portland Data Science, etc) host presentations by Hack Oregon volunteers. And hackoregon.org hosts its own meetings at facilities donated by generous partners like OMSI (Oregon museum of science and industry).

  • And http://totalgood.com/blog/invitation-for-machine-intelligence-grant-proposals-winter-2016/ is supporting volunteer data scientists in Oregon with money, computer resources, and mentors. (full disclosure: I’m a founder of the nonprofit TotalGood)

  • Rob Davidson

    Don’t miss Data for Good Ottawa, founded Nov. 2013 w/ 271 members