For those of you who’ve been listening to the show for a while, it is fairly obvious that there is, quite literally, a ton of data out there related to development initiatives and humanitarian assistance. If you had the time, money and desire, you could find data about almost any aspect of assistance: baseline data about a population, damage assessments, geospatial data, demographics of the people affected by a crisis, or which organizations, governments and companies are on the ground helping. The problem is that organizations in the humanitarian sector don’t have the time, money and people power to hunt down this data. An even bigger problem is that the data is locked in spreadsheets on individual laptops, captured only in written notes or, unfortunately, kept hidden as a potential competitive advantage. Sarah Telford, my guest for the 129th episode of the Terms of Reference Podcast, is on a mission to change all of this. She is the Chief of Data Services at the United Nations Office for the Coordination of Humanitarian Affairs (OCHA), and oversees the continuing development of a global open data platform called the Humanitarian Data Exchange (HDX). The goal of HDX is to make humanitarian data easy to find and use for analysis. Since its launch in July 2014, the platform has been accessed by users in over 200 countries and territories.
IN TOR 129 YOU’LL LEARN ABOUT
- The extent of the effort that must take place to aggregate humanitarian data, from large institutional sources, research teams and ground activity.
- The many steps to make data useful: gathering and collecting, cleaning, converting, validating, unifying…
- The fully open, no strings attached approach of HDX to share their stock of information, and the steps that must be taken to guarantee its financial viability.
- The role HDX played in the 2014 West Africa Ebola epidemic and the role the epidemic played in shaping HDX going forward.
- Details on the selection and validation of data, and the promotion of data-keeping standards.
- The role community building plays in guaranteeing quality, relevance and usefulness of data.
OUR CONVERSATION FEATURES THE FOLLOWING
- Humanitarian Innovation Fund (HIF)
- OCHA’s Humanitarian Data Exchange (HDX)
- World Bank’s GeoNode
- United Nations
- HIF Elrha’s Journey to Scale
- HBO’s Westworld
- Techonomy Conference
- Data collection and aggregation
- Data cleaning and validation
- Data visualization
- Machine-readable data
- Artificial Intelligence
- Internet of Things
- Vulnerability Assessment Mapping
- User driven design
- Data spaces
- Community building
- Data file formats: PDF, spreadsheets
- Data anonymization
- 2014 West Africa Ebola crisis
- Cash as better way of giving
- Westchester, New York
- Nairobi, Kenya
- The Hague, The Netherlands
EPISODE CRIB NOTES
The problem with data
- Unstandardized
- Scattered in spreadsheets everywhere
- Often outdated
02:35 Chief of Data Wizardry at OCHA
- “There are a lot of hierarchies in the UN”
- “Titles don’t fully reflect the job”
- HDX collects crisis data from several organizations
- Launched in the summer of 2014
- HDX’s plea is to make data available and useful
- The Humanitarian Innovation Fund started the support; others have since joined
- Data is distributed, as is the sector. Many organizations, scattered
- “There is no command and control”
- OCHA is a lighthouse more than a panopticon
- Data is used for one-time issues, generating “data mass graves”
- While designing for crisis situations has informed the design, the origin of HDX is academic
- The World Bank is working with GeoNode, which collects spatial and other data
- Interesting, but it is not maintained. Most local governments don’t prioritize this
- To make something sustainable is to develop the architecture of maintenance
- Metadata is important!
- The World Bank understood this, but thinking about data right at a time of crisis easily turns chaotic: gathering, unifying, wrangling and dewrangling…
- “It’s all too common. We can do better. It should be easier”
- Basic multi-purpose streams (geography, population) are difficult too
- And going deep into the community, there was frustration aplenty
14:10 Diplomatic exchange of data gifts among aid organizations
- People can visit HDX and access data, no sign-up needed
- To contribute data, the user must make contact on behalf of an organization
- “Individuals don’t collect data” (…)
- Organization sizes vary from large players to university research teams
- Submitting data from an organization is the first quality filter
- To their surprise, organizations are not usually tidy with their data
- Submitters are often ‘data activists’ from within, who take charge of making the data usable in addition to their contractual duties
- Some steps are taken to validate, clean and anonymize the data HDX receives
19:52 True stories about how HDX made a difference
- “We have an idea of what gets viewed and downloaded, which sets are more popular, but we do not follow up on how the data is used”
- The HDX visualizations are held in high regard
- HDX played a key role in the 2014 Ebola crisis. They had records of infections and casualties
- HDX made sure it was machine-readable data
- The Ebola crisis data is by far the most popular set, and researchers still download it today to study the epidemic, the response, etc.
- “Ebola put us on the map”. Since then, other HDX initiatives have followed from this experience
24:38 It’s the simple things
- Stephen: The big impact of sharing data in XLS instead of PDF
- A DataLab in Nairobi performs data collection duties from 40 agencies. They asked HDX to help
- First discovery: data was stored in PDF
- Furthermore, it was not standardized, hence it did not lend itself to comparison
- “Big Data is not our focus, but smaller spreadsheets with key crisis response information”
- Which does not preclude algorithmic efforts to link datasets and visualize correlations, even from different organizations
- Data cleaning and validation are perhaps the most critical problems for HDX today
30:27 5-year qualitative forecast
- Data spaces, starting in The Hague. An idea that arose from a conference
- Connecting all the levels, from head to field, including all decision-making levels
- Realizing that not all humanitarian people are data people is important
- May the new generations be more data literate
- For the time being, though: “The best way to guarantee quality and comparability of data across organizations and context, is community building”
- Get people engaged around the story of data
- Interaction and collaboration efforts will allow us “to tackle problems we were incapable to before”
34:05 Do or do not with a solution in mind, there is no try
- “We understood the problem” before delving into it
- And there was investment in researchers and design thinking
- Designers went to the ground in Africa and Colombia
- “User driven” was an assumption that validated itself
- Data pipelines must be joined with an organization model around a product
- “Fun!” Users were interviewed about HDX’s personality: mature? playful? lumberjack?
- This has helped to overcome the hurdle of popularity
- Small armies of HDX trainers go everywhere to instruct on its use
- User research will still be pushed as HDX missionary activity. Data cleaning too (taking care of the data basement makes the whole house work)
- It’s all about creating the process (technical, logistical, organizational) that best informs decisions through data
41:38 Traditional and innovative HDX funding
- “We don’t ever want to charge people for our service. I have never seen it work in this field”
- Value-added products can be set up, like workshops
- Some solutions are built by request, or funders have a voice in how products should be done
44:09 Where Sarah gets her data
- “Reading about everything”
- OpenStreetMap
- Vulnerability Assessment Mapping
- HIF Elrha’s Journey to Scale
- “There’s so much”
- Westworld. “To an extent, HDX is an intelligence, performs automated tasks on our behalf”
- Techonomy conference. This year: IoT
- “The biggest disruption is cash” contributions, over in-kind ones. With digital, we can track what people do with the money
Please share, participate and leave feedback below!
If you have any feedback you’d like to share for me or Sarah, please leave your thoughts in the comment section below! I read all of them and will definitely take part in the conversation. If you have any questions you’d like to ask me directly, head on over to the Ask Stephen section. Don’t be shy! Every question is important and I answer every single one. And, if you truly enjoyed this episode and want to make sure others know about it, please share it.
Also, ratings and reviews on iTunes are very helpful. Please take a moment to leave an honest review for The TOR Podcast!