Information: Collecting and cleaning data

Welcome to the discussion! We thought we'd start this conversation by talking about the data behind the graphics. In this discussion, we're asking you to share your experience and advice in accessing, collecting and using data. We also encourage you to ask questions of this experienced group of conversation leaders! Consider the questions below when sharing your comments in this discussion topic:

  • Where to find existing data
  • How to collect new data
  • How to clean and analyze data
  • Share examples, guides and other resources that would be helpful for defenders in understanding data.

Share your thoughts, experiences, questions, challenges and ideas by replying to the comments below.

For help on how to participate in this conversation, please visit these online instructions. New feature: you can now add images and video to your comments!

General considerations to go from data to advocacy

I'm glad to be part of this dialogue and I'm looking forward to learning from and engaging with all of you. Before I dive into more technical aspects of cleaning and collecting data, I'd like to share some general considerations that I have found useful when visualizing data for human rights in projects like A Costly Move: Far and Frequent Transfers Impede Hearings for Immigrant Detainees in the United States and Troops in Contact: Airstrikes and Civilian Deaths in Afghanistan.

  • Information, especially understood as evidence and facts, is one of the most important assets for successful human rights advocacy. While visualization can be seen as an independent tactic on its own, it is important to note that for meaningful impact all work needs to be anchored in a larger methodological framework aligned with the issue we want to advocate for or against. 
  • In general, I'd suggest keeping a clear set of goals from the beginning, strong enough to be used as a constant reference yet sufficiently flexible to allow for the integration of what we discover during the exploratory data-dives. While this may sound obvious, I have found it useful to ask and document questions like What do we want to change? Who are the actors that can impact this issue? or How can we better explain the key recommendations to promote change? during the early stages of a project or campaign.
  • Collecting, analyzing, and presenting data are all tasks that require access to domain expertise, and I find it critical to have access to individuals who can validate and verify assumptions, strategy, and final products. I have also learned that it is important to keep in mind that human rights information is eminently qualitative, and that it is hard to come across statistical data that is sufficiently complete to prove a point on its own.
  • If you build sufficient time into the project, relying on Freedom of Information (FOI) mechanisms for precise and useful information can contribute greatly. For data collection, also consider including non-official sources such as leaked information. I use Scribd and Pastebin to search for potentially relevant data sets and bits. In the context of human rights work, be skeptical of social media data: more often than not it is incomplete, biased, and misleading.


When the data analysis outcome is different than expected

Thanks, Enrique! Lots of great points.

One point that grabbed me was: In general, I'd suggest keeping a clear set of goals from the beginning, strong enough to be used as a constant reference yet sufficiently flexible to allow for the integration of what we discover during the exploratory data-dives.

I wonder if you and the other participants have ever started a data analysis project that had outcomes you weren't expecting, and that changed the strategy for your campaign. I'm curious to hear your examples of this kind of situation, if they exist. Thanks!

-- Kristin Antin, New Tactics Online Community Builder

Collecting alternative data - networks & communication

Hi all,

Greetings - I just joined this conversation on an ad hoc basis today, and being quite new to this specific area of visualising information I feel I have already learned a lot by reading the previous posts.

My question is not directly related to previous posts, but I wonder if some of you have ideas or answers. One of the recurring questions on data collection and presentation in my job is: How do you engage in data collection from "alternative" (i.e., not online) information sources at community level, and then make this information relevant for decision making processes from the national to the global level? I mean to focus not so much on verifying data (though that is another aspect) but on the challenges of collecting information from personal sources on a regular basis, and "translating" this information into input at policymaking levels.

I work for an international network of locally based NGOs, and one of our main goals is to feed local perspectives on the prevention of violent conflicts into international strategies addressing them. One challenge we encounter is gathering relevant data from the ground, from information sources based in civil society, and presenting it to international stakeholders clearly and on a regular basis. We believe that the alternative information from the ground that our network members have is essential to the effective prevention of violent conflicts worldwide, but we struggle to collect and channel that information efficiently. I believe some of the challenges are more related to the nature of networks (e.g., communication, different languages) but would love to hear if there are tools that can be adapted to, or help navigate, this issue!

Gesa Bent, Global Partnership for the Prevention of Armed Conflict

Participatory research to engage & empower communities

Great questions, Gesa, and thanks for sharing your data challenges here. It makes me think that we should host an online conversation focused specifically on collecting data (there are so many aspects to it, like the use of ICTs, organizing data, cleaning data, securing data, and types of data collection like surveys).

Regarding data collection from community-level sources, the first thing that came to my mind is participatory research. We hosted an online conversation on this topic, and we have a case study about a participatory research process carried out in Southeast Asia. It served not only to document and understand how free trade was affecting small-scale food producers in Malaysia, the Philippines, Thailand, Vietnam, Indonesia, Burma, Cambodia and Laos, but also as an effective means to inform and engage the producers themselves in the process and the issue. Being able to engage the community in the research process and building the capacity for this kind of research can be very powerful. I hope these are helpful!

There are also some examples of how ICTs are being used to collect information from communities in our conversation summary on Empowering communities with technology tools to protect children.

Regarding your question about how to make the information collected relevant for decision makers - I hope that question is addressed in this week's discussion topics:

I hope others chime in, too!

- Kristin Antin, New Tactics Online Community Builder

Using participatory research to advance children’s rights

Here is another example: Using participatory research to advance children’s social and economic rights.

Wona Sanana was established in 1999 to protect children’s rights by compiling information on the condition of the children of Mozambique after the 16-year civil war. The project combined data collection on the welfare of children with community education to empower local people to take action and to promote improved policies addressing children’s rights. Through participatory research, communities learned about the problems facing their children and were encouraged to develop unique responses appropriate to the needs of their community.


Data sources

I'll list some examples of data sources here:

One of my favorites, the Global Database of Events, Language, and Tone (GDELT), is a resource that could be great for contextualization.

Data visualization for immigration advocacy

These are notes from the November 14, 2012, Technology Salon NYC (TSNYC), where we discussed data visualization for immigration advocacy. Linda Raftree, the organizer, summarized several interesting points in that post that may be of use.