Below is a list of questions to serve as a starting framework for the discussion in this thread:
- How can human rights organizations utilizing data visualizations ensure the protection of sensitive data and populations?
- When are organizations exposing people to risk and how can those risks be mitigated or eliminated?
- With no graphic designer/data visualization producer/visual journalist on staff, how can we create content that will have an impact?
- How do we expand what we count as data?
- With limited quantitative data available, how do we analyze and visualize qualitative data?
On the question of impact, a good first step for both good design and good advocacy is to consider your primary and secondary audiences. Who are you ultimately trying to reach to make a change? And who has influence on them? Once you define your target audiences you can start to drill into their motivations: what do they care about? What do they fear? How can you best reach them? How can you help them take the actions you want? How can you mobilize your allies to reach them? Etc.
With target audiences in mind one can start to make decisions about messaging and framing, format and style, the level of simplicity vs complexity, etc. Some audiences will be best reached by a weighty report, others by an image in a Tweet, and in other cases these complement each other.
There are free and cheap tools available for producing data graphics, though most tools do have a learning curve. Tools can also leave their own imprint on graphics, doing some things more easily than others. Tools can reveal unexpected patterns in data, so it's a good idea to sketch a little and play with your graphics visually before settling on a final form. Sketching on paper is also a good way to explore the creative possibilities of graphics without the limitations of particular tools.
Finally, once you have developed your data graphics, it's a good idea to test them out, not just with sympathetic staff and allies, but with folks whose profile approximates your target audience. Some online tools can help you get quick feedback, too.
Depending on the story you want to tell, you might choose to visualize your data in an exploratory or an explanatory way. Most standard charts and graphs work well for both cases.
Exploratory visualizations encourage users to explore the data, find their own patterns and correlations, and draw their own conclusions. Headers and text can help guide the user. These tools can be highly interactive and animated.
In an explanatory visualization, the publisher highlights specific pieces of data/correlations/etc. and leaves less room for the user to play around with the data. The tools can still be interactive, but usually have a sharper, more pointed focus compared to an exploratory visualization. The visualization can also be static charts/graphs/images or animations.
This is connected to my first post in the "strategic use" section. Organizations need access to analytical expertise just as much as they need design expertise, or else they run the risk of publishing a beautiful infographic built on biased data or naive analysis. A lot of questions about the data itself (who collected it, how they collected it, definitions and limitations) and the analysis (normalization, data manipulation, statistics used) are essential to work through.
There are lots of free and low cost tools to turn a spreadsheet into a graph but this is a skill that is not tool-dependent. This is expertise that takes some time and training to develop - it can't be learned overnight. And even the largest organizations often don't have an analyst on staff. There isn't a clear answer for this challenge. One area for organizations to explore is in partnerships or collaboration with academics. There are many disciplines, from the social sciences to public health to the hard sciences, where academics have the quantitative methods expertise to do the vetting that many organizations need.
Not all visualizations are of the type that need to be vetted, but if a research report is reliant on data analysis, it is essential to ensure the analysis is correct.
I'd love to ask those with more human rights experience than I have about how you take precautions to protect the data of vulnerable or marginalized populations. One powerful experience I had with this came through a story from a Tohono O'odham (Native American) activist I worked on a project with. Their land straddles the US-Mexico border. She detailed how the federal government, in building the US-Mexico fence, said to the O'odham "just give us a list of your sacred burial sites and we will try our best not to destroy them". However, the O'odham are bound by their laws to never share this information with outsiders and they couldn't give that information away. So there are some types of data that it may be inappropriate to even collect in the first place, particularly when there are imbalances of power or cultural differences.
In other cases, such as sexual violence, reporting can be dangerous for the survivor and there are many incentives not to speak up.
A further case concerns the way that minorities can be identified even in very large datasets. I've written a little bit about this in relation to transgender-identified people. Since they are a small subset of the larger population, it can be much easier to re-identify who they are, which puts them at risk.
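To make the re-identification risk concrete, here is a minimal sketch (in Python, with invented toy data and field names) of the kind of check an organization might run before publishing: counting how many records share each combination of "quasi-identifiers" such as gender and location. This is the basic intuition behind k-anonymity; a real disclosure audit involves much more than this.

```python
from collections import Counter

def flag_small_groups(records, quasi_identifiers, k=5):
    """Count how many records share each combination of quasi-identifier
    values, and return the combinations shared by fewer than k records.
    Those records are at elevated risk of re-identification."""
    combos = Counter(
        tuple(rec[field] for field in quasi_identifiers) for rec in records
    )
    return {combo: n for combo, n in combos.items() if n < k}

# Toy survey data: gender and zip code together can single people out
# even when neither field is identifying on its own.
survey = [
    {"gender": "woman", "zip": "02139"},
    {"gender": "woman", "zip": "02139"},
    {"gender": "woman", "zip": "02139"},
    {"gender": "man", "zip": "02139"},
    {"gender": "man", "zip": "02139"},
    {"gender": "non-binary", "zip": "02139"},  # a group of one
]

risky = flag_small_groups(survey, ["gender", "zip"], k=3)
print(risky)  # {('man', '02139'): 2, ('non-binary', '02139'): 1}
```

Flagged groups can then be suppressed, aggregated into broader categories, or left out of a published visualization entirely.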
How do individuals and organizations in the human rights space deal with this? I'd love to hear any examples, stories, & best practices.
Thanks for bringing this up. In my experience, this is a long-running debate that has finally gained momentum since Edward Snowden's revelations of unchecked mass surveillance. I believe his whistle-blowing helped validate, among the global general public, the notion that digital information or data is not only sensitive but also of great interest to law enforcement, organized crime and government intelligence apparatuses. In that context, most of the advice I have given or seen given by others is to reduce the data collected; anonymize records at collection when possible; protect information using properly implemented end-to-end, client-side encryption; and look at how academia or the Red Cross/Red Crescent have created standard (yet insufficient) practices for the collection, storage and access of data. That said, I have seen very little about DataViz per se, and I think that could be an interesting contribution to human rights practice.
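One concrete form "anonymize records at collection" can take is pseudonymization with a keyed hash: direct identifiers are replaced with values that still let the same person's records be linked for analysis, but cannot be reversed without the key. A minimal Python sketch of the idea (the key, field names and record are invented for illustration; this complements, rather than replaces, data minimization and encryption):

```python
import hashlib
import hmac

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).
    Unlike a plain hash, a keyed hash cannot be reversed by brute-forcing
    common names or phone numbers unless the attacker also obtains the
    key, so the key must be stored apart from the data (or destroyed)."""
    return hmac.new(secret_key, identifier.encode("utf-8"),
                    hashlib.sha256).hexdigest()

key = b"a long random key kept separate from the dataset"
record = {"name": "Jane Doe", "region": "region A", "incident": "..."}

# Store the pseudonym instead of the name; the same person always maps
# to the same pseudonym, so records remain linkable for analysis.
record["name"] = pseudonymize("Jane Doe", key)
print(record["name"])
```

Note that pseudonymization alone does not prevent re-identification from the remaining fields, which is why it is usually paired with the group-size checks discussed elsewhere in this thread.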
In terms of resources, the Responsible Data Forum may be a good place to start as it summarizes a number of conversations that the human rights and development community have sustained over time. The url is https://responsibledata.io/
I remember attending a meeting on data privacy & data security in ICT4D organized by the UN Global Pulse. The report of that meeting may be of interest as well. There is a blog post about the meeting (including a link to the report) here http://www.unglobalpulse.org/blog/improving-data-privacy-data-security-i...
I'm sure there are other resources, and perhaps others can chime in as well.
Most of the answer to the general questions around concerns and considerations is given above by my co-leads in this dialogue. That said, I think what we may be missing is remediation. If the use of data and its visualization proves detrimental to a human rights subject post-publication, what are the right things to do? Of course errata and other editorial/publishing practices could apply, but what should be done if a visualization is beyond misleading and could be linked to harm? There is an increasing dialogue and debate around protecting vulnerable populations' data, but I perceive this as mostly about the safety and security of the data itself. Do any of you have pointers around dataviz-related remediation specifically?
This is a very good question and I don't have an answer. One of the things I talked about in a blog post about feminist data visualization was around how data visualizations are not good at supporting dissent. How do you "talk back" to data? Typically the interactions that visualizations support are around different views of the data, but not around collaborative interpretations (or challenging dominant interpretations, or pointing out possible harms that could result). At one point, the platform ManyEyes had a discussion feature built into it so that users could discuss others' visualizations. That platform is now gone, but it is interesting to think about where that kind of design could have evolved to today if it were still around.
For highly explorative visualization tools, I think it would be a great feature to allow the user to save "parameter settings" and share the visualization state, together with a comment. For custom designed & developed tools this should not be a hard thing to implement, and I think it would serve as a great platform for discussion.
It is much easier to share and refer to a "saved state" which the next visitor quickly can view and explore, rather than describing the findings and parameter settings in words only.
Does anyone know of a tool that does this? I am going for something similar in my own news visualization, where the user can easily share saved filters, e.g. news on Trump http://www.ekokammaren.se/?s=trump categorized by political agenda.
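A lightweight way to implement this kind of "saved state" is to serialize the view's parameters into the URL's query string, so that a link is the saved state. Here is a minimal Python sketch of the idea (the base URL and parameter names are invented; in a browser-based tool the same technique is typically implemented in JavaScript with the History API):

```python
from urllib.parse import parse_qs, urlencode

def state_to_url(base_url: str, state: dict) -> str:
    """Serialize the current visualization parameters into a URL that
    reproduces this exact view when opened. Sorting keys makes the URL
    stable, so identical states produce identical links."""
    return base_url + "?" + urlencode(sorted(state.items()))

def url_to_state(query_string: str) -> dict:
    """Restore visualization parameters from a shared URL's query part."""
    return {key: values[0] for key, values in parse_qs(query_string).items()}

# Hypothetical filter state for a news visualization.
state = {"s": "trump", "sort": "date", "agenda": "all"}
url = state_to_url("http://example.org/", state)
print(url)  # http://example.org/?agenda=all&s=trump&sort=date

restored = url_to_state(url.split("?", 1)[1])
print(restored == state)  # True
```

A shared comment could then simply be stored alongside the serialized state, giving each annotated view a stable, linkable address.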
A quantitative analysis of word & phrase frequencies in qualitative data can often be an interesting complement to qualitative analysis and coding. Two of the Databasic.io tools that we have developed - called WordCounter and SameDiff - introduce some of the concepts and vocabulary of quantitative text analysis for beginning learners and you can upload your own documents to see the results.
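For readers curious what the underlying computation looks like, the word counting at the heart of a tool like WordCounter is only a few lines. A sketch in Python, with an invented snippet of interview text and a deliberately tiny stopword list (real tools use much longer ones):

```python
import re
from collections import Counter

# Toy stopword list; production tools filter hundreds of function words.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "was", "its"}

def word_frequencies(text: str, top_n: int = 10):
    """Tokenize, drop very common function words, and count what remains."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

interviews = (
    "The land is sacred. The land was taken and the community "
    "fought to protect the land and its water."
)
print(word_frequencies(interviews, top_n=3))  # 'land' appears 3 times
```

Even this crude count can surface which themes dominate a set of interview transcripts, which is a useful starting point for qualitative coding.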
To go further and deeper: there's a great tool called Overview, developed originally by a journalist, that helps with quantitative analysis of large document sets. It was originally built for the use case of large-scale document dumps - the idea being that some sets of documents are unreadable/uninterpretable because of their sheer size (Iraq War Logs, Hillary Clinton's emails, etc.), so quantitative text analysis can be really interesting for pointing to patterns in those cases.
A next step to take after counting words is sentiment analysis: what opinions and moods are expressed in the text? Is the author angry, happy, violent, etc.? Approaches range from simple scored word lists (lexicons) to machine-learning models, including neural networks, that are trained on labeled examples and improve as they see more data.
Here are a couple of sentiment analysis visualization examples: https://www.csc2.ncsu.edu/faculty/healey/tweet_viz/tweet_app/ and the more playful and beautiful http://wefeelfine.org/
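In its simplest, lexicon-based form, sentiment scoring is just a lookup and a sum. A toy sketch in Python (the word list and example sentences are invented; real lexicons such as VADER's contain thousands of scored words and handle negation and intensifiers):

```python
# A toy sentiment lexicon: positive words score +1, negative words -1.
LEXICON = {"happy": 1, "hope": 1, "safe": 1,
           "angry": -1, "afraid": -1, "violent": -1}

def sentiment_score(text: str) -> int:
    """Sum the scores of lexicon words found in the text. A positive
    total suggests a broadly positive tone, a negative total the
    opposite; words not in the lexicon contribute nothing."""
    words = text.lower().split()
    return sum(LEXICON.get(w.strip(".,!?"), 0) for w in words)

print(sentiment_score("We are happy and we have hope."))  # 2
print(sentiment_score("People are angry and afraid."))    # -2
```

Scores like these can then be aggregated over time or by source and visualized, which is essentially what the tweet-visualization example linked above does with a far richer model.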
John, thanks for sharing this. The report is excellent and provides a really good background for conversations like the one we are part of. Kudos to all involved.
I'm wondering if there are examples of data visualization from violations using the Martus database that you could share that have shown trends or other information for advocacy. The use of the Martus database for documenting human rights violations also concerns protection of information and vulnerable populations.
It has been a while since I left Benetech and the Martus team. As far as I remember, data visualization was one of those things that was important for several users. Back then it was mostly simple graphs and charts, but things may have changed since. See http://www.benetech.org/2015/05/06/announcing-martus-desktop-5-1-improve... for the most up-to-date information I am aware of. See the attached image, extracted from the 5.1 version user manual, to get a sense of what is possible with Martus.
I have found also an interesting doc that is new to me and that I came across when looking for some references to answer your question. It is part of the Proceedings of the Association for Information Science and Technology. This is the URL http://onlinelibrary.wiley.com/doi/10.1002/meet.2014.14505101069/full
Thanks so much for taking the time to find and share this information. I found the link to the usability study of the new version of Martus adding visualization capability especially interesting. This is a critical component for visualizing data.
Do others have experiences or ideas you can share about what you've learned regarding the visualization tools you use?
This discussion about protecting the data of vulnerable populations as you visualize data is extremely important, and I’ve found my colleagues’ posts incredibly insightful. I want to make explicit an issue that has been inherent in the posts but not a focus so far: the importance of empowerment and participation for vulnerable communities in collecting, analyzing, and visualizing data about themselves. Catherine’s discussion of non-binary gender data discusses the importance of self-determination for transgender communities and others targeted for discrimination and exclusion. Such self-determination is a human right, and it will become more important in the coming years, as the Sustainable Development Goals (SDGs) become driving factors in government and donor decisions about what data to collect, analyze, and visualize.
In the rush to “leave no one behind,” there is a tendency for development actors to call for more data about marginalized groups and more disaggregation by various axes of discrimination. While there are some resources that call attention to self-identification and other rights-related data issues (see this helpful resource from the UN human rights office on human rights principles relevant to data), sometimes there is insufficient attention to the principle of self-determination and what it requires on the ground. The potential for re-identification is especially dangerous here, in contexts where populations are not only marginalized but may also be criminalized (e.g., sex workers or LGBTQ communities). In some contexts, communities will become legible to the State, and to the international public, for the first time in efforts to fill “blank spaces” in development data. That fact needs a lot more attention and care from human rights advocates, but it is not the topic of this discussion.
To bring us back to the issue of the visualization of data, I want to underscore the importance of participation and self-determination in such work. Here I will give a shout out to Digital Democracy (disclosure: I serve on their Board) as a group that has worked hand in hand with marginalized communities to empower communities with methods for self-determination in collecting, analyzing, and visualizing their data. Here is a video about DD’s work with the Wapichana people in Guyana—who, with DD’s help, built their own drone to visualize and map their most precious resource—their land. This map was not only more accurate and detailed than anything collected by outsiders, it is also an important tool for monitoring and advocacy. Another example is DD’s offline mapping app Mapeo, which has been used by indigenous communities in the Amazon to visualize their community, its resources and even its origin stories.
Thank you, Meg, for such excellent examples. I'm particularly struck by the blank spaces link that you sent (and look forward to watching the video). And I want to think further about this tension between data collection (by powerful agencies, often not local to the people whose data is being collected) and education/empowerment/self-determination through data. Do others see this as a tension in development & advocacy work? From my (admittedly limited) observations, large organizations often have good intentions in mind, but mobilize data collection and data-driven decision making in an extractive way. Which is to say, going far away, collecting data on people & their lives, and then making decisions about resource allocation from armchairs in Brussels, Washington or wherever.
How could we build data-driven programs that engage in more robust co-design/education/data literacy efforts such that we could support the kind of self-determination Meg is talking about? I guess this is a little off the topic of data visualization but I see visualization/storytelling as a really key piece to any kind of data literacy efforts.