DataKind UK Ethics Committee: Our First Year in Review

Ethics is an increasingly important topic in the area of data. From Amazon’s sexist hiring algorithm to Google’s racist photo tagging system and Northpointe’s biased recidivism scores, there have been plenty of recent examples of data-driven technologies causing harm. At DataKind UK, we want to ensure that our projects with social change organisations make responsible use of data and that our volunteer community of data scientists keeps ethics front and centre in its work. Data ethics is the branch of ethics that addresses the generation, collection, sharing, and use of data. It considers how data practices respect values like privacy, fairness, and transparency, as well as the balance between individual rights and societal benefits.

The Ethics Committee’s First Year

This year DataKind UK created an Ethics Committee, adding to the four other committees that make our work possible. The Ethics Committee’s goal has been to increase awareness of the sometimes complex ethical considerations of doing data science projects. One of the very first things we decided to do was to set up a data ethics book club. The book club would offer a timely opportunity to explore topics related to data ethics in depth, through books, research papers, newspaper articles, and sometimes videos. We covered topics such as face recognition, fairness, financial inclusion, and gender among others. The book club has grown from 5 people to 50, operating in London, Edinburgh and online.

Community Perspectives and Surveys

While setting up the book club, we also ran a community survey on views relating to data ethics. We found quite a variety of views on the topic. For example, there was an almost perfect split between those who believed that data and AI will have a positive impact on society and those who thought they wouldn’t. Opinion was also divided on whether AI is moving too fast, with over half agreeing that it is. Those who work in the social/public sector lean towards the view that AI is moving at the right pace, whereas those in the private sector lean towards the view that it’s moving too fast.

Training and Resources

The next thing we did was to run ethics training for core DataKind UK volunteers. We wanted to provide guidance on how to recognise potential ethical issues early on, using EthicalOS as a basis for our discussion. We used a case-study-based approach, drawing on examples from industry, charities, government, and our own past projects. While diving deeper into ethics, we realised that there is no shortage of resources to learn from. As we did our own reading, we shared what we came across, and we are continuously updating that collection so it remains a good set of material for someone relatively new to the topic.

Challenges in Ethical Data Science

We’ve also begun to tackle some big challenges. One example is working with free text data. With this type of data, it can be hard (if not impossible) to guarantee that personal identifiers are stripped out. Even if we’re able to anonymise the data, the resulting text can still contain highly sensitive information about difficult topics. The Ethics Committee agreed that using free text data shouldn’t be a hard no and should very much depend on the context. Another longstanding challenge in the area of ethics is the skewed demographic that’s attracted to ethical discussions, and the question of who is not in the room. Even though data practitioners are mostly male, those taking part in ethics activities are predominantly women.
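
To make the free text problem concrete, here is a minimal sketch of pattern-based redaction in Python. It is not a DataKind UK tool: the `redact` function, the placeholder tokens, and both regular expressions are assumptions made up for illustration, and the phone pattern only approximates common UK formats.

```python
import re

# Hypothetical patterns for illustration only; real identifiers take many more forms.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Rough approximation of UK numbers: a leading 0 or +44, then 9 or 10 more digits,
# optionally separated by single spaces or hyphens.
PHONE_RE = re.compile(r"(?:\+44\s?|0)\d(?:[\s-]?\d){8,9}")


def redact(text: str) -> str:
    """Replace email addresses and UK-style phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


# Identifiers with a predictable shape are caught...
print(redact("Call 07700 900123 or email jane@example.org."))
# ...but names and addresses pass straight through.
print(redact("Jane Smith lives at 12 Elm Road"))
```

The second call returns its input unchanged, which is the point: rules like these only catch identifiers with a predictable shape, so a clean run says nothing about names, addresses, or details that identify someone indirectly. Any real project would still need a human review of context and sensitivity.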

Industry Examples of Data Ethics

Data ethics examples provide valuable insights into responsible data management. They showcase how organizations balance individual rights with societal benefits. For instance, Apple’s commitment to privacy and IBM’s AI ethics demonstrate best practices in the industry. These examples are essential for businesses looking to enhance their data governance frameworks.

Organization/Regulation | Core Ethical Principles and Practices
Apple                   | Data minimization, on-device processing, user transparency and control.
IBM                     | Transparency and explainability in AI; removing bias from AI systems to ensure fairness.
Microsoft               | Accountability, transparency, and user control (view, edit, download, and delete information).
GDPR (EU)               | Right to know what data is collected; mandatory notification of data breaches within 72 hours.

Key Real-World Applications

  • Apple’s Commitment to Privacy: The company minimizes personal data collection and processes much of the data on the user’s device instead of in the cloud.
  • IBM’s AI Ethics: Decision-making processes of AI should be explainable and systems should be transparent.
  • Microsoft’s Data Governance: Rigorous data management through transparent privacy policies and user control over personal information.
  • GDPR and Data Protection: Grants individuals the right to know why data is collected and where it is stored.
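
The 72-hour breach notification window listed above for GDPR is simple date arithmetic, sketched below in Python. The function names are hypothetical, and the choice to start the clock when the organisation becomes aware of the breach is an assumption based on a plain reading of the regulation, not legal advice.

```python
from datetime import datetime, timedelta, timezone

# The 72-hour window comes from the text above; everything else is illustrative.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Return the latest notification time, assuming the clock starts at awareness."""
    if became_aware_at.tzinfo is None:
        # Naive datetimes are ambiguous across time zones, so reject them.
        raise ValueError("became_aware_at must be timezone-aware")
    return became_aware_at + NOTIFICATION_WINDOW


def hours_remaining(became_aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once the deadline has passed."""
    remaining = notification_deadline(became_aware_at) - now
    return remaining.total_seconds() / 3600


aware = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(aware).isoformat())
print(hours_remaining(aware, aware + timedelta(hours=24)))
```

Using timezone-aware timestamps is a deliberate choice here, since a deadline measured in hours is easy to get wrong when systems log events in different local times.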