What is the Relationship Between Data Ethics and Data Science?

At first glance, the two topics don't appear to have anything in common. Ethics is rooted in social science and philosophy, while data science is tied to engineering and science. In truth, ethical considerations and human contexts are integral to data science.

What is Data Ethics? Why do we care about it?

Let’s start with a professional definition of data ethics. Oxford professors and philosophers Luciano Floridi and Mariarosaria Taddeo state that:

“Data ethics is a new branch of ethics that studies and evaluates moral problems related to data (including generation, recording, curation, processing, dissemination, sharing and use), algorithms (including artificial intelligence, artificial agents, machine learning and robots) and corresponding practices (including responsible innovation, programming, hacking and professional codes), in order to formulate and support morally good solutions (e.g. right conducts or right values).”

Simply put, data ethics studies the ethical problems that arise when we use data. In this era of rapid technological development, we live in a “data-fied world.” Data collection is part of nearly every aspect of our lives, from the phones in our pockets to the cars we drive. Almost every human behavior, and every operation we perform with a tool like a computer, can be recorded as data. As technology progressed, greater computing power and new analytical tools let us run complex analyses on the data generated by day-to-day actions. Data science technologies such as machine learning and AI have brought many benefits to our lives. However, as humans step away from hands-on analysis and let automated systems do most of the work, issues such as fairness, privacy, and representation emerge. We cover a real-world case that illustrates these issues in detail below, so keep reading!

Why do Data Scientists need to understand Data Ethics?

Ever since data science became a buzzword in the technology industry, colleges and universities have been scrambling to open data science programs to meet the world’s growing demand for data scientists, engineers, and analysts. In 2018, the University of California, Berkeley became one of the first universities to introduce a dedicated Data Science major. The program aims to:

“Produce graduates who not only have deep technical expertise, but who also know how to responsibly collect and manage data, and use it to inform decisions and advance innovation to benefit the rapidly evolving world they’re graduating into.”

Besides technical requirements such as computing, probability, and modeling, the Berkeley Data Science curriculum includes a human contexts and ethics requirement. This shows that academic institutions recognize data ethics as a crucial skill for any future data scientist. As data scientists, we often work with large datasets generated by people, so it is our duty to keep private data secure and use it responsibly. To better incorporate human values like justice and equity into data-driven technologies, we also need to understand the underlying human and social structures.

How should we incorporate Data Ethics in our work as students?

When we work on a data science project, we need to understand the potential ethical consequences of our work. Here are some tips for being an ethical data scientist. First, be aware of privacy risks such as data breaches, and find ways to secure the data adequately. Second, be transparent about how you use data, and get users’ consent before you use their data in any way. Third, although complete objectivity is difficult, check your model for bias and reduce it as much as you can. To help employees follow data ethics principles, many companies and organizations have adopted codes of ethics and conduct. One that many professional data scientists follow is the Oxford-Munich Code of Conduct, which addresses common ethical dilemmas that data scientists in industry, academia, and the public sector may face.
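As a rough illustration of the three tips above, the following Python sketch filters out records without consent, replaces direct identifiers with keyed hashes, and measures how differently a model treats two groups. Everything in it is an assumption made for illustration: the field names (`user_id`, `consent`), the placeholder key, and the choice of the gap in positive-prediction rates as a bias signal. Keyed hashing is pseudonymization rather than full anonymization, and a small rate gap does not prove a model is fair.

```python
import hashlib
import hmac

# Hypothetical placeholder key; a real project would load it from a secure store.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymize(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_records(records: list[dict]) -> list[dict]:
    """Keep only records with recorded consent and swap IDs for pseudonyms."""
    prepared = []
    for record in records:
        if not record.get("consent", False):
            continue  # no recorded consent: do not use this record
        cleaned = {k: v for k, v in record.items() if k not in ("user_id", "consent")}
        cleaned["pseudonym"] = pseudonymize(record["user_id"])
        prepared.append(cleaned)
    return prepared


def selection_rate_gap(predictions: list[int], groups: list[str]) -> float:
    """Largest difference in positive-prediction rate between any two groups."""
    totals: dict[str, int] = {}
    positives: dict[str, int] = {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + pred
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


# Made-up example data.
records = [
    {"user_id": "alice", "consent": True, "age": 34},
    {"user_id": "bob", "consent": False, "age": 51},
]
prepared = prepare_records(records)  # only the consenting user's record remains
gap = selection_rate_gap([1, 0, 1, 1], ["a", "a", "b", "b"])  # 0.5 for this data
```

A large gap is a prompt to investigate the data and the model, not a verdict on its own.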

Some real-world cases that might blow your mind.

As the world has become more technologically advanced, the use of data has made a variety of industries more efficient. For example, many tech companies employ data scientists to track and understand the popularity of their products. However, Kwang-Mo Yang of Samsung Medical Center has written an article about the ethical concerns behind using real-world data. The problem arises when non-governmental organizations study health data after de-identifying personal information. Because a patient’s health data can still contain highly personal details, pharmaceutical companies can analyze a patient’s gender, age, and race and categorize a certain type of individual as vulnerable to a certain disease. Pharmaceutical companies have used this information to target drug advertisements: groups classified as “vulnerable to Disease A” were more likely to see advertisements for drugs targeting “Disease A” in their pharmacies. This categorization often disproportionately affected low-income communities and under-represented minorities, which makes it an issue of both privacy and representation and raises questions about whether the practice was truly ethical. Many individuals have also voiced concern about their personal information being used for commercial purposes. Ironically, real-world data has also helped many governmental organizations develop new drugs that have helped a variety of patients, even as large amounts of personal information were used to drive that marketing².

While it is legal in many countries, including South Korea, to study health data after de-identifying personal information, this raises questions: Would individuals be happy about their information being used without their knowledge? Would a person feel completely secure in a society where they cannot keep their personal information private?

To make such studies more ethical, companies should obtain informed consent from patients before using their private data.
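The case above turns on the fact that attributes such as gender, age, and race can remain in a dataset after names and IDs are removed. One common way to gauge that risk is a k-anonymity check: group the records by those remaining attributes and find the smallest group. The Python sketch below is only an illustration, with made-up records and hypothetical field names, and it is not the method used in the study discussed above.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return k: the size of the smallest group sharing the same quasi-identifier values.

    Assumes `records` is a non-empty list of dicts.
    """
    groups = Counter(
        tuple(record[name] for name in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Made-up records that are already "de-identified" (no names or IDs).
patients = [
    {"gender": "F", "age_band": "30-39", "race": "A", "diagnosis": "X"},
    {"gender": "F", "age_band": "30-39", "race": "A", "diagnosis": "Y"},
    {"gender": "M", "age_band": "60-69", "race": "B", "diagnosis": "X"},
]

k = smallest_group_size(patients, ["gender", "age_band", "race"])
print(k)  # 1: one patient is unique on these attributes
```

A result of 1 means at least one person is unique on those attributes alone, so removing names did not make that record anonymous in any meaningful sense.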

The Evolution of Data Ethics

We’ve already talked a lot about the present state of data ethics in data science. Now let’s look at the roles it may play in the industry in the future. Writer Barbara Lawler has identified some potential global trends related to data ethics and data privacy. Here are five trends that Lawler expects to see:

  1. Chief Privacy Officers can expect ethics to become an explicit part of their role.
  2. Technology companies will lead the way for U.S. Federal Privacy legislation.
  3. Sustainable ethics codes will evolve to better address the challenges of a digital world.
  4. Product excellence and privacy by design will become synonymous.
  5. Companies will drive to educate policy-makers and regulators about their technologies.

What does this mean for you?

Data Ethics is here to stay, and will likely become a key part of any responsible Data Scientist’s job, if it isn’t already.

While data has yielded a wide variety of benefits in everyday life, the purpose behind its use has become a vital topic. It all begins with considering the human impact of data use. Privacy officers will need to analyze the impact on people and society and determine whether that impact is positive, negative, or neutral.

Data privacy has long been debated in the United States, and technology companies are among the most knowledgeable organizations when it comes to data usage. Lawler therefore believes that technology companies will lead the way for U.S. federal privacy legislation, following the General Data Protection Regulation (GDPR) implemented in the European Union.

The consensus on how to respect privacy has been shifting for a long time, driven by the emergence of personal computing and ever-larger networks. As the economy globalizes and citizens’ physical and digital lives change profoundly, Lawler is convinced that companies will develop sustainable ethics codes to address the challenges of a digital world.

Privacy by design means embedding data privacy requirements into product design and development, embodying the “build it in, don’t bolt it on” mentality. This includes building in:

  • Privacy-savvy defaults
  • In-product transparency
  • Consideration and documentation of privacy risks and data flows
  • Data owners assigned upfront and throughout the data lifecycle, including end-to-end (E2E) security
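To make two of the items above concrete, here is a minimal Python sketch of privacy-savvy defaults and upfront data ownership. The class names, fields, and the 30-day retention value are all hypothetical choices made for illustration and do not come from Lawler's article. The idea is that the most private option is what a user gets when nothing is configured, and that a dataset cannot be registered without a named owner and a documented purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivacySettings:
    """Hypothetical per-user settings; each default is the more private option."""
    share_usage_analytics: bool = False  # opt-in, never opt-out
    personalized_ads: bool = False
    retention_days: int = 30  # illustrative value only


@dataclass(frozen=True)
class DataAsset:
    """Hypothetical registry entry for a dataset."""
    name: str
    owner: str    # no default: an owner must be named when the asset is created
    purpose: str  # documented reason the data exists
    contains_personal_data: bool = True  # treated as sensitive unless stated otherwise

    def __post_init__(self):
        if not self.owner.strip():
            raise ValueError("every data asset needs a named owner")


settings = PrivacySettings()  # a new user shares nothing until they opt in
asset = DataAsset(name="signup_events", owner="growth-team", purpose="onboarding funnel analysis")
```

Encoding these rules in types means that forgetting to think about privacy fails loudly at creation time instead of silently defaulting to the least private behavior.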

As technology advances, knowing where data comes from and why it exists has never been more vital from a strategic, operational, and compliance perspective. Data needs to be stored in a clean and accessible form so that companies can learn from it, analyze it, and tackle business issues in real time. Privacy by design (PbD) plays a critical role here and is just as important as secure coding.

Lawler writes that it is vital for policymakers around the world to develop a deeper understanding of what they wish to regulate. Given the profound global shift toward digital networks, policymakers must consider:

  • What harms are they trying to protect people from?
  • What rights do they want to guarantee?
  • What problems are they trying to solve?
  • What are the privacy outcomes they hope to achieve for their citizens?

The better policymakers understand newly created technologies, the easier it will be for them to decide whether to regulate those technologies. As a result, the organizations that place the greatest emphasis on educating policymakers will have the greatest impact on the evolution of data science.

Posted Feb 5, 2023 in IT & Software category
