Ethics in Artificial Intelligence: the Why
Case Study Analyses and Proposed Solutions

Emma Beharry, Hari Bhimaraju, Meghna Gite, Elena Lukac, Anika Puri, Mana Vale, and Ecem Yilmazhaliloglu
Students of AI4ALL


Artificial Intelligence is the future. This claim appears again and again in the news, at our schools, and in conversation, but what percentage of the general public, and especially of the youth, truly understands why? We hail from a variety of schools, countries, and backgrounds, but we are all united by our passion for Artificial Intelligence. We attended nonprofit AI4ALL's summer educational programs, where we had the opportunity to explore an introduction to AI alongside like-minded female students. Through topic-specific lectures on basic AI algorithms (e.g., K-Nearest Neighbors, Decision Trees, Naïve Bayes), interdisciplinary AI research introductions from distinguished professors and industry leaders, hands-on project groups, and application-oriented lab visits, we gained thorough exposure to many facets of the field. Our goal in writing this paper is to share our experiences as young researchers and to offer our perspective on this question.

There are two main questions in AI as we approach a future straight from our imagination: how and why. The former is what a majority of the scientific and technological communities are focused on. How can we use AI to create autonomous vehicles? How can doctors employ AI to transform medicine from a curative field into a predictive and preventive one? How can computer vision convert our excess of big data into systems for disaster relief or humanoid robots? With all the excitement surrounding novel technologies and the seemingly infinite potential of AI, the question of why often takes a back seat. Why are we putting so many resources into AI without truly understanding its full potential? Why are there no governmental or universal regulations on AI? Why are more people not thinking about AI's potential failures? These questions bring in ethics: what is and will be ethical in AI?

Our paper takes the format of problem, case study, and potential solution analyses on prominent topics in AI: Natural Language Processing, Facial Recognition, Computer Vision, Data Collection, and Gun Violence. We selected key problems that we have identified in AI today and studied popular examples of these shortcomings, discussing their ethical implications. Through this paper, which is geared toward the general public, we hope to increase AI education, literacy, and youth involvement, not just in the technological aspects of AI but in all its intersecting fields: policy, education, business, and more.

AI4ALL's motto is as follows: AI will change the world. Who will change AI? As you read this paper, keep this question in mind. 100% of the authors of this paper identify as female; only about 12% of AI researchers around the world do. This is a problem. Why is dataset bias so prominent in AI? One of the main reasons is the lack of diversity among the researchers creating datasets. This and other issues that plague Artificial Intelligence today are caused by the homogeneity of the field. We hope that this paper will serve as a positive step toward increased diversity in AI, and that it will bring the why to the forefront of the AI revolution.


Natural Language Processing 

 

Natural Language Processing (NLP) is the subfield of artificial intelligence that concerns natural human language. NLP systems aim to process, understand, and generate human language. Applications of NLP systems are ubiquitous in daily American life, from translation apps (e.g., Google Translate) to grammatical correction technologies (e.g., auto-correct). A variety of NLP algorithms has enabled the development of these applications over decades.

The simplest machine learning NLP algorithm is Naïve Bayes, which calculates the conditional probability that a word or phrase has a certain meaning or belongs to a certain category, based upon the dataset. The Neural Network family of machine learning algorithms spans a range of complexity, from basic feed-forward Neural Networks to Deep Convolutional Neural Networks to Generative Adversarial Networks and more. Neural Networks are inspired by the human brain and pass inputs through layers of mathematical calculations to produce an output. Lastly, the newest widely adopted NLP techniques, such as word2vec, bag of words, and term frequency-inverse document frequency (TF-IDF), all involve storing words as multi-dimensional vectors, or 1×n matrices. The computer builds a mathematical representation of each word through matrix calculations that depend on the algorithm, which it then uses to create relationships between words.
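To make the bag-of-words and Naïve Bayes ideas concrete, here is a minimal, hypothetical sketch using the scikit-learn library; the example messages and relief-category labels are invented for illustration and are not the datasets discussed below.

```python
# Minimal bag-of-words + Naive Bayes sketch (toy data, not the camp's dataset).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical labelled training messages, each tagged with a relief category.
train_texts = [
    "we need bottled water and food supplies",
    "generator and batteries required, power is out",
    "injured people here, send doctors and bandages",
]
train_labels = ["water", "energy", "medical"]

# Convert texts to word-count vectors, then fit the conditional-probability model.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
model = MultinomialNB().fit(X_train, train_labels)

# Classify a new, unseen request.
X_new = vectorizer.transform(["power is out, we need a generator"])
print(model.predict(X_new))  # with these toy documents, likely ['energy']
```

Real systems are trained on thousands of labelled messages, but the mechanics are the same: the labels in the training data define what the model treats as the "correct answer."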

  Despite advances in technology, NLP algorithms are susceptible to dataset biases because they are generally supervised machine learning algorithms. All machine learning algorithms require a training dataset to observe and test predictions with. Supervised machine learning means the dataset is labelled and annotated, or it has the “correct answers”—the algorithm knows what the answers should be and must figure out how to achieve those answers. If the dataset involves language that is gendered, racist, and/or generally demeaning, the algorithm will adopt that language as its default goal or “correct answer,” encoding biases. 

Even dataset biases that are not explicitly malign can reveal inequities. At the Stanford AI4ALL 2019 camp, a group of students programmed a Naïve Bayes system that classified tweets from Hurricane Sandy and text messages from the 2010 Haitian Earthquake into five categories of relief: food, water, medical, energy, or none. The geographic differences between the New York/New Jersey area and Haiti, and the different types of damage incurred, skewed the combined dataset.

 

 


 

Figure 1 – Hurricane Sandy Dataset Composition 

 

Figure 2 – 2010 Haitian Earthquake Dataset Composition 

Although biases like differing requests for necessities seem inherently benign, an investigation into their origins can provide evidence of inequities. For example, the common demands in New York and New Jersey were for generators, hand-crank devices, and batteries. It was hypothesized that because such energy resources were not widely available in the less affluent Haiti, the demand for energy materials was drastically lower. In addition, more immediate choices in data preparation can skew the dataset. For example, the Haitian texts were translated into English to build the algorithm, meaning nuances of the messages were inevitably lost. This was hypothesized to be a major contributor to why over one-third of requests to a post-earthquake hotline were labelled as "None."

In addition, the primary training set employed was the Hurricane Sandy dataset. Thus, when the Naïve Bayes algorithm was evaluated, performance on the Hurricane Sandy testing set (data the computer had not yet seen) was higher than on the Haitian Earthquake testing set. The computer was unable to recognize Haitian requests for energy supplies and had a lower F1 score, the metric used to evaluate the performance of the algorithm.
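As an illustration of this evaluation step, the sketch below computes a per-category F1 score with scikit-learn. The labels are invented, not the actual Hurricane Sandy or Haitian Earthquake data, but they show how one weak category (here, "energy") can hide behind otherwise strong results.

```python
# Sketch of per-category evaluation on a held-out test set (invented labels).
from sklearn.metrics import f1_score

y_true = ["water", "energy", "food", "none", "energy", "medical"]
y_pred = ["water", "none",   "food", "none", "none",   "medical"]

categories = ["food", "water", "medical", "energy", "none"]

# average=None returns one F1 score per category instead of a single number,
# which exposes the failure on "energy" requests.
per_class = f1_score(y_true, y_pred, labels=categories, average=None, zero_division=0)
print(dict(zip(categories, per_class.round(2))))
# {'food': 1.0, 'water': 1.0, 'medical': 1.0, 'energy': 0.0, 'none': 0.5}
```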

 

Figure 3 – NLP Performance by Category on Hurricane Sandy Testing Set 

 

Figure 4 – NLP Performance by Category on 2010 Haitian Earthquake Testing Set 


The overuse of the Hurricane Sandy training set biased the algorithm to perform better on Hurricane Sandy data. This bias toward one dataset comes at the expense of performance on the other and creates an algorithm less capable of providing relief to an entire population.

The algorithm will not be used in real-world scenarios in the near future. Yet if it were, it would perform worse on data from Haiti than on data from New York and New Jersey. This disparity was not intentional, yet it could have cost lives by delaying crucial disaster relief.

Biases are almost always a product of humans. They can originate at every step of creating an NLP algorithm: the selection of data, the preparation of data, the writing of code, and the training of the model. Whether intentional or not, these missteps create algorithms whose biases marginalize certain demographics and manifest during use. As NLP develops and algorithms become increasingly capable of generating human language, the prospect of a biased algorithm producing racist, gendered, or demeaning language, or harboring a bias against a demographic, without any intention on the part of the computer scientist, is a sobering thought.

When considering how to solve the issue of bias in NLP, there is a major caveat. It is impossible to claim and prove that each and every bias is malign and must be eliminated. It is also impossible to eliminate every bias from existence; there is typically an entrenched force, institution, or structure producing it.

By and large, however, biases are harmful and in practice have contributed to more harm than good. Thus, computer scientists should take extreme care and caution when building AI systems. We have outlined a general guideline for computer scientists to follow when assessing or encountering bias at any stage of the algorithm development cycle.

Guidelines:

1. Assess the possible biases that could occur in the dataset and why.

2. Determine how malign each possible bias is on a scale of 1 to 5.

   a. Scale. For clarification, being "completely" benign or malign means that the bias is inherently benign or malign and has a benign or malign impact on algorithm performance. The "3" category is for biases that are not overtly benign or malign but skew the dataset in a distinct way.

      i. 1: completely benign
      ii. 2: benign impact on performance
      iii. 3: skews the dataset with tangible impact
      iv. 4: malign impact on performance
      v. 5: completely malign

   b. Evaluation metrics. When assigning a number on the scale to a bias, one should look at the bias's:

      i. inherent malign or benign standing (reason for existence)
      ii. impact on algorithm performance
      iii. real-world impact

3. Determine the desirability of minimizing the bias. There exist scenarios in which a certain bias is arguably necessary or should not be eliminated. Given how biases have negatively manifested during AI use, such scenarios should be rare, and biases should generally be minimized. Still, the possibility exists, so this step evaluates whether the bias should be removed.

4. Minimize the bias. Each bias should be brought down to a "1" or "2" on the scale, being completely benign or having only a benign impact on performance, thus eliminating any malignant impact the bias might have.

A rough code sketch of this checklist follows below.
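This is a minimal sketch only; the field names and the mitigation threshold are our own invention rather than any standard tool.

```python
# Hypothetical record for tracking one bias through the four guideline steps.
from dataclasses import dataclass

@dataclass
class BiasAssessment:
    description: str          # step 1: what the bias is and why it could occur
    severity: int             # step 2: 1 (completely benign) .. 5 (completely malign)
    keep_justification: str   # step 3: the rare reason the bias should be kept, if any
    mitigation: str           # step 4: how it will be brought down to a 1 or 2

    def needs_mitigation(self) -> bool:
        # Anything at 3 or above should be mitigated unless explicitly justified.
        return self.severity >= 3 and not self.keep_justification

example = BiasAssessment(
    description="Haitian texts translated into English, losing nuance",
    severity=3,
    keep_justification="",
    mitigation="use bilingual annotators or train on the original-language text",
)
print(example.needs_mitigation())  # True
```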

                       


Facial Recognition  

Facial recognition is the process by which a computer automatically identifies or verifies people's identities from images of their faces. Using modern artificial intelligence techniques, today's facial recognition systems are better and more useful than ever before. Popular examples of facial recognition include auto-tagging people on social media and photo-sharing platforms, unlocking smartphones with Face ID, taking selfies with Snapchat filters, and intelligent advertising. However, due to the sheer diversity of human faces, facial recognition is not perfect. In fact, facial recognition bias is an issue with serious repercussions for racial and gender equality.

Today, some police departments are using Amazon Rekognition, a popular facial recognition tool, to identify and arrest criminals. Amazon Rekognition, when tested by researchers on a dataset of European and African parliamentarians (roughly half female and half male), identified lighter males 100% of the time, darker males 98.7% of the time, lighter females 92.9% of the time, and darker females 68.6% of the time. It also mistook women with light skin for men 19% of the time and darker women for men 31% of the time (Buolamwini). Essentially, the algorithm worked best on white men and worst on black women. Additionally, when the American Civil Liberties Union (ACLU) used Rekognition to compare criminal mugshots to US Congressmen, it falsely matched 28 US Congress members to existing criminals. Nearly 40% of these misidentifications involved people of color (Wong).   
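Audits like Buolamwini's report accuracy separately for each demographic subgroup rather than as a single aggregate number. The sketch below, using invented results, shows the idea of such a disaggregated evaluation.

```python
# Disaggregated audit sketch: accuracy per subgroup instead of one overall number.
# The records below are invented, purely to illustrate the calculation.
from collections import defaultdict

# Each record: (subgroup, whether the system's prediction was correct)
results = [
    ("lighter male", True), ("lighter male", True), ("lighter male", True),
    ("darker female", True), ("darker female", False), ("darker female", False),
]

totals, correct = defaultdict(int), defaultdict(int)
for group, ok in results:
    totals[group] += 1
    correct[group] += ok

for group in totals:
    print(group, f"{correct[group] / totals[group]:.0%}")
# A single aggregate accuracy would hide the gap these per-group numbers reveal.
```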

These worrisome results pose serious questions about the implications of facial recognition bias in criminal justice. According to researchers at Georgetown Law School, one in two American adults is in a facial recognition network used by law enforcement. This means that people of color, and especially dark-skinned women, who are statistically arrested more often and are thus disproportionately represented in these networks, are at a higher risk of being misidentified and imprisoned for crimes they did not commit by software trained on data from these networks. As a result, the actual perpetrators are more likely to get away and continue to commit crimes. Incarcerating an innocent person who has not been adequately proven guilty is not just unethical but also unconstitutional, as it goes against the American justice system's principles.


Furthermore, facial recognition bias can cause people, especially teens, to feel excluded and discriminated against. In 2015, Google's Photos app identified two black people as gorillas, which caused controversy and backlash in communities of color (Nieva). When such apps mislabel or fail to recognize darker faces while almost always recognizing and correctly identifying lighter ones, they create a divide and cause dark-skinned people to feel excluded from data-induced societal "norms." Additionally, because these users perceive themselves as victims of discrimination, they can potentially sue the companies creating these apps, inciting conflict on both sides.

Photo-based social media apps are just one area where facial recognition bias can have a widespread negative impact on teens. Another is attendance tracking. Some schools' assembly attendance systems have room for a lot of error: attendance is taken by teachers stationed around the assembly hall while an overflow of students tries to exit. Attendance systems could be far more efficient if they employed facial recognition to identify and check in students. Just a couple of weeks ago, my best friend received an absence for an assembly that she did in fact attend, and she had to go through an uncomfortable, stressful process of summarizing the assembly to convince the administrators that she was present. My best friend is black. If our school used facial recognition technology to track attendance and the program did not identify her, she would be deeply offended and feel excluded.

So, what is the root cause of facial recognition bias? Where did it come from, and how can it be removed? The answer is simple: what goes in comes out. The datasets used to train facial recognition algorithms are not diverse enough. To increase facial recognition accuracy, neural networks need to be trained on hundreds of thousands of faces with different skin tones, biological sexes, and facial structures. That is one part of the issue. The other part is that some facial recognition software developers train and test their algorithms on their development teams and their friends. Google's gender and race statistics for its technical workforce are proof of this. We know that there is a lack of diversity in Artificial Intelligence careers, so it is no surprise that these teams get accurate results when testing on themselves and other similar-looking individuals. This is yet another reason why more diversity in AI development is crucial.

Facial recognition is an undeniably powerful tool with many incredible uses. It helped police identify 3,000 missing children in only four days in India (Wong, CNET). Facebook uses it to identify people in photos for those who are visually impaired. Smartphones use it to unlock. But as is the case with any powerful tool, there are also negative consequences that result when facial recognition is not built and regulated well. Even if bias is fixed, more issues come into question. If every face can be identified with near-perfect accuracy, what is to stop groups from abusing this power? The Chinese government is currently using facial recognition technology to identify and detain Uighur Muslims (Johnson). How can suppressive surveillance like this be avoided? Here is some food for thought: should we even allow facial recognition bias to be fixed, in order to protect minorities and groups that may be targeted? Otherwise, how do we plan to regulate the use of this technology?  Facial recognition bias is complex when looking at the grander scheme of things. Whatever happens, one thing is for sure: regulations and reforms to facial recognition must be considered immediately, and researchers as well as the public need to be thinking more actively about bias and regulation.                      


Computer Vision  

Computer Vision is a subfield of Artificial Intelligence that aims to develop techniques which help computers understand the content of images and videos in order to make sense of what they "see." The main goal of computer vision is to "teach machines to see just like [humans] do; naming objects, identifying people, inferring 3D geometry of things, understanding relations, emotions, actions, and intentions," according to Dr. Fei-Fei Li, Co-Director of Stanford University's Human-Centered AI Institute, Co-Director of the Stanford Vision and Learning Lab, and the lead scientist behind ImageNet. Instead of focusing on improving vision algorithms, she had the insight to give the same algorithms immense sets of training data, similar to what a child acquires through years of experience. This was a major turning point in the field of Computer Vision. ImageNet has become an image database with over 15 million images in over 22,000 categories. Almost 50,000 workers from over 115 countries have contributed to this database, which provides instant access to the thousands of images needed to train algorithms. The potential of computer vision is endless, and it already has impactful applications.

Today, computer vision is applied in almost every industry, from automobiles to retail to healthcare. The most famous application of computer vision in the automotive industry is Tesla's Autopilot system. From simple lane-centering and self-parking, the system has advanced toward supporting fully self-driving cars. These improvements are possible because of the vast datasets that have been generated to train, validate, and test the algorithms alongside the vehicles themselves.

Computer vision has also been applied in the retail industry, both in physical stores and in e-commerce. Amazon Go opened its doors in January of 2019 as "a new checkout-free grocery and convenience store." Amazon refers to its technology as "Just Walk Out": consumers can pick up everything they need and simply walk out of the store, and the receipt is then sent straight to the consumer's Amazon account. Amazon has developed cameras and sensors that track individuals around the store, including how many of each item they are purchasing. The more often a consumer visits the store, the more the system learns about that user's shopping habits and history. Although in-person retail vision technology is not bulletproof, it is a major step in that direction.

On the other side of the retail spectrum, e-commerce companies like Asos are adding visual search features to their user interfaces, both to deliver better customer experiences and to increase revenue. Consumers find it easier to purchase what they see on the internet, which increases the average order value by roughly 20%. The technology also increases business because it can offer products that are visually similar to items that are out of stock. Visual search features make the back end of e-commerce more straightforward, with automatic product tagging, stock-level tracking, and best-selling-item lists in several categories.

Despite computer vision's many successful applications, it can produce detrimental predictions and statistics if trained incorrectly. Bias in computer vision is a major obstacle to the implementation and use of these applications, because the main benefit of computer vision is its high accuracy, and that accuracy is negated if bias is present. Computer vision is susceptible to bias for two major reasons: the data it is given and the interpretation of its results. In data, there can be selection bias, out-group homogeneity bias, biased data representation, or biased labels. In interpretation, there can be overgeneralization or overfitting, as well as correlation fallacy.

Selection bias occurs when the dataset that the algorithm is trained on is not a random sample. One example is identifying the wealth of various regions via satellite imagery. Satellite image selection bias is a prominent issue in such computer vision models: around 80% of satellite images from Google Maps are not of poor regions, so they do not reflect a random sample. When our model reached 84.38% accuracy using a convolutional neural network, it was essentially just guessing, since always predicting the majority class would score nearly as well.

Out-group homogeneity bias is the tendency to see out-group members as more alike than in-group members. In other words, although the dataset may correctly represent the distribution of in-group members, out-group members are lumped together. For example, a dataset may contain pictures of many different breeds of dogs while only including photos of one breed of cat; this is not a correct representation of all the types of cats.


This leads to biased data representation, where some groups are represented less frequently or less positively than others; for instance, there are fewer images of black cats, so black cats are less likely even to be recognized as cats.

Biased labels also induce incorrect identification. For instance, a man wearing a black suit and a woman wearing a white veil and dress will be labeled as a wedding photo, but a Native American man and woman wearing traditional clothing while performing a wedding ritual may only be labeled as people.

Overgeneralization is an interpretation bias, where results are formed from information that is too general, such as dogs being labeled as cats because all cats have four legs. Overfitting is a related problem in which the model fits the training set very closely but has low accuracy on data outside the training set, generalizing poorly. Correlation fallacy confuses correlation with causation, also leading to bias. For example, women began wearing jeans in the 1930s, and World War II began as the decade turned into the 1940s; confusing correlation with causation here would declare that women wearing jeans caused World War II.

Computer Vision has already made its mark in industries from healthcare to retail to autonomous vehicles, and its reach continues to grow. To ensure that we use this novel technology for social good, we must remain cautious of bias in data, by avoiding selection bias, out-group homogeneity bias, biased data representation, and biased labels, and of bias in interpretation, by avoiding overgeneralization, overfitting, and correlation fallacy.
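As a small illustration of how overfitting is usually detected in practice, the sketch below compares training and test accuracy on a synthetic dataset; scikit-learn is assumed, and the data is a stand-in rather than a real vision corpus.

```python
# Basic overfitting check: compare accuracy on the training set vs. held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained decision tree can memorize the training set.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # typically near 1.0
print("test accuracy: ", model.score(X_test, y_test))    # noticeably lower

# A large gap between the two numbers is the warning sign of overfitting
# described above: the model fits its training data but generalizes poorly.
```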


 Data Collection 

Artificially intelligent systems draw conclusions solely from the data they are fed. The more training data a system has, the better predictions it can make. Through recent research and industry efforts, such as the ImageNet project, the amount of data that can be accessed and stored has increased enormously. However, the process of obtaining large quantities of data can be ethically questionable. The central debate regarding data collection concerns user privacy. The boundaries of how much access a company should have to personal data, and whom it is allowed to share that data with, vary from firm to firm. Often, companies share their data with third-party organizations, and the people the data actually belongs to have no knowledge of or control over these transfers. Those in favor of big data mining argue that at least some privacy has to be sacrificed in order to obtain datasets large enough for accurate analyses. However, if a deep learning algorithm is given too much data about one user in particular, it could potentially start predicting the actions of that user, encroaching on their privacy. Despite helping companies and the artificially intelligent algorithms that rely on big data, data mining remains a threat to user privacy.

The issue of data security is at the forefront of technology ethics headlines today, especially when discussing voice-activated assistants like Google Home and Amazon's Alexa. These devices constantly record audio and use Natural Language Processing (NLP) to tell whether they should respond to the user. There have been several reported cases of Alexa glitching and using other users' data to respond to requests: personal data was mistakenly shared with other Alexas, and the privacy of the users was compromised. Another well-known example of privacy being compromised through data sharing is the Cambridge Analytica (CA) scandal during the 2016 U.S. elections. Facebook agreed to share data with the England-based company Cambridge Analytica, which then developed a quiz that gave it access to all of the quiz takers' friends' account data. Because of the way the quiz was structured, CA was able to gather user data without asking for direct consent. This method of data collection was against Facebook's rules, but CA collected and sold the data anyway. People were outraged at the blatant disrespect for users' privacy, and the scandal helped increase public focus on privacy protection laws.

Finding a solution to this is exceptionally difficult, because in order for a company to use and make a profit from user data, security has to be compromised to some extent. Unfortunately, while our modern systems can operate on large sets of data, they are not yet reliable at keeping that data secret. Many studies show that the American public does not trust corporations and the government to protect their data, and this distrust could be harmful to the development of new technologies. As a result of the constant problems that crop up with commercial technologies, governments have been pressured into putting legislation into place that helps protect users' data privacy. At the federal level in the U.S., the question falls under the jurisdiction of the Federal Trade Commission, the agency that enforces laws dealing with interstate commerce. The FTC has deemed that any company that does not uphold its privacy policy is taking part in "unfair and deceptive practices" and can therefore be penalized. Apart from this, there is no one policy that is regularly enforced to deal with the issue. Instead, data collection policies are sector-specific, so data in healthcare is protected differently than data in, for example, the finance sector. The definitions in these laws are often very broad, so most people do not know which data falls under which category, and the patchy legislation sometimes ends up being contradictory and incompatible between sectors. State laws have been passed specifically in response to data breaches, such as data breach notification laws. In addition to uneven protection from the law, the enforcement of these laws is not perfect either. Enforcement is difficult because unpredictable data breaches can occur and compromise security, and large companies gain power as they grow, making them even harder to limit.

Ethical data collection is one of the biggest challenges in AI. Large tech companies like Google and Amazon rely on data collection to run advertisements, a primary source of revenue. Companies sometimes choose to employ potentially unethical methods to collect data, causing a multitude of problems. Users lose control over their privacy once their data is collected, and companies are able to use and sell that data. Government regulation in the U.S. has been put into place to try to protect users and their data, but conflicting laws and minimal enforcement make it very complicated to ensure user safety. Improved data protection can only come with better laws, so political action should be taken to make sure people and the data they generate are properly secured, used, and handled.


  

Gun Violence

America is plagued by a gun violence epidemic. The Gun Violence Archive reports that in 2019 alone there were 399 mass shootings. Across all cases of gun violence that year, there were more than 38 thousand deaths, including more than 200 children (aged 0-11) and more than 750 teenagers (aged 12-17). In addition, more than 2,600 children and teenagers were injured by gun violence (Gun Violence Archive). The news of mass shootings panics and frightens the country, yet many Americans simply send "thoughts and prayers," relying on hopeful words instead of improving policy.

United States President Donald Trump has blamed mental illness for the gun violence epidemic, even though studies have shown that mental illness is not a major cause of violence. Arthur C. Evans Jr., PhD, the CEO of the American Psychological Association, has stated that "blaming mental illness… is simplistic and inaccurate and goes against the scientific evidence currently available" (Mills). President Trump has ignored the accessibility of firearms and instead stigmatizes mental illness by failing to distinguish the small number of mentally ill individuals with violent tendencies from the vast majority of individuals with mental illness who have no intention of hurting others (Ranney and Gold). Trusting the unproven belief that mental illness is a leading factor in gun violence, President Trump has proposed an artificial intelligence (AI) system that could be installed on personal electronic devices to identify mental illnesses that may lead to violence (Conley). However, using an AI system to identify and target mentally ill individuals creates many biases and privacy concerns, including the violation of the privacy of a stigmatized group of people who should not be generalized and associated with violence. So, without installing intrusive AI systems on personal devices, how else can the gun violence epidemic be addressed?

One approach to decreasing the epidemic of gun violence in America would be to utilize social media and AI as tools to predict potential acts of gun violence, rather than targeting specific groups of individuals. With billions of social media users today, social media mining has become a useful method for advertisers and marketers. The common experience of seeing a product advertised on the side of a social media page, such as Facebook, shortly after looking it up is an example of this. The advertisement comes from the mining of user information by social media companies; "social media mining occurs when a company or organization collects data about social media users and analyzes it in an effort to draw conclusions about the populations of these users" (McCourt).

While social media mining results are often used for marketing campaigns, this approach could also be used to identify potential gun violence threats, particularly in high schools and colleges, where many students are on a variety of social media platforms. For example, a social media user may look up recent events of gun violence, such as a school shooting, and then comment on a social media platform in a concerning way. Following that, they may search for people like the Parkland school shooter and "like" pages that glorify them. The individual may then venture onto and "like" hate group profiles or sites. They may go on to purchase a gun and brag about it on a social media platform. This pattern of dangerous actions can be traced through social media. While this method of surveillance cannot definitively identify a future violent individual, social media mining may help alert authorities to investigate a suspicious individual who shows concerning patterns on social media.

Social listening is another approach that may help identify suspicious individuals. While social monitoring identifies the topics people are searching for, social listening tracks specific phrases and words posted by particular individuals (Surico). These tracked phrases and words could potentially help flag a person who shows signs of violence. Social listening is a very popular tool for marketers but could just as easily be used to look for key words used by individuals who make threatening statements on the internet.


However, the interpretation of data can reflect existing prejudices. If a programmer holds a bias against or in favor of certain individuals and works alone on a program, they may implement biases that work for or against certain groups of people. This is illustrated by the stigma against mentally ill individuals that President Trump and many others have perpetuated. Despite studies showing that the vast majority of these individuals are nonviolent, mentally ill individuals have become scapegoats for the gun violence epidemic (Ranney and Gold). The tracking devices that President Trump has supported show blatant, unfounded bias against a group of people. Bias can also be introduced during data preparation and when social contexts are not considered. For example, an algorithm that predicts gun violence in a high school should be different from one that predicts gun violence in a senior home or a homeless shelter.

Another issue to consider is how law enforcement uses all of the data gathered from social media. Law enforcement uses social media to find criminal activity, evidence, location, and identification, and to publicize information, yet "[a]ccording to a 2014 LexisNexis online survey, eighty percent of federal, state, and local law enforcement professionals use social media platforms as an intelligence gathering tool, but most lack policies governing the use of social media for investigations" (Mateescu et al.). In other words, many agencies use social media without following specific policies that would support privacy rights and free speech rights. Because of this, many Americans argue over whether this use of social media impinges on First Amendment and Fourth Amendment rights.

While many law enforcement officers have manually searched through information and relied on it in the past, many agencies are looking to technology to aid with this surveillance (Mateescu et al.). However, surveillance systems such as facial recognition systems are weak in their detection accuracy, because they recognize people with lighter skin tones more reliably than people with darker skin tones. If a gun violence prediction system is adopted, agencies would have to ensure that the system is not biased, because a biased system would unfairly focus on demographics and physical features rather than on the potential for and threats of violence. If image recognition is to aid in the prevention of gun violence, the creators of the algorithm must avoid targeting people based on their appearance and instead focus on genuine threats of violence. Language may also interfere with the accuracy of such a system, as different experiences and exposures change how a social media user communicates. In addition, access to the mined data needs to be carefully controlled and monitored so that private information remains secure.

How can social media analyzers separate someone having a bad day from someone posting violent threats? While some mass shooters have posted obvious threats and manifestos on social media, threats may also be veiled. If the meaning of a possible, indecipherable "threat" is unknown, is assuming the poster will act on it an invasion of privacy or an assumption for the greater good?

One current application that is used to monitor social media is called “Mention.” The website for Mention states that the program “enables brands & agencies to monitor the web, listen to their audience and manage social media” (Mention). The application explores social media—such as Facebook, Twitter, or Instagram—and identifies every time someone talks about a user’s brand, product, or a related topic. Mention tracks and reports these insights and alerts the user when relevant conversations occur. If this method is applied to gun violence prediction, this tracking could be adapted to find phrases that have been identified by law enforcement and psychiatrists to detect individuals who may commit violent acts. 
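As a rough sketch of what this kind of tracking amounts to technically, the example below scans a handful of invented posts for tracked phrases; real tools like Mention are far more sophisticated, and this is only an illustration of the underlying idea of matching monitored phrases against a stream of posts.

```python
# Minimal "social listening" sketch: flag posts containing tracked phrases.
# The phrases and posts are invented examples about a hypothetical brand.
tracked_phrases = ["brand x", "brand x outage", "love brand x"]

posts = [
    "really impressed, love Brand X lately",
    "is anyone else seeing a Brand X outage today?",
    "unrelated post about the weather",
]

for post in posts:
    # Case-insensitive substring match against every tracked phrase.
    hits = [phrase for phrase in tracked_phrases if phrase in post.lower()]
    if hits:
        print(f"alert: matched {hits} in {post!r}")
```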

Many national leaders are not taking concrete steps to prevent gun violence, so students and scientists alike are investigating solutions to predict such occurrences of violence. Shreya Nallapati, a freshman at the University of Denver, began #NeverAgainTech when she was 17 years old, after the Parkland school shooting (Fultonberg). She and the other women involved in the group are working together to create an artificial intelligence application that can analyze the wide range of data associated with school shootings, including information about past shooters and other data related to the shootings. #NeverAgainTech has also designed an application to help treat PTSD symptoms in gun violence survivors. Furthermore, scientists from the Cincinnati Children's Hospital Medical Center have found that machine learning is able to determine risk for school violence (Barzman). This approach examines early warning signs of potential violence, and the researchers found that artificial intelligence can be as accurate as a team of child and adolescent psychiatrists in determining risk for school violence. Although the scientists have not established whether the algorithm can actually prevent violence, they have accurately identified students' potential to become violent.

Dr. Desmond Patton, an Associate Professor of Social Work at Columbia University and the director of SAFElab, developed a natural language processing tool for detecting aggression and grief in social media (McCullom). He is attempting to show how social media can be used to intervene in youth and gang violence. An approach similar to Dr. Patton's algorithm could be used to identify individuals who express a desire to engage in gun violence. Such threatening behavior could be found through an individual's use of certain language on various social media platforms, such as Twitter, Facebook, or Instagram. Certain words, word combinations, and phrases could be searched for within posts, and any findings could indicate a potential violent event.

While gun violence prediction and prevention techniques are nowhere near perfect, scientists and students are striving to put an end to gun violence through predictive methods. In December 2019, Congress agreed to spend $25 million on gun violence research, an area that had not received federal funding in more than 20 years (Schumaker). The funding was split equally between the Centers for Disease Control and Prevention and the National Institutes of Health, and the funding bill recommends that both groups study the causes and prevention of gun violence. However, Dr. Mark Rosenberg, former leader of the CDC's National Center for Injury Prevention and Control, believes that $25 million is not enough for gun violence research.

Future success in gun violence research lies in some fundamental questions. Is gun violence research funded enough? How much of their privacy are Americans willing to forfeit in order to try to stop the gun violence epidemic? How can bias be eliminated without ignoring key attributes of violent individuals? And, most importantly, is artificial intelligence accurate in predicting gun violence when it is used to investigate social media platforms? Until these questions are answered, gun violence will continue to plague America. 

 

         


Discussion   

As AI takes its inevitable place in our lives, the ethical questions it raises become more crucial than ever. The case studies on Natural Language Processing, Facial Recognition, Computer Vision, Data Collection, and Gun Violence address several of the most pressing issues that we’ve identified as nonprofit AI4ALL’s Special Interest Group of high school students around the world.  

Our Data Collection case study argues that government action is an effective way of ensuring both privacy and diversity in the data collected. The Facial Recognition case study supports this finding by highlighting the need for regulations that provide equal representation in data, as the exclusion of minorities from an AI-driven future is simply unacceptable. We cannot afford to have biased AI systems that misidentify people by skin color or gender deployed to make crucial decisions, such as detecting violent individuals, as the Gun Violence case study demonstrates. Determining the type of relief needed at a disaster-impacted location is another instance of AI's use in life-changing situations, and the Natural Language Processing study suggests a general outline for scientists to detect bias in the early stages of development and minimize it in such important applications. The Computer Vision case study discusses some of the most significant forms of bias in algorithms, emphasizing the need to take action and eliminate the harmful biases that damage the credibility and effectiveness of AI systems and prevent them from reaching their maximum potential to benefit society as a whole.

  Today, Artificial Intelligence is referred to as the “new electricity” and the “driver of the third industrial revolution.” To use AI for good and to ensure that it benefits everyone without exacerbating inequalities already existing in society, we need to eliminate bias in AI systems. The most pressing solutions for this are increasing diversity in the AI workforce and thoroughly studying the ethical implications of a system before bringing it to market. As a society, we need to work together to ask and answer why. 

            

  


Works Cited

Arcas, Blaise Aguera y. "Do Algorithms Reveal Sexual Orientation or Just Expose Our Stereotypes?" Medium, 18 Jan. 2018, medium.com/@blaisea/do-algorithms-reveal-sexual-orientation-or-just-expose-our-stereotypes-d998fafdf477.

Arren Alexander. "Computer Vision Case Study: Amazon Go." Medium, 2 Apr. 2018, medium.com/arren-alexander/computer-vision-case-study-amazon-go-db2c9450ad18.

Barzman, Drew. "Pilot Study Validates Artificial Intelligence to Help Predict School Violence." Cincinnati Children's Hospital Medical Center, 3 May 2018, www.cincinnatichildrens.org/news/release/2018/predicting-school-violence. Accessed 22 Dec. 2019.

Bielostotzky, Daniel Martinez. "Bias-Variance Tradeoff: Overfitting and Underfitting." Medium, 2 Oct. 2018, medium.com/@martinezbielosdaniel/bias-variance-tradeoff-overfitting-and-underfitting-c63799cb4851.

Buolamwini, Joy. "Response: Racial and Gender Bias in Amazon Rekognition - Commercial AI System for Analyzing Faces." Medium, 24 Apr. 2019, medium.com/@Joy.Buolamwini/response-racial-and-gender-bias-in-amazon-rekognition-commercial-ai-system-for-analyzing-faces-a289222eeced.

Buolamwini, Joy. Images and the percent error for different genders and races when AI systems were told to classify the pictures as male or female. towardsdatascience.com/https-medium-com-mauriziosantamicone-is-artificial-intelligence-racist-66ea8f67c7de. (4)

Cadwalladr, Carole, and Emma Graham-Harrison. "Revealed: 50 Million Facebook Profiles Harvested for Cambridge Analytica in Major Data Breach." The Guardian, Guardian News & Media, 17 Mar. 2018, www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election. Accessed 22 Dec. 2019.

Chang, Alvin. "The Facebook and Cambridge Analytica Scandal, Explained with a Simple Diagram." Vox, 2 May 2018, www.vox.com/policy-and-politics/2018/3/23/17151916/facebook-cambridge-analytica-trump-diagram.

Conley, Julia. "Trump Mulling 'Uniquely Dystopian' Proposal to Use AI to Identify Mental Health Issues as Risk Factors for Gun Violence." Common Dreams, 23 Aug. 2019, www.commondreams.org/news/2019/08/23/trump-mulling-uniquely-dystopian-proposal-use-ai-identify-mental-health-issues-risk. Accessed 22 Dec. 2019.

Fultonberg, Lorne. "With AI, DU Freshman Looks to End Mass Shootings." University of Denver, 17 Apr. 2019, www.du.edu/news/ai-du-freshman-looks-end-mass-shootings. Accessed 22 Dec. 2019.

Garbade, Michael J. "A Simple Introduction to Natural Language Processing." Becoming Human: Artificial Intelligence Magazine, Medium, 15 Oct. 2018, becominghuman.ai/a-simple-introduction-to-natural-language-processing-ea66a1747b32.

Grojean, Andrew. "GDPR, AI and Machine Learning in the Age of Data Privacy." Intouch Solutions, 6 Dec. 2018, www.intouchsol.com/wp-content/uploads/Blog/PDFs/IntouchPOV_GDPRAIandMachineLearningintheAgeofDataPrivacy.pdf.

Gun Violence Archive 2019. Gun Violence Archive, 21 Dec. 2019, www.gunviolencearchive.org/. Accessed 21 Dec. 2019.

Henry J. Kaiser Family Foundation. "Gun Violence Makes U.S. an Outlier, Not Mental Illness." 9 Aug. 2019, www.kff.org/other/slide/gun-violence-makes-u-s-an-outlier-not-mental-illness/. (1)

Johnson, Khari. "AI Ethics Is All about Power." VentureBeat, 15 Nov. 2019, venturebeat.com/2019/11/11/ai-ethics-is-all-about-power/.

Lesiv, et al. "Characterizing the Spatial and Temporal Availability of Very High Resolution Satellite Imagery in Google Earth and Microsoft Bing Maps as a Source of Reference Data." MDPI, Multidisciplinary Digital Publishing Institute, 11 Oct. 2018, www.mdpi.com/2073-445X/7/4/118/htm.

Li, Fei-Fei. "How We're Teaching Computers to Understand Pictures." TED, www.ted.com/talks/fei_fei_li_how_we_re_teaching_computers_to_understand_pictures?language=en.

Lynskey, Dorian. "'Alexa, Are You Invading My Privacy?' – the Dark Side of Our Voice Assistants." The Guardian, Guardian News & Media, 9 Oct. 2019, www.theguardian.com/technology/2019/oct/09/alexa-are-you-invading-my-privacy-the-dark-side-of-our-voice-assistants.

Mateescu, Alexandra, et al. "Data & Civil Rights: Social Media Surveillance and Law Enforcement." Data & Society, 27 Oct. 2019, datasociety.net/output/data-civil-rights-social-media-surveillance-and-law-enforcement/. Accessed 22 Dec. 2019.

McCourt, Abby. "Social Media Mining: The Effects of Big Data in the Age of Social Media." Yale Law School, 23 Apr. 2019, law.yale.edu/mfia/case-disclosed/social-media-mining-effects-big-data-age-social-media. Accessed 22 Dec. 2019.

McCullom, Rod. "A Murdered Teen, Two Million Tweets and an Experiment to Fight Gun Violence." Nature, Springer Nature, 4 Sept. 2018, www.nature.com/articles/d41586-018-06169-8. Accessed 22 Dec. 2019.

"Memorizing Is Not Learning! - 6 Tricks to Prevent Overfitting in Machine Learning." Hacker Noon, hackernoon.com/memorizing-is-not-learning-6-tricks-to-prevent-overfitting-in-machine-learning-820b091dc42.

Mention. mention.com/en/. Accessed 22 Dec. 2019.

Mills, Kim I. "Statement of APA CEO on Gun Violence and Mental Health." American Psychological Association, 5 Aug. 2019, www.apa.org/news/press/releases/2019/08/gun-violence-mental-health. Accessed 21 Dec. 2019.

Nieva, Richard. "Google Apologizes for Algorithm Mistakenly Calling Black People 'Gorillas'." CNET, 2 July 2015, www.cnet.com/news/google-apologizes-for-algorithm-mistakenly-calling-black-people-gorillas/.

NiSSI. "Social Networking Strengthens Relationships." NiSSI, 20 Apr. 2016, nissiphilippines.wordpress.com/2015/11/04/social-networking-strengthens-relationship/.

N-gram networks are used to find connotations of phrases and words. The blue bars anticipate a positive connotation when two words are placed together, while red bars anticipate a negative connotation. uc-r.github.io/word_relationships#ngram. (2)

Peregud, Irina. "How Visual Search Engines Are Disrupting the Retail Industry." InData Labs, 2 Aug. 2019, indatalabs.com/blog/visual-search-disrupting-retail-industry.

Peregud, Irina. "The Most Exciting Applications of Computer Vision across Industries." InData Labs, 31 July 2019, indatalabs.com/blog/applications-computer-vision-across-industries?cli_action=1577141109.918.

Pew Research Center. Majority of Americans feel as if they have little control over data collected about them by companies and the government. 15 Nov. 2019, www.pewresearch.org/internet/2019/11/15/americans-and-privacy-concerned-confused-and-feeling-lack-of-control-over-their-personal-information/. (3)

Ranney, Megan L., and Jessica Gold. "The Dangers of Linking Gun Violence and Mental Illness." Time, 7 Aug. 2019, time.com/5645747/gun-violence-mental-illness/. Accessed 22 Dec. 2019.

"Reforming the U.S. Approach to Data Protection and Privacy." Council on Foreign Relations, 30 Jan. 2018, www.cfr.org/report/reforming-us-approach-data-protection.

"Rethinking Privacy for the AI Era." Forbes, Forbes Media, 27 Mar. 2019, www.forbes.com/sites/insights-intelai/2019/05/22/welcome-from-forbes-to-a-special-exploration-of-ai-issue-6/#241247f74650. Accessed 21 Dec. 2019.

Schumaker, Erin. "Congress Agrees on Historic Deal to Fund $25 Million in Gun Violence Research." ABC News, ABC News Internet Ventures, 16 Dec. 2019, abcnews.go.com/Health/congress-approves-unprecedented-25-million-gun-violence-research/story?id=67762555. Accessed 22 Dec. 2019.

Shreya Nallapati speaks at the 2019 Global Teen Leader Conference in New York. 17 Apr. 2019, www.du.edu/news/ai-du-freshman-looks-end-mass-shootings.

Smith, Aaron. "Americans and Cybersecurity." Pew Research Center, 26 Jan. 2017, www.pewresearch.org/internet/2017/01/26/americans-and-cybersecurity/.

Surico, Kimberly. "Social Listening: What, Why & How." NetBase Solutions, 15 July 2019, www.netbase.com/blog/what-is-social-listening-why-is-it-important/. Accessed 22 Dec. 2019.

Synced. "SenseTime Trains ImageNet/AlexNet in Record 1.5 Minutes." Synced, 22 Feb. 2019, syncedreview.com/2019/02/25/sensetime-trains-imagenet-alexnet-in-record-1-5-minutes/.

Tch, Andrew. "The Mostly Complete Chart of Neural Networks, Explained." Towards Data Science, Medium, 4 Aug. 2017, towardsdatascience.com/the-mostly-complete-chart-of-neural-networks-explained-3fb6f2367464.

"US Federal Trade Commission Vector Logo." Worldvectorlogo, worldvectorlogo.com/logo/us-federal-trade-commission.

Wong, Queenie. "Why Facial Recognition's Racial Bias Problem Is so Hard to Crack." CNET, www.cnet.com/news/why-facial-recognitions-racial-bias-problem-is-so-hard-to-crack/.