Gender Discrimination in Data Analysis: a Socio-Technical Approach - INRIA - Institut National de Recherche en Informatique et en Automatique
Student thesis, Year: 2021

Gender Discrimination in Data Analysis: a Socio-Technical Approach

Abstract

Technology shapes and facilitates our daily lives, but its pervasive use can introduce or exacerbate social problems. Because of their intrinsic complexity, these problems need to be addressed from different but complementary perspectives, which two disciplines of very different nature provide: data science and sociology. Specifically, this thesis aims to bridge the technical field of data analysis and a specific category of social problems, namely discrimination and, in particular, gender discrimination. To this end, we adopt an approach that takes data analysis as its starting point and finds in sociology both a useful supporting instrument and a source of requirements. We investigate in depth the sociological reasons behind gender discrimination in the society of interest, that of the United States, introducing and exploring what is commonly referred to as the ‘gender gap’. We then carry out several experiments on data about U.S. employees, focusing on the economic perspective (the gender pay gap) while taking the other facets of the problem into account. The main contributions of this thesis derive from applying preprocessing techniques and using tools designed to detect bias in data. With these, we try to understand which design choices have the greatest impact on the so-called ‘fairness’ of the results, and we highlight the strengths and weaknesses of the techniques and tools. We emphasize the importance of a multidisciplinary approach to problems of this kind, which is essential for obtaining information about the complex context in which data are embedded.
Main file
thesis.pdf (3.32 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03374130, version 1 (11-10-2021)

Identifiers

  • HAL Id: hal-03374130, version 1

Cite

Riccardo Corona. Gender Discrimination in Data Analysis: a Socio-Technical Approach. Databases [cs.DB]. 2021. ⟨hal-03374130⟩
68 views
570 downloads
