EFFECT OF COGNITIVE BIASES ON HUMAN UNDERSTANDING OF RULE-BASED MACHINE LEARNING MODELS
This thesis investigates to what extent cognitive biases affect human understanding of interpretable machine learning models, in particular of rules discovered from data. Twenty cognitive biases (illusions, effects) are analysed in detail, including the identification of possibly effective debiasing techniques that can be adopted by designers of machine learning algorithms and software. This qualitative research is complemented by experiments aimed at verifying whether, and to what extent, selected cognitive biases influence human understanding of actual rule learning results. Two experiments were performed: the first focused on eliciting plausibility judgments for pairs of inductively learned rules, and the second involved a replication of the Linda experiment with crowdsourcing, together with two modifications of it. Altogether, nearly 3,000 human judgments were collected. We obtained empirical evidence for the insensitivity to sample size effect. There is also limited evidence for the disjunction fallacy, misunderstanding of "and", the weak evidence effect and the availability heuristic. While there seems to be no universal approach for eliminating all the identified cognitive biases, it follows from our analysis that the effect of many biases can be ameliorated by making rule-based models more concise. To this end, in the second part of the thesis we propose a novel machine learning framework which postprocesses the rules output by the seminal association rule classification algorithm CBA [Liu et al, 1998]. The framework uses the original undiscretized numerical attributes to optimize the discovered association rules, refining the boundaries of literals in the antecedent of the rules produced by CBA. Some rules, as well as literals from the rules, can consequently be removed, which makes the resulting classifier smaller. A benchmark of our approach on 22 UCI datasets shows an average 53% decrease in the total size of the model, as measured by the total number of conditions in all rules. Model accuracy remains at the same level as for CBA.
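
To make the size measure and the idea of boundary refinement concrete, the following is a minimal Python sketch. It is not the thesis's implementation. The `Rule` class, the `trim` heuristic (shrinking each interval to the values of correctly classified covered instances) and the toy data are all assumptions introduced here for illustration. Only the size measure, the total number of conditions in all rules, is taken from the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # closed interval [low, high]; hypothetical representation


@dataclass
class Rule:
    antecedent: Dict[str, Interval]  # attribute name -> interval literal
    consequent: str                  # predicted class label


def covers(rule: Rule, row: Dict[str, float]) -> bool:
    """True if every interval literal in the antecedent matches the row."""
    return all(lo <= row[a] <= hi for a, (lo, hi) in rule.antecedent.items())


def trim(rule: Rule, rows: List[Dict[str, float]], labels: List[str]) -> Rule:
    """Assumed refinement step: shrink each interval to the range of the
    undiscretized values of covered rows that the rule classifies correctly."""
    hits = [r for r, y in zip(rows, labels)
            if covers(rule, r) and y == rule.consequent]
    if not hits:
        return rule  # nothing to refine against
    refined = {a: (min(r[a] for r in hits), max(r[a] for r in hits))
               for a in rule.antecedent}
    return Rule(refined, rule.consequent)


def model_size(rules: List[Rule]) -> int:
    """Total number of conditions (literals) in all rules."""
    return sum(len(r.antecedent) for r in rules)


# Toy data, invented for this example.
rows = [
    {"temp": 21.5, "hum": 40.0},
    {"temp": 24.0, "hum": 55.0},
    {"temp": 29.0, "hum": 58.0},
    {"temp": 35.0, "hum": 90.0},
]
labels = ["yes", "yes", "no", "no"]

rule = Rule({"temp": (20.0, 30.0), "hum": (30.0, 60.0)}, "yes")
trimmed = trim(rule, rows, labels)
size = model_size([rule, Rule({"temp": (30.0, 40.0)}, "no")])
```

On this toy data, `trimmed` narrows the `temp` literal to the interval 21.5 to 24.0 and the `hum` literal to 40.0 to 55.0, and `size` counts three conditions across the two rules. Removing a literal or a whole rule lowers that count directly, which is the sense in which the abstract reports a smaller model.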