Appendix B. Empirical demonstration of deterministic overconfidence
To complement the theoretical argument for deterministic overconfidence in Appendix A, we compared the total entropy produced by each modeling technique. MC dropout and deep ensembles with MC dropout both yielded higher total entropy than the deterministic model. This holds for both acceptable and unacceptable sentences, and also for the adversarially generated dataset. The charts below show the total entropy comparison.
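As an illustration of the quantity being compared, the sketch below computes total entropy as the entropy of the mean predictive distribution over stochastic forward passes. This is the common definition and is assumed here to match the one used in Appendix A. The function name `total_entropy`, the array layout, and the softmax values are hypothetical and chosen only for illustration; they are not results from our experiments.

```python
import numpy as np

def total_entropy(probs, eps=1e-12):
    """Entropy (in nats) of the mean predictive distribution.

    probs: array of shape (T, N, C), i.e. T forward passes,
    N sentences, C classes. A deterministic model has T = 1.
    """
    probs = np.asarray(probs, dtype=float)
    mean_probs = probs.mean(axis=0)  # average over passes -> (N, C)
    # eps guards against log(0) for fully confident predictions
    return -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)  # (N,)

# Made-up softmax outputs for one sentence and two classes
# (acceptable / unacceptable). These are NOT experimental values.
deterministic = np.array([[[0.95, 0.05]]])   # T = 1
mc_dropout = np.array([[[0.95, 0.05]],
                       [[0.70, 0.30]],
                       [[0.85, 0.15]]])      # T = 3 dropout passes

h_det = total_entropy(deterministic)
h_mc = total_entropy(mc_dropout)
print("deterministic:", h_det, "MC dropout:", h_mc)
```

In this toy case the dropout passes disagree, so the averaged distribution is less peaked and its entropy is higher than that of the single deterministic pass. The same function would apply to a deep ensemble with MC dropout by stacking the passes from all ensemble members along the first axis.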