During the last decade, traditional data-driven deep learning (DL) has shown remarkable success in essential natural language processing tasks, such as relation extraction. Yet, challenges remain in developing artificial intelligence (AI) methods in real-world cases that require explainability through human interpretable and traceable outcomes. The scarcity of labeled data for downstream supervised tasks and entangled embeddings produced as an outcome of self-supervised pre-training objectives also hinders interpretability and explainability. Additionally, data labeling in multiple unstructured domains, particularly healthcare and education, is computationally expensive as it requires a pool of human expertise. Consider Education Technology, where AI systems fall along a “capability spectrum” depending on how extensively they exploit various resources, such as academic content, granularity in student engagement, academic domain experts, and knowledge bases to identify concepts that would help achieve knowledge mastery for student goals. Likewise, the task of assessing human health using online conversations raises challenges for current statistical DL methods through evolving cultural and context-specific discussions. Hence, developing strategies that merge AI with stratified knowledge to identify concepts that would delineate healthcare conversations patterns and help healthcare professionals decide. Such technological innovations are imperative as they provide consistency and explainability in outcomes. This tutorial discusses the notion of explainability and interpretability through the use of knowledge graphs in (1) Healthcare on the Web, (2) Education Technology. This tutorial will provide details of knowledge-infused learning algorithms and its contribution to explainability for the above two applications that can be applied to any other domain using knowledge graphs.
For more information, visit https://aiisc.ai/xaikg/
Recent advances in statistical and data-driven deep learning demonstrate significant success in natural language understanding without using prior knowledge, especially in structured and generic domains, where data is abundant. On the other hand, in text processing problems that are dynamic and impact the society at large, existing data-dependent, state-of-the-art deep learning methods remain vulnerable to veracity considerations and especially, high volume that masks small, emergent signals. Statistical natural language processing methods have shown poor performance in capturing: (1) Human well being online especially in evolving events (e.g. mental health communications on Reddit, Twitter), (2) Culture and context specific discussion on the web (e.g. humor detection, extremism on social media), (3) Social Network Analysis (help-seeker and care-provider) during pandemic or disaster scenarios, and (4) Explainable methods of learning that drive technological innovations and inventions for community betterment. In such social hypertext, leveraging the semantic-web concept of knowledge graphs is a promising approach to the enhancement of deep learning and natural language processing.
According to Piagetian human learning theory, the activation of existing schema guides the apprehension of experience to support the generation of context sensitive responses. Activating prior knowledge connects current and past experience for identifying relations, supporting explanation, reducing ambiguity, structuring new knowledge, and application to novel materials. Further, human learning does not necessarily rely on large amounts of (annotated) cases to proceed. Because prior knowledge is so powerful in human learning, its incorporation at various levels of abstraction in deep learning could benefit outcomes. Example the desiderata include compensating for data limitations, improving inductive bias, generating explainable outcomes and enabling trust. These are particularly useful for data-limited but otherwise complex, evolving problems in domains such as mental healthcare, online social threats and epidemic/pandemic.
Despite the general agreement that structured prior knowledge and tacit knowledge (the inferred outcome of a model) resulting from deep learning should be combined, there has been little progress. Recent debates on Neuro-Symbolic AI , the inclusion of innate priors in deep learning, and AI fireside chat have identified knowledge-infused learning to improve explainability, interpretability, and trust in AI systems.
In this tutorial, we take use cases from the aforementioned two social good applications (Mental Health, Radicalization) and multimodal aspects of social media (e.g. scene understanding from images, video and text (hypermedia/hypertext) often found in documentation of critical events to explore the modern aspect of hypertext using semantic web in the form of Knowledge Graphs (KG). Specifically, the tutorial will provide a detailed walkthrough on Knowledge Graphs and their utility in developing knowledge-infusion techniques for interpretable and explainable learning for text, video, images, and graphical data on the web with the following agenda: Motivate the novel paradigm of knowledge-infused learning using computational learning and cognitive theories. Describe the different forms of knowledge, methods of automatic modeling of KG, and infusion methods in deep/machine learning. Discuss application-specific evaluation methods specifically for explainability and reasoning using benchmark datasets and knowledge-resources that show promise in advancing the capabilities of deep learning. Future directions of KGs and robust learning for the Web and Society.
In today's data-driven world, organizations derive insights from massive amounts of data through large scale statistical machine learning models. However, statistical techniques can be easy to fool with adversarial instances (a neural network can predict a non-extremist as an extremist by mere presence of the word Jihad), which raises question in Data Quality. In high stakes decision making problems, such as cyber social threats, it is highly sensitive to classify a non-extremist as an extremist and vice-versa. Data quality is good if the data possesses adequate domain coverage and the labels contain adequate semantics. For example, is the semantics of an extremist vs. non-extremist vis-a-vis the word Jihad captured in the label (adequate semantics in labels)? Also, are there enough non-extremists with the word Jihad in the training data from the perspective of religion, hate, or ideology? Thus semantic annotation of the data, beyond mere labels attached to data instances, can significantly improve the robustness of model outcomes and ensure that the model has learned from trustworthy, knowledge-guided data standards. It is important to note that the knowledge-guided standards help de-bias the data if specified correctly (contextualized de-biasing extremist behavior data from bias towards the word Jihad). Therefore, in addition to trust in the robustness of outcomes, knowledge guided data creation also enables fair and ethical practices during real-world deployment of machine learning in high stakes decision making. We denote such data as Explainable Data. In this tutorial of type course and case-studies, we detail how to construct Explainable Data using various expert resources and knowledge graphs. All the materials (resources and implementations) presented during the tutorial will be made available on: KIWO-ICWSM, a week before the tutorial. We plan a 90 minute tutorial (Intermediate Level) with 2 breaks (5 mins each).