Abstract:
Training deep neural networks, i.e. deep learning, for computer vision usually requires a vast amount of training data.
In real-world scenarios, such a vast amount (at least 10,000 and up to 1 million samples) of relevant training data is not easily available.
Consequently, vanilla deep learning is not applicable for many small data scenarios.
Methods like transfer learning and few-shot learning tackle this challenge by adapting neural networks to a new task with only a few or even a single sample.
The goal of this seminar is to understand and summarize such methods.
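One common few-shot approach the seminar may touch on is classification by nearest class prototype, as in prototypical networks. The sketch below is purely illustrative: the "feature extractor" is a fixed random projection standing in for a pretrained network, and all samples are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained feature extractor: a fixed random linear map.
# In practice this would be a deep network trained on a large dataset.
W = rng.standard_normal((8, 4))

def embed(x):
    return x @ W

# One labeled "support" sample per class (the one-shot setting).
support = {
    0: np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
    1: np.array([0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]),
}
prototypes = {c: embed(x) for c, x in support.items()}

def classify(query):
    # Assign the query to the class whose prototype is nearest in embedding space.
    return min(prototypes, key=lambda c: np.linalg.norm(embed(query) - prototypes[c]))

# A query close to class 0's support sample is assigned class 0.
print(classify(np.array([0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])))
```

Note that no gradient-based training on the new task is needed: adapting to new classes only requires embedding one support sample per class, which is what makes this style of method attractive in small-data scenarios.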
Starting Papers:
Abstract:
Explainable Artificial Intelligence (XAI) is crucial in the realm of AI, especially as deep neural networks, whose internal mechanisms are often incomprehensible to humans, become more prevalent. Traditional XAI methods, widely implemented, primarily address unimodal problems. However, the landscape is shifting. In recent years, advancements in deep learning (DL) have enabled models to harness multimodal data, integrating diverse information types for more nuanced decision-making. This seminar aims to provide a comprehensive overview of the latest XAI strategies tailored for tackling these complex, multimodal problems.
Starting Papers:
Abstract:
Multimodal Learning in AI focuses on building models that can process and synthesize inputs from diverse data sources. This approach utilizes various fusion methods, each offering distinct benefits. Early Fusion integrates different modalities at the outset, preparing a unified dataset for the AI model. In contrast, Intermediate Fusion, which is central to this seminar, blends modalities within a Deep Learning (DL) network’s intermediate layer, enhancing feature extraction and analysis. Late Fusion, diverging from the previous strategies, combines outputs from distinct models, each trained on different data types, during the final decision-making phase. This seminar will delve into these fusion techniques, with a special emphasis on advanced intermediate fusion methods, to reveal how they enable AI models to utilize multimodal data more effectively for refined and accurate decision-making.
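The intermediate-fusion idea described above can be sketched in a few lines of numpy. All dimensions, weights, and inputs below are invented for illustration; a real model would use trained deep encoders rather than random linear layers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy inputs for two modalities (e.g., image features and text features).
x_img = rng.standard_normal(16)
x_txt = rng.standard_normal(8)

def relu(z):
    return np.maximum(z, 0.0)

# Modality-specific encoders (one hidden layer each).
W_img = rng.standard_normal((16, 6))
W_txt = rng.standard_normal((8, 6))
h_img = relu(x_img @ W_img)   # intermediate representation of modality 1
h_txt = relu(x_txt @ W_txt)   # intermediate representation of modality 2

# Intermediate fusion: concatenate the hidden representations and feed
# them to a shared head that produces the final prediction.
h_fused = np.concatenate([h_img, h_txt])   # shape (12,)
W_head = rng.standard_normal((12, 3))
logits = h_fused @ W_head                  # e.g. a 3-class output
print(logits.shape)  # (3,)
```

Early fusion would instead concatenate `x_img` and `x_txt` before any encoder, while late fusion would run two complete models and combine their output predictions; intermediate fusion sits between the two, merging learned features mid-network.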
Starting Papers:
Abstract:
In reinforcement learning (RL), an agent learns the optimal sequence of actions through trial and error in an unknown environment. In this context, goal-conditioned hierarchical RL is an approach that segments complex learning tasks into smaller and distinct modules. It integrates a hierarchical decomposition with goal-directed learning strategies and operates on a multi-tier system: at higher levels, policies are formulated to set intermediate sub-goals that are intrinsically aligned with the overarching objective of the task. At lower levels, separate policies are trained to accomplish these sub-goals, based on the immediate state and goal information. This seminar is supposed to give an overview of recent advancements and challenges in the field.
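The multi-tier structure can be illustrated with a deliberately simple example. In the sketch below, the environment (a 1-D corridor), the sub-goal horizon, and both policies are hand-coded for illustration; in actual goal-conditioned hierarchical RL, both levels would be learned from reward through trial and error.

```python
# Goal-conditioned hierarchical control on a 1-D corridor (illustrative only).

FINAL_GOAL = 10  # overarching task objective: reach cell 10 starting from cell 0

def high_level_policy(state):
    # Higher level: set an intermediate sub-goal a few cells ahead,
    # aligned with the overarching objective.
    return min(state + 3, FINAL_GOAL)

def low_level_policy(state, subgoal):
    # Lower level: act on the immediate state and goal information
    # (here: a greedy step toward the current sub-goal).
    return 1 if subgoal > state else -1 if subgoal < state else 0

state, trajectory = 0, [0]
while state != FINAL_GOAL:
    subgoal = high_level_policy(state)     # high level proposes a sub-goal
    while state != subgoal:                # low level pursues it step by step
        state += low_level_policy(state, subgoal)
        trajectory.append(state)

print(trajectory)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The decomposition is visible in the nested loops: the outer loop is the high-level policy choosing sub-goals, the inner loop is the low-level policy solving each short-horizon sub-task.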
Starting Papers:
Abstract:
In the rapidly evolving field of artificial intelligence, Transformer models have emerged as a groundbreaking innovation, offering unprecedented advancements in natural language processing and beyond. This seminar delves into the intricate world of Transformer models, providing a comprehensive overview of their design choices and the profound effects these decisions have on their performance.
The seminar will begin with an exploration of the key design elements that define Transformer models. We will discuss the architecture, including the role of attention mechanisms, layer configurations, and embedding techniques, and how these choices influence the model's ability to process and interpret complex data. This segment aims to shed light on the rationale behind various design decisions and their implications for model efficiency and effectiveness.
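The attention mechanism at the heart of these architectures can be stated compactly as Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. The numpy sketch below shows single-head scaled dot-product attention; the shapes and inputs are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # pairwise query-key similarities, scaled
    weights = softmax(scores, axis=-1)  # each row is a distribution over keys
    return weights @ V, weights         # weighted average of the value vectors

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 query positions, head dimension 8
K = rng.standard_normal((6, 8))   # 6 key positions
V = rng.standard_normal((6, 8))

out, w = attention(Q, K, V)
print(out.shape)                        # (4, 8): one output vector per query
print(np.allclose(w.sum(axis=1), 1.0))  # True: attention weights are normalized
```

Design choices such as the number of heads, the scaling factor, and where layer normalization is placed all modify this basic computation, which is one reason seemingly small architectural decisions have measurable effects on performance.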
The final segment of the seminar will focus on the limitations and challenges faced by Transformer models. We will identify and analyze the mechanisms that lead to the failure of these models, exploring areas where they struggle to learn and adapt. This will include a discussion on issues related to data dependency, biases in training datasets, and the challenges in handling ambiguous or novel scenarios. By understanding these pitfalls, we aim to provide a roadmap for future research and development in the field.