1. Literature Topic: Explainability of LLMs
- Yao et al., Tree of Thoughts: Deliberate Problem Solving with Large Language Models
- Turpin et al., Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
- Lanham et al., Measuring Faithfulness in Chain-of-Thought Reasoning
- Radhakrishnan et al., Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
2. Literature Topic: Efficiency of LLMs
- Lee et al., Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research
- Touvron et al., LLaMA: Open and Efficient Foundation Language Models
- Dettmers et al., QLoRA: Efficient Finetuning of Quantized LLMs
- Hsieh et al., Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
- Gu et al., Knowledge Distillation of Large Language Models
3. Literature Topic: Agent-Based Modeling via LLMs
- Park et al., Generative Agents: Interactive Simulacra of Human Behavior
- Li et al., CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society
- Boiko et al., Emergent autonomous scientific research capabilities of large language models
- Zhuge et al., Mindstorms in Natural Language-Based Societies of Mind
- Wang et al., Interactive Natural Language Processing
4. Literature Topic: LLMs for the Social Sciences
- Ziems et al., Can Large Language Models Transform Computational Social Science?
- Feng et al., From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
- Hartmann et al., The political ideology of conversational AI: Converging evidence on ChatGPT's pro-environmental, left-libertarian orientation
5. Literature Topic: Limitations of LLMs
- Frieder et al., Mathematical Capabilities of ChatGPT
- Borji, A Categorical Archive of ChatGPT Failures
- Wang et al., Large Language Models are not Fair Evaluators
- Schick et al., Toolformer: Language Models Can Teach Themselves to Use Tools
- Bang et al., A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
6. Literature Topic: LLMs for Education and Science
- Baidoo-Anu et al., Education in the Era of Generative Artificial Intelligence (AI): Understanding the Potential Benefits of ChatGPT in Promoting Teaching and Learning
- Choi et al., ChatGPT Goes to Law School
- Boiko et al., Emergent autonomous scientific research capabilities of large language models
- Meyer et al., ChatGPT and large language models in academia: opportunities and challenges
7. Literature Topic: Multimodality and LLMs
- Liu et al., Visual instruction tuning
- Zhang et al., Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
- Dai et al., InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
8. Experimental Topic: Chain-of-Thought Prompting
- Wei, Jason, et al. “Chain-of-thought prompting elicits reasoning in large language models.” Advances in Neural Information Processing Systems 35 (2022): 24824-24837.
- Kojima, Takeshi, et al. “Large language models are zero-shot reasoners.” Advances in neural information processing systems 35 (2022): 22199-22213.
- Zhang, Zhuosheng, et al. “Automatic chain of thought prompting in large language models.” arXiv preprint arXiv:2210.03493 (2022).
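The two prompting styles compared in these papers can be sketched as prompt templates. This is a minimal illustration only; `few_shot_cot_prompt` and `zero_shot_cot_prompt` are hypothetical helper names, and no model is actually called — the strings are what would be sent to an LLM.

```python
# Sketch of the two chain-of-thought prompt styles studied above.

def few_shot_cot_prompt(question: str) -> str:
    """Few-shot CoT (Wei et al.): demonstrations include worked reasoning."""
    demo = (
        "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
        "How many balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
        "5 + 6 = 11. The answer is 11.\n\n"
    )
    return demo + f"Q: {question}\nA:"

def zero_shot_cot_prompt(question: str) -> str:
    """Zero-shot CoT (Kojima et al.): a single trigger phrase, no demos."""
    return f"Q: {question}\nA: Let's think step by step."

print(zero_shot_cot_prompt("What is 12 * 7?"))
```

Zhang et al. automate the few-shot variant by clustering questions and generating the demonstrations themselves via the zero-shot trigger.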
9. Experimental Topic: Knowledge Generation Prompting
- Liu, Jiacheng, et al. “Generated knowledge prompting for commonsense reasoning.” arXiv preprint arXiv:2110.08387 (2021).
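The technique is a two-stage pipeline: first elicit relevant facts from the model, then condition the answer on them. A minimal sketch, where `llm` is a hypothetical stand-in (here a canned placeholder so the example runs):

```python
# Sketch of generated knowledge prompting (Liu et al.): stage 1 asks the
# model for relevant facts, stage 2 conditions the answer on those facts.

def llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a language model here.
    return "A fox is a wild animal; pets are domesticated animals."

def generated_knowledge_answer(question: str, n_knowledge: int = 2) -> str:
    # Stage 1: sample knowledge statements about the question.
    knowledge = [llm(f"Generate a fact relevant to: {question}")
                 for _ in range(n_knowledge)]
    # Stage 2: prepend the generated knowledge to the actual question.
    context = "\n".join(f"Knowledge: {k}" for k in knowledge)
    return llm(f"{context}\nQuestion: {question}\nAnswer:")
```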
10. Experimental Topic: Tree of Thoughts Prompting
- Yao, Shunyu, et al. “Tree of thoughts: Deliberate problem solving with large language models.” arXiv preprint arXiv:2305.10601 (2023).
- Long, Jieyi. “Large Language Model Guided Tree-of-Thought.” arXiv preprint arXiv:2305.08291 (2023).
- Besta, Maciej, et al. “Graph of Thoughts: Solving Elaborate Problems with Large Language Models.” arXiv preprint arXiv:2308.09687 (2023).
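The common core of these papers is a search over partial solutions rather than a single left-to-right decode. A minimal BFS-style sketch of the Tree-of-Thoughts loop, where `propose` and `score` are hypothetical stand-ins for the LLM calls (here toy placeholders so the example runs):

```python
# Minimal sketch of the Tree-of-Thoughts search loop (Yao et al.):
# expand each partial solution into candidate next "thoughts", score
# them, and keep the best `beam` states per step.

def propose(state: str) -> list[str]:
    # Placeholder: an LLM would generate candidate next reasoning steps.
    return [state + "a", state + "b", state + "c"]

def score(state: str) -> float:
    # Placeholder: an LLM (or heuristic) would rate the partial solution.
    return float(len(state))

def tree_of_thoughts(root: str, depth: int = 3, beam: int = 2) -> str:
    frontier = [root]
    for _ in range(depth):
        candidates = [s for state in frontier for s in propose(state)]
        # Keep the `beam` highest-scoring partial solutions (BFS variant).
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return max(frontier, key=score)
```

Graph of Thoughts (Besta et al.) generalizes the same loop by letting states merge and loop back, i.e. the frontier forms an arbitrary graph instead of a tree.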
11. Experimental Topic: Plan-and-Solve Prompting
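The usual reference for this topic is Wang et al., “Plan-and-Solve Prompting” (arXiv:2305.04091, 2023). The idea replaces the zero-shot CoT trigger with one that first asks for an explicit plan; a minimal sketch (trigger wording paraphrased from the paper, `plan_and_solve_prompt` is a hypothetical helper name):

```python
# Sketch of a plan-and-solve style zero-shot prompt: instead of
# "Let's think step by step", the trigger first asks for a plan,
# then for its step-by-step execution.

PLAN_AND_SOLVE_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve it. "
    "Then, let's carry out the plan and solve the problem step by step."
)

def plan_and_solve_prompt(question: str) -> str:
    return f"Q: {question}\nA: {PLAN_AND_SOLVE_TRIGGER}"
```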
12. Experimental Topic: Automatic Prompt Engineering
- Zhou, Yongchao, et al. “Large language models are human-level prompt engineers.” arXiv preprint arXiv:2211.01910 (2022).
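The APE selection step can be sketched as: generate candidate instructions, score each on a small labelled dev set, keep the best. `llm` is a hypothetical stand-in; here it simply echoes the input field so the example runs without a model:

```python
# Sketch of the APE selection loop (Zhou et al.): score candidate
# instructions by dev-set accuracy and pick the winner.

def llm(prompt: str) -> str:
    # Placeholder for a real model call: echoes the "Input:" field.
    return prompt.rsplit("Input: ", 1)[-1].split("\n")[0]

def score_instruction(instruction: str, dev_set: list[tuple[str, str]]) -> float:
    hits = sum(llm(f"{instruction}\nInput: {x}\nOutput:") == y
               for x, y in dev_set)
    return hits / len(dev_set)

def ape(candidates: list[str], dev_set: list[tuple[str, str]]) -> str:
    # Select the instruction with the highest dev-set accuracy.
    return max(candidates, key=lambda c: score_instruction(c, dev_set))
```

In the paper the candidate instructions are themselves generated by an LLM from input-output demonstrations; here they would just be passed in as `candidates`.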
13. Experimental Topic: Data Fusion using LLMs
- Ahmad, Mohammad Shahmeer, et al. “RetClean: Retrieval-Based Data Cleaning Using Foundation Models and Data Lakes.” arXiv preprint arXiv:2303.16909 (2023).
- Bleiholder, Jens, and Felix Naumann. “Data Fusion.” ACM Computing Surveys 41, 1, Article 1 (January 2009), 41 pages.
- Narayan, Avanika, et al. “Can Foundation Models Wrangle Your Data?” Proceedings of the VLDB Endowment 16, 4 (2022), 738–746.
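A minimal sketch of LLM-based fusion in the spirit of Narayan et al.: serialize conflicting records from different sources into a prompt and ask the model for the consolidated value. `llm` and `fuse` are hypothetical names; the placeholder returns a canned answer so the example runs:

```python
# Sketch of data fusion via an LLM: present conflicting source values
# and ask the model to resolve them to a single consolidated value.

def llm(prompt: str) -> str:
    # Placeholder: a real system would call a foundation model here.
    return "Berlin"

def fuse(attribute: str, observations: dict[str, str]) -> str:
    lines = [f"Source {src} says {attribute} = {val}"
             for src, val in observations.items()]
    prompt = ("Resolve the conflict and answer with a single value.\n"
              + "\n".join(lines) + f"\nConsolidated {attribute}:")
    return llm(prompt)
```

Classical data fusion (Bleiholder and Naumann) would instead apply hand-written conflict-resolution functions such as voting or recency; the LLM variant trades those rules for a prompt.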