Mohammed Safi Ur Rahman Khan

PhD Scholar • DSAI • AI4Bharat • IIT Madras


AI4Bharat Lab

Indian Institute of Technology, Madras

Chennai, Tamil Nadu, India

I am Safi (صفی), a first-year PhD student at the Wadhwani School of Data Science and Artificial Intelligence (DSAI), IIT Madras, and the AI4Bharat Lab, where I am advised by Prof. Mitesh M. Khapra. My current research focuses on “all” Data and Evaluation of Large Language Models.

Previously, I was an AI Resident at the AI4Bharat Lab at IIT Madras, where I was fortunate to be part of the IndicLLMSuite project (guided by Prof. Mitesh M. Khapra). Before that, I did my M.Tech in Computer Science and Engineering at IIT Madras (again!!), where I worked on “Narrow Domain Adaptation of Speech Recognition Systems”, guided by Prof. Pratyush Kumar and (you guessed it!) Prof. Mitesh M. Khapra.


Aug 14, 2024 IndicLLMSuite wins the Outstanding Paper Award 🏅 at ACL 2024!!
Jul 25, 2024 I’ll be attending ACL 2024 in Bangkok 🇹🇭 to present IndicLLMSuite. Let’s catch up!!
Jul 01, 2024 Joined as a PhD Scholar at the Wadhwani School of Data Science and AI @ IIT Madras
Jun 19, 2024 Our work “FBI: Finding Blindspots in Evaluator LLMs with Interpretable Checklists” is out on arXiv.
May 16, 2024 Our work “IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages” has been accepted to the ACL 2024 main conference.

latest posts

Mar 14, 2024 IndicLLM Suite

selected publications

  1. arXiv
    Finding Blind Spots in Evaluator LLMs with Interpretable Checklists
    Sumanth Doddapaneni*, Mohammed Safi Ur Rahman Khan*, Sshubam Verma, and Mitesh M. Khapra
    arXiv preprint arXiv:2406.13439, 2024
  2. ACL 2024
    ACL-2024 Outstanding Paper Award
    IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages
    Mohammed Safi Ur Rahman Khan*, Priyam Mehta*, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad G, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, and Mitesh M. Khapra
    Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024