Yuling Gu
yulingg@allenai.org
UPDATES
Upcoming/current activities:
- August 2024: Attending ACL 2024! Check out the following papers that I'm part of:
(1) "OLMo: Accelerating the Science of Language Models" (Poster session #3 - Aug 12 @ 4pm; Talk @ Special Theme - Aug 14 @ 11am)
(2) "Digital Socrates: Evaluating LLMs through Explanation Critiques" (Poster session #5 - Aug 13 @ 4pm)
(3) "PROC2PDDL: Open-Domain Planning Representations from Texts" (NLRSE workshop - Aug 15 @ noon)
Past activity highlights:
- December 2023: Attended EMNLP 2023 to share our work "What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations" & "Robust Tooling and New Resources for Large Language Model Evaluation via Catwalk"
- July 2023: Presented our work "Do language models have coherent mental models of everyday things?" in person at ACL 2023!
- May 2023: Our paper "Do language models have coherent mental models of everyday things?" got accepted to ACL 2023!
- May 2023: Our paper "Can AI language models replace human participants?" has been published in Trends in Cognitive Sciences!
- Late November - Early December 2022: Attended NeurIPS and EMNLP 2022 virtually :)
- October 2022: Our paper "One Venue, Two Conferences: The Separation of Chinese and American Citation Networks" got accepted to the AI Cultures Workshop at NeurIPS 2022!
- October 2022: Our paper on "DREAM-FLUTE" got accepted to the Figurative Language Processing Workshop at EMNLP 2022!
- August 2022: We developed "DREAM-FLUTE" during a 3-day Hackathon at AI2 and it achieved (joint) first place for the Figurative Language Understanding Shared Task at EMNLP 2022!
- July 2022: Presented our work in person at NAACL 2022!
- April 2022: Our paper "DREAM: Improving Situational QA by First Elaborating the Situation" got accepted to NAACL 2022!
- April 2022: Joined the Aristo team at Allen Institute for AI as a Predoctoral Young Investigator!
- March 2022: Graduated from UW with a perfect GPA!
- Summer 2021: Research Intern on the Aristo team at Allen Institute for AI.
- Early August 2021: Presented (virtually) at the Unimplicit workshop (at ACL-IJCNLP 2021).
- Late October 2020: Presented (virtually) at Interspeech 2020.
- Late September 2020: Joined UW to begin my graduate studies!
- July 2020: My paper on Singaporean children's speech got accepted at Interspeech 2020!
- May 2020: Graduated summa cum laude from NYU!
- Early December 2019: San Diego, California for the 178th Meeting of the Acoustical Society of America (2 Poster Presentations)
- Late October 2019: NYU CAS alumni-student debate. The Motion: "The Benefits of the Development of Artificial Intelligence Outweigh the Harms."
- Early October 2019: Orlando, FL for the Grace Hopper Celebration (1 of only 4 undergraduate representatives from Courant Institute of Mathematical Sciences)
- Late July 2019: Florence, Italy for ACL 2019 (Poster presentation)
Other interesting things:
- My undergraduate honors thesis advisor at NYU, Prof. Ernest Davis, published Rebooting AI: Building Artificial Intelligence We Can Trust. Check it out!
RESEARCH
Projects
Predoctoral Young Investigator: Aristo team, Allen Institute for Artificial Intelligence (April 2022 - present)
- Worked on language model + reasoner architectures, reasoning, LLM evaluation, and more! Current research focuses include best practices for LLM evaluation and high-quality datasets for model evaluation.
Research Intern: Aristo team, Allen Institute for Artificial Intelligence (Summer & Fall 2021)
- When people answer questions about a specific situation, cognitive science suggests that they form a mental picture of that situation before answering. We train a new model, DREAM, to build such scene elaborations in a dataset-neutral way. We then demonstrate that using DREAM’s scene elaborations as additional context improves answer accuracy across different downstream QA systems and on different end-tasks. Our approach is question-agnostic, leaves end-task QA models unchanged, and is thus easily portable to other QA models, suggesting exciting opportunities for further improving and exploiting scene elaborations to better solve new problems.
Research assistant: Courant Institute of Mathematical Sciences, NYU (Summer 2018 - Spring 2020)
Honors Thesis Project: Detecting Event Duration in Text (Spring 2019 - Spring 2020)
- Supervised by Prof. Ernest Davis. Use various classifiers, word and sentence representations, as well as linguistic theories to automatically detect temporal relations implicitly conveyed in texts (at different levels: from single event descriptions to multiple sentences); analyze the performance of Transformer-based state-of-the-art models in detecting implicit meaning from a psycholinguistics perspective.
Can dependency parsing help event extraction in text? (Fall 2019 - Spring 2020)
- Supervised by Prof. Ralph Grishman. Investigate the contribution of information from dependency parsing, Named Entity (NE) tagging, and Part Of Speech (POS) tagging in event extraction, beyond a baseline that uses pretrained BERT sentence representations.
Integrated Customization Environment for Information Extraction (ICE) (Summer 2019)
- Supervised by Prof. Ralph Grishman. Experiment with different classifiers, together with grammatical insights from linguistics, to automatically distinguish prepositional phrases as adjuncts or arguments (achieved 88% accuracy on the adjunct/argument distinction using linguistic theories alone).
Termolator: A terminology extraction system (Summer 2018 - Fall 2018)
- Supervised by Prof. Adam Meyers. Refine the English Termolator's distributional metrics; further develop the Chinese Termolator; integrate the past 5 years' developments to unify the two systems (my contributions: https://github.com/yulinggu-cs/ChineseTermolator2020, integrated into the full system in July 2020).
Independent study project: Commonsense Reasoning (Summer 2018)
- Supervised by Prof. Ernest Davis. Look into English-Chinese Machine Translation failures; design Winograd schemas and compile pronoun disambiguation problems; Toward Annotating Commonsense Inferences in Text (TACIT) annotation.
Research Intern: Human Language Technology Group (Winter 2014 - Spring 2021)
Institute for Infocomm Research, A*STAR, Singapore, Singapore
- Characterizing Singaporean, American, and British English acoustic and pronunciation patterns in children's speech using unsupervised clustering (supervised by Dr. Nancy F. Chen); Chinese tone perception in Singaporean and native Chinese Mandarin speakers; investigating tone in whispered Mandarin (jointly supervised by Dr. Boon Pang Lim and Dr. Nancy F. Chen).
Other work experience
Courant Institute of Mathematical Sciences (CIMS), NYU
Grader for Artificial Intelligence course under Professor Ernest Davis (Fall 2019)
Grader for Basic Algorithms course under Professor Victor Shoup (Spring 2019)
PUBLICATIONS
- Yuling Gu, Oyvind Tafjord, Bailey Kuehl, Dany Haddad, Jesse Dodge and Hannaneh Hajishirzi (2024). “OLMES: A Standard for Language Model Evaluations”. arXiv. [arXiv] [Curated in-context examples & Code]
- Wenlong Zhao, Debanjan Mondal, Niket Tandon, Danica Dillion, Kurt Gray and Yuling Gu (2024). “WorldValuesBench: A Large-Scale Benchmark Dataset for Multi-Cultural Value Awareness of Language Models”. LREC-COLING 2024. [Paper] [Dataset & Code]
- Tianyi Zhang, Li Zhang, Zhaoyi Hou, Ziyu Wang, Yuling Gu, Peter Clark, Chris Callison-Burch and Niket Tandon (2024). “PROC2PDDL: Open-Domain Planning Representations from Texts”. The 2nd Workshop on Natural Language Reasoning and Structured Explanations, ACL 2024. [Paper] [Dataset & Code]
- Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith and Hannaneh Hajishirzi (2024). “OLMo: Accelerating the Science of Language Models”. ACL 2024. [arXiv] [Website]
- Yuling Gu, Oyvind Tafjord and Peter Clark (2024). “Digital Socrates: Evaluating LLMs through Explanation Critiques”. ACL 2024. [arXiv] [Dataset & Model]
- Kavel Rao, Liwei Jiang, Valentina Pyatkin, Yuling Gu, Niket Tandon, Nouha Dziri, Faeze Brahman and Yejin Choi (2023). “What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations”. Findings of EMNLP 2023. [Paper] [Dataset]
- Yuling Gu, Bhavana Dalvi Mishra and Peter Clark (2023). “Do language models have coherent mental models of everyday things?”. ACL 2023. [Paper] [Dataset & Code]
- Danica Dillion, Niket Tandon, Yuling Gu and Kurt Gray (2023). “Can AI language models replace human participants?”. Trends in Cognitive Sciences. [Paper]
- Yuling Gu (2022). “Measure More, Question More: Experimental Studies on Transformer-based Language Models and Complement Coercion”. arXiv. [arXiv]
- Bingchen Zhao*, Yuling Gu*, Jessica Zosa Forde and Naomi Saphra (2022). “One Venue, Two Conferences: The Separation of Chinese and American Citation Networks”. AI Cultures Workshop at NeurIPS 2022. [arXiv]
- Yuling Gu, Yao Fu, Valentina Pyatkin, Ian Magnusson, Bhavana Dalvi and Peter Clark (2022). “Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE”. The Third Workshop on Figurative Language Processing, EMNLP 2022. [Paper] [Dataset & Model]
- Yuling Gu, Bhavana Dalvi Mishra and Peter Clark (2022). “DREAM: Improving Situational QA by First Elaborating the Situation”. NAACL 2022. [Paper] [Dataset & Model]
- Yuling Gu and Nancy F. Chen (2022). “Large-Scale Acoustic Characterization of Singaporean Children's English Pronunciation”. arXiv. [arXiv]
- Yuling Gu (2021). “Transformer-based language models and complement coercion: Experimental studies”. The First Workshop on Understanding Implicit and Underspecified Language at ACL-IJCNLP 2021. [Underline link] [Poster]
- Yuling Gu and Nancy F. Chen (2020). “Characterization of Singaporean Children's English: Comparisons to American and British Counterparts using Archetypal Analysis”. Interspeech 2020. [Paper]
- Yuling Gu and Nancy F. Chen (2019). “Large-scale acoustic characterization of mid-low vowels across American, British, and Singaporean children”. The Journal of the Acoustical Society of America, Volume 146, Issue 4. 178th Meeting of the Acoustical Society of America. [Abstract] [Poster]
- Yuling Gu and Nancy F. Chen (2019). “Acoustic characterization of Singaporean children’s English with American and British counterparts: A case study on approximants”. The Journal of the Acoustical Society of America, Volume 146, Issue 4. 178th Meeting of the Acoustical Society of America. [Abstract] [Poster]
- Yuling Gu and Nancy F. Chen (2019). “Acoustic Characterization of Singaporean Children’s English: Comparisons to American and British Counterparts”. Widening NLP workshop at ACL 2019. [Paper] [Abstract]
- Yuling Gu, Boon Pang Lim and Nancy F. Chen (2016). “Perception of tone in whispered Mandarin sentences: the case for Singapore Mandarin”. Interspeech 2016. [Paper]
PERSONAL
Always excited to travel, explore new things and reach out for the skies!