Speaker

Jacqueline Lane

Abstract

The rise of generative artificial intelligence (AI) is transforming creative problem-solving, necessitating new approaches for evaluating innovative solutions. This study explores how human-AI collaboration can enhance early-stage evaluations, focusing on the interplay between objective criteria, which are quantifiable, and subjective criteria, which rely on personal judgment. We conducted a field experiment with MIT Solve, involving 72 experts and 156 community screeners who evaluated 48 solutions for the 2024 Global Health Equity Challenge. Screeners received assistance from GPT-4, which offered recommendations and, in some cases, a rationale. We compared a human-only control group with two AI-assisted treatments: a black-box AI and a narrative AI that provided probabilistic explanations justifying its decisions. Our findings show that AI-assisted expert screeners were 9 percentage points more likely to fail a solution than those in the human-only control. There was no significant difference between the black-box and narrative AI conditions for objective criteria; however, for subjective criteria, screeners adhered to the narrative AI’s recommendations 12 percentage points more often than to the black-box AI’s. These effects were consistent across both experts and non-experts. Examining screener behavior further with mouse-tracking data, we found that deeper engagement with the AI’s recommendations to fail solutions on objective criteria led to more overrides, especially in the narrative AI condition, indicating increased scrutiny. This research underscores the importance of developing AI-interaction expertise in creative evaluation processes that combine human judgment with AI insights. While AI can standardize decision-making for objective criteria, human oversight and critical thinking remain indispensable in subjective assessments, where AI should complement, not replace, human judgment.

In-person event posted in Research