A new academic benchmark aims to ‘test the limits of AI knowledge at the frontiers of human expertise.’ So far, these LLMs are stumped.