AI is constantly evolving. A recent set of ARC AGI* scores shows how these models have changed over the years (and, as the 2024 entries show, the change is now happening over months):
GPT-2 (2019): 0%
GPT-3 (2020): 0%
GPT-4 (2023): 2%
GPT-4o (2024): 5%
o1-preview (2024): 21%
o1 high (2024): 32%
o1 Pro (2024): ~50%
o3 tuned low (2024): 76%
o3 tuned high (2024): 87%
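To make the size of each jump easier to see, here is a minimal Python sketch that computes the gain between consecutive entries in the list above. The scores are copied from the list as-is; the o1 Pro value is approximate in the source ("~50%"), and the name `score_gains` is just an illustrative helper, not part of any library.

```python
# Scores in percent, copied from the list above.
# The o1 Pro entry is approximate in the source ("~50%").
scores = [
    ("GPT-2", 2019, 0),
    ("GPT-3", 2020, 0),
    ("GPT-4", 2023, 2),
    ("GPT-4o", 2024, 5),
    ("o1-preview", 2024, 21),
    ("o1 high", 2024, 32),
    ("o1 Pro", 2024, 50),
    ("o3 tuned low", 2024, 76),
    ("o3 tuned high", 2024, 87),
]


def score_gains(entries):
    """Return (previous model, current model, gain in percentage points)
    for each pair of consecutive entries."""
    return [
        (prev[0], cur[0], cur[2] - prev[2])
        for prev, cur in zip(entries, entries[1:])
    ]


gains = score_gains(scores)
for prev_name, cur_name, gain in gains:
    print(f"{prev_name} -> {cur_name}: +{gain} points")

# The single largest step between two consecutive entries.
largest = max(gains, key=lambda g: g[2])
print("Largest jump:", largest)
```

Because the entries are ordered by release rather than evenly spaced in time, the gains show which steps were biggest, not a rate per year.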
GPT-2 (2019): 0%
Basic language processing tasks like simple text generation, answering straightforward questions, and basic conversation capabilities. Useful for generating generic email responses or basic customer service inquiries.
GPT-3 (2020): 0%
Similar capabilities to GPT-2 but with more refined text generation. Could handle more complex sentence structures and offer slightly more nuanced responses. Still limited to simpler tasks.
GPT-4 (2023): 2%
Begins to handle more complex language tasks such as summarizing articles, creating more detailed written content and basic language translation. Can assist in drafting more complex customer service responses and hold more meaningful conversations.
GPT-4o (2024): 5%
Enhanced capabilities over GPT-4, including improved dialogue handling in chatbots, better context understanding in conversations and initial capabilities in generating creative content like poetry or simple scripts.
o1-preview (2024): 21%
Significant jump in understanding and generating human-like text. Capable of assisting with high school and college-level educational content, more sophisticated marketing content and basic drafting of legal or technical documents.
o1 high (2024): 32%
More refined language capabilities. Able to handle complex business analytics reports and advanced academic research assistance, and more effective in therapeutic chatbots that provide mental health support.
o1 Pro (2024): ~50%
Highly advanced language understanding suitable for interpreting and generating professional-level content across fields like law, medicine and engineering. Can create in-depth reports, research papers and participate effectively in complex problem-solving discussions.
o3 tuned low (2024): 76%
Can integrate almost seamlessly into human-like interactions: conducting advanced negotiations, planning strategy and providing expert advice in specialized fields. Could also manage complex customer service environments and sophisticated technical support.
o3 tuned high (2024): 87%
Approaching human-like capabilities in almost all verbal tasks. Can assist with advanced medical diagnostics, comprehensive financial forecasting, intricate technical problem solving and creative tasks like writing full-length books or scripting complete film narratives. This level might also support real-time multilingual translation and sophisticated intercultural communication.
So we are approaching a future where AI can perform increasingly complex and critical tasks, potentially revolutionizing industries such as healthcare, finance, education and more.
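*ARC AGI is a benchmark built on the Abstraction and Reasoning Corpus: a set of abstract grid puzzles in which a model must infer the underlying rule from a few examples and apply it to a new input. The score is the percentage of tasks solved, so it reflects how well a model generalizes to unfamiliar problems.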
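As a rough illustration of how a percentage score of this kind can be computed, the sketch below counts a task as solved only when the predicted output grid matches the expected grid exactly. It assumes one test grid and one attempt per task, which is a simplification; the function name `arc_style_score` and the toy grids are made up for the example and are not the official evaluation code.

```python
from typing import List

# A grid is a list of rows; each cell is a small integer standing for a color.
Grid = List[List[int]]


def arc_style_score(predictions: List[Grid], targets: List[Grid]) -> float:
    """Return the percentage of tasks whose predicted grid equals the target.

    Simplifying assumption: one test grid and one attempt per task.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    solved = sum(1 for pred, tgt in zip(predictions, targets) if pred == tgt)
    return 100.0 * solved / len(targets)


# Toy data: four hypothetical tasks, three answered exactly right.
targets = [
    [[1, 0], [0, 1]],
    [[2, 2], [2, 2]],
    [[0, 3, 0]],
    [[4], [4], [4]],
]
predictions = [
    [[1, 0], [0, 1]],
    [[2, 2], [2, 2]],
    [[0, 3, 3]],  # one wrong cell, so the whole task counts as unsolved
    [[4], [4], [4]],
]

score = arc_style_score(predictions, targets)
print(f"{score:.1f}%")
```

The all-or-nothing match is the point of the sketch: a nearly correct grid earns no partial credit, which is one reason scores on this kind of benchmark stayed near zero for so long.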