LLM Testing

Online Elite Education Platform for Teens

Course: The Great AI Transformation (16 weeks, 32 class hours)

*yden Zong

Age: 12 and a half

Posted: 18 days ago


Question 1: Which model has the highest total score in your test results? What do you think are its advantages and disadvantages compared to the other?

Model B had the higher total score in my test results. I think Model B scored better because of its overall strength in math. On every other aspect, Models A and B were virtually the same, giving the same or very similar answers.

Question 2: Which model do you think is more suitable for you? Why?

I think Model B is more suitable for me. On every test subject Model B matched Model A, and on math it was even better than Model A.

Question 3: What new understanding did this test experiment give you about artificial intelligence large language models?

This test showed me that AI language models (LLMs) are good at language tasks and simple logic, among many other things. LLMs can also have trouble with hard math and with staying consistent when questions are worded differently, as shown in test 4. It also taught me that testing AI with different kinds of questions, or even with different models, helps me understand AI's strengths and weaknesses better.
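
As a rough illustration of this kind of comparison, here is a minimal Python sketch that adds up each model's score and lists the subjects where the models disagree. The function name `compare_models`, the subject names, and all the numbers are assumptions made up for the example; they are not the results from the attached spreadsheet.

```python
def compare_models(scores):
    """Total each model's points and list subjects where the scores differ.

    `scores` maps a subject name to a dict of {model name: points}.
    """
    totals = {}
    differing = []
    for subject, by_model in scores.items():
        for model, points in by_model.items():
            totals[model] = totals.get(model, 0) + points
        # If the models did not all get the same points, note the subject.
        if len(set(by_model.values())) > 1:
            differing.append(subject)
    return totals, differing


# Placeholder numbers for illustration only, NOT the real test results.
example_scores = {
    "language": {"A": 5, "B": 5},
    "logic": {"A": 4, "B": 4},
    "math": {"A": 2, "B": 4},
}

totals, differing = compare_models(example_scores)
print(totals)     # total points per model
print(differing)  # subjects where the models scored differently
```

Looking at the subjects where the scores differ, and not only at the totals, is one way to see where each model is stronger or weaker.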

Attachment: 大语言模型测试实验表--评分简化版 (1).xlsx
