Last edited by 诚司 on 2024-4-19 23:03
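Llama3 70B is damn incredible. It is definitely underrated on lmsys: on the leaderboard it sits about level with Command R+, but in my own testing it is far stronger than Command R+, even with Chinese prompts. The only catch is that llama3 doesn't like to answer in Chinese.
llama3 70B crushes llama3 8B. The 8B can answer some things when the prompt is in English but not when it is in Chinese. The 70B has no such problem, whereas Command R+, despite having more parameters than the 70B, has the same problem.
I also tried multi-turn tool calling. llama3 70B is at least at the level of the big Claude (Opus), and Claude Sonnet gets crushed by llama3.
The fact that this thing is open source is just damn amazing.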
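PS:
I tried two examples in the style of the TPTU paper, but somewhat more complex. GPT4 and Claude Opus both tend to get them wrong, while llama3 gets them right even with the debuff of a Chinese question... I wouldn't have dared imagine this level of tool planning before.
The two prompts differ only in where 'rate' lives: a Python global in the first, a table field in the second. The question at the end of each prompt is left in Chinese on purpose. In English it reads roughly: "Read a number from the file 'a.txt'. Find all books whose title length is greater than that number minus 20, and write those title strings to the file 'book_name.txt'. Then query the titles of the books whose price in US dollars is higher than the number read from 'a.txt' earlier, and write them to 'test.txt'."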
- You are a strategy model and given a problem and a set of tools,you need to generate a sequence of executable tools to determine the solution to the problem.
- Each tool in the toolset is defined as follows:
- SQL Generator: Given an input problem and a database,create a syntactically correct SQLite query statement. Note that the table here only contains the following fields: 'name': this is the name of the book, 'price': this is the price of the book, and the price values in the table are given in RMB (Chinese yuan), 'number': this is the number of books in this library
- PythonREPL:Given an input problem and some information,generate syntactically correct Python code. In the Python code, there is a global variable 'rate' which represents the exchange rate between US dollars and Chinese yuan.
- Please use the following format:
- Question: Here is the question
- Error: Here is the previously generated error output
- Tasks:Here is a Python List type,where each item in the List is a dictionary.The key of the dictionary represents the selected tool, and the value is the query input when calling the tool. Please note that the generated Tool and Query should be different from those in the Error.
- Here are some examples mapping the question to the tools:
- Question: What is the number of albums by Jolin Tsai?
- Error: None
- Tasks:[{{SQL Generator:"What is the number of albums by Jolin Tsai?"}}]
- Question: What is the square of the number of albums by Jolin Tsai?
- Error: None
- Tasks:[{{SQL Generator:"What is the number of albums by Jolin Tsai?"}},
- {{PythonREPL:"What is the square of the number of albums by Jolin Tsai?"}}]
- Question: How many books are cheaper than 10× 20 dollars? Please find the number of books and output the square of the number.
- Error: None
- Tasks:[{{PythonREPL:"What is 10× 20 ?"}},
- {{SQL Generator:"How many books are cheaper than 10× 20 dollars?"}},
- {{PythonREPL:"Output the square of the number above"}}]
- Question:First,calculate the square of 40 and denote it as A.Then,find the names of all artists with a total number of fans less than A.
- Error: None
- Tasks:[{{PythonREPL:"Let A be the square of 40.What is the value of A?"}},{{SQL Generator:"Find the names of all artists with a total number of fans less than A"}}]
- Note that you must ensure that the generated Tasks strictly adhere to the format requirements: they must be in Python List type,where each item is a dictionary.The key of the dictionary represents the selected tool, and the value is the query input when calling the tool.
- Now,let's proceed:
- Question: 从文件 'a.txt' 中读取一个数字。找出那些书名长度大于这个数字的减去20的所有书籍,将这些书名字符串写入到文件 'book_name.txt' 中。然后再查询那些书的价格以美元计算比之前从'a.txt’里读取的数字更贵的那些书的书名,写入到'test.txt'中
- Error: None
- Tasks:

- You are a strategy model and given a problem and a set of tools,you need to generate a sequence of executable tools to determine the solution to the problem.
- Each tool in the toolset is defined as follows:
- SQL Generator: Given an input problem and a database,create a syntactically correct SQLite query statement. Note that the table here only contains the following fields: 'name': this is the name of the book, 'price': this is the price of the book, and the price values in the table are given in RMB (Chinese yuan), 'number': this is the number of books in this library, 'rate': this is the exchange rate between US dollars and Chinese yuan.
- PythonREPL:Given an input problem and some information,generate syntactically correct Python code.
- Please use the following format:
- Question: Here is the question
- Error: Here is the previously generated error output
- Tasks:Here is a Python List type,where each item in the List is a dictionary.The key of the dictionary represents the selected tool, and the value is the query input when calling the tool. Please note that the generated Tool and Query should be different from those in the Error.
- Here are some examples mapping the question to the tools:
- Question: What is the number of albums by Jolin Tsai?
- Error: None
- Tasks:[{{SQL Generator:"What is the number of albums by Jolin Tsai?"}}]
- Question: What is the square of the number of albums by Jolin Tsai?
- Error: None
- Tasks:[{{SQL Generator:"What is the number of albums by Jolin Tsai?"}},
- {{PythonREPL:"What is the square of the number of albums by Jolin Tsai?"}}]
- Question: How many books are cheaper than 10× 20 dollars? Please find the number of books and output the square of the number.
- Error: None
- Tasks:[{{PythonREPL:"What is 10× 20 ?"}},
- {{SQL Generator:"How many books are cheaper than 10× 20 dollars?"}},
- {{PythonREPL:"Output the square of the number above"}}]
- Question:First,calculate the square of 40 and denote it as A.Then,find the names of all artists with a total number of fans less than A.
- Error: None
- Tasks:[{{PythonREPL:"Let A be the square of 40.What is the value of A?"}},{{SQL Generator:"Find the names of all artists with a total number of fans less than A"}}]
- Note that you must ensure that the generated Tasks strictly adhere to the format requirements: they must be in Python List type,where each item is a dictionary.The key of the dictionary represents the selected tool, and the value is the query input when calling the tool.
- Now,let's proceed:
- Question: 从文件 'a.txt' 中读取一个数字。找出那些书名长度大于这个数字的减去20的所有书籍,将这些书名字符串写入到文件 'book_name.txt' 中。然后再查询那些书的价格以美元计算比之前从'a.txt’里读取的数字更贵的那些书的书名,写入到'test.txt'中
- Error: None
- Tasks:
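
For anyone who wants to score the replies automatically rather than by eye, here is a rough sketch of pulling the (tool, query) steps out of a reply in the Tasks format the prompts ask for. This is not something from the original test; `parse_tasks`, `KNOWN_TOOLS` and the regex are hypothetical helpers, and it assumes queries contain no embedded double quotes. The keys in the prompt's examples are unquoted and the braces are doubled, so the text is not a valid Python literal and `ast.literal_eval` would not work directly, which is why a regex is used here.

```python
import re

# Tool names are taken from the prompts above; everything else is a hypothetical helper.
KNOWN_TOOLS = ("SQL Generator", "PythonREPL")

# Matches one {Tool:"query"} item. It tolerates the doubled braces {{...}} used in
# the prompt's examples and optional quotes around the tool name.
# Assumption: the query itself contains no double quotes.
_ITEM = re.compile(
    r"\{+\s*[\"']?(?P<tool>[^\"':{}]+?)[\"']?\s*:\s*\"(?P<query>[^\"]*)\"\s*\}+"
)


def parse_tasks(text):
    """Return the plan as an ordered list of (tool, query) pairs.

    Raises ValueError if the reply names a tool that is not in the toolset.
    """
    steps = []
    for match in _ITEM.finditer(text):
        tool = match.group("tool").strip()
        if tool not in KNOWN_TOOLS:
            raise ValueError(f"unknown tool: {tool!r}")
        steps.append((tool, match.group("query")))
    return steps


# One of the few-shot examples from the prompt, used as sample input.
example = '''Tasks:[{{SQL Generator:"What is the number of albums by Jolin Tsai?"}},
{{PythonREPL:"What is the square of the number of albums by Jolin Tsai?"}}]'''

for tool, query in parse_tasks(example):
    print(tool, "->", query)
```

The parser ignores commas between items on purpose, so a reply that drops one still parses. Whether the plan is actually correct (right tools, right order) still has to be judged separately.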