How to Modify the instruction prompt of modelgraded evals #1428
Unanswered
kapilmayank
asked this question in
Q&A
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
I was evaluating LLAMA2 for text2sql scenario which is using modelgraded eval
I am getting a warning that choices Correct,Incorrect unparasable by cot classify but this was working with openai models
because of this warning model graded evaluation is breaking , i am getting invalid i.e Any other choices returned by the model are parsed into "invalid"
The problem is I am not able to modify the instruction prompt for model evaluation i.e 'Compare the submitted sql query with the expert query '
Could someone please help me from where i have to you know modify the Instruction prompt
Instruction prompt-[{'role': 'user', 'content': 'You are comparing a submitted answer to an expert answer on a given SQL coding question. Here is the data:\n[BEGIN DATA]\n************\n[Question]: TASK: Answer the following question with syntactically correct SQLite SQL and answer should only have SQL query.\n Table vehicleinsurance, columns = [Customer_id,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Vehicle_Accident_Date]\n\nUser: Q: list the customers who had vandalization in the last year and whose premium is more than $50000\n\n************\n[Expert]: ['A: SELECT Customer_id FROM vehicleinsurance WHERE Vehicle_Damage = Yes AND Vehicle_Accident_Date >= DATEADD(year, -1, GETDATE()) AND Annual_Premium > 50000']\n************\n[Submission]: Sure! Here's the SQL query to list the customers who had vandalization in the last year and whose premium is more than $50000:\n\nSELECT \nFROM vehicleinsurance\nWHERE (\n SELECT COUNT()\n FROM vehicleinsurance\n WHERE insurance_date > DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)\n AND damage = 'Vandalization'\n) > 0 AND annual_premium > 50000;\n************\n[END DATA]\n\nCompare the content and correctness of the submitted SQL with the expert answer. Ignore any differences in whitespace, style, or output column names.\nThe submitted answer may either be correct or incorrect. Determine which case applies. Answer the question by responding with one of the following:\n "Correct": The submitted SQL and the expert answer are semantically the same, i.e. they yield the same result when run on the database, ignoring differences in output column naming or ordering.\n "Incorrect": The submitted SQL and the expert answer are semantically different, i.e. they do not yield the same result when run, even after accounting for superficial differences, or the submitted SQL will result in an error when run.\n\nFirst, write out in a step by step manner your reasoning to be sure that your conclusion is correct. Avoid simply stating the correct answer at the outset. Then print only a single choice from "Correct" or "Incorrect" (without quotes or punctuation) on its own line corresponding to the correct answer. At the end, repeat just the answer by itself on a new line.\n\nReasoning:'}]
text2ql_eval:
id: text2ql_eval.text2ql_eval.v1
metrics:
text2ql_eval.text2ql_eval.v1:
args:
samples_jsonl: custom_data/text2sql.jsonl
eval_type: cot_classify
modelgraded_spec: sql
class: evals.elsuite.modelgraded.classify:ModelBasedClassify
Beta Was this translation helpful? Give feedback.
All reactions