-
Notifications
You must be signed in to change notification settings - Fork 533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add from_file
class method to the Prompt
object
#1355
base: main
Are you sure you want to change the base?
Add from_file
class method to the Prompt
object
#1355
Conversation
b2605b3
to
df0eb2c
Compare
Yes, please open an issue 😄 |
f11f10c
to
98cef53
Compare
Added a couple comments, although since this is a draft PR you are probably addressing them already. |
f75319f
to
2954197
Compare
"""Return the schema of a Pydantic model.""" | ||
if not type(model) == type(BaseModel): | ||
if not isinstance(model, type(pydantic.BaseModel)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes are unrelated lint fixes, I can eventually move them to another PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep them in this PR as long as they have their own commit.
2954197
to
ee5c4ef
Compare
Working on this PR made me wonder: why do we integrate Jinja2 (or any specific templating engine) into Outlines? I mean, we could have just provided a Prompt interface that takes an already rendered |
I am not sure what you mean by "taking an already rendered |
TL;DR: I mean passing the output/result of the templating process to the import outlines
# Using Jinja2 ...
import jinja2
env = jinja2.Environment(loader=jinja2.FileSystemLoader("."))
template = env.get_template("prompt.jinja2")
# ... or, Using Mako ...
import Mako
template = mako.template.Template(filename="prompt.mako")
@outlines.prompt
def fancy(examples, question):
return template.render(examples=examples, question=question)
examples = [
{"question": "What is the capital of France?", "answer": "Paris"},
{"question": "What is 2 + 2?", "answer": "4"},
]
question = "What is the Earth's diameter?"
prompt = fancy(examples, question)
# ... but why not just?
rendered = template.render(examples=examples, question=question)
prompt = outline.Prompt(rendered)
# EDIT: Nevermind, while writing this down, I realized that this would
# essentially mean that arguments are evaluated too early, leaving the prompt
# without any arguments and simply storing a `str`... |
@@ -183,17 +196,16 @@ def render(template: str, **values: Optional[Dict[str, Any]]) -> str: | |||
A string that contains the rendered template. | |||
""" | |||
jinja_template = create_jinja_template(template) | |||
return jinja_template.render(**values) | |||
return template.render(**values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From interface point of view render
functionality has changed too if defined this way. Before it would accept str
and create jinja template (what's now done by template_from_str
), why was it redefined to accept jinja template right away?
Because I'm not sure of the overall usability of render
if it accepts the jinja template. By reading the test cases and docs with examples it seems this function was here to confirm how the transformation from str
to jinja template looked like. So, by calling it, user would make sure that it was transformed as expected + then rendered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, since now there are 2 separate flows to jinja template:
- from str
- from file
Then it's expected to call render
with jinja template, yes, but this old kind of render
more looks like Prompt.from_str
, which will allow to confirm the str to jinja transformation, and then the actual call will render any of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old design was slightly inefficient because a jinja2.template
should not be constructed more than once per template. Previously, it was being reconstructed for each call to render
...
I understand this could be a potential breaking change. Thinking about it, I had to modify render
-related tests, but I don't think users should be able to call render
without a Prompt
object, WDYT? And yes, it would make sense to expose a from_str
method for parity with from_file
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the render
code so it no longer breaks previous usage, but instead displays a deprecation warning. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I checked new code, from_str
looks good!
Though I wonder about two things:
- since this will be in major update, maybe we can get away even without deprecation warning for whatever we'll decide for
render
just this time - this
render
function still creates more questions of why it exists to me, you both in your examples used interface likeprompt.template.render(params)
, which is much clearer anyway and withfrom_str
available, especially. Also, there is similar logic in__call__
, so maybe this standalonerender
could be dropped all together? It seems it was something historical as well, considering graph mentioning:Parse a Jinaj2 template and translate it into an Outlines graph.
So, I would even remove render
function and integrate and refactor its docs for template_from_str
instead.
@yvan-sraka @rlouf what's your take?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it, and tbh, I think that's a bit overkill. I'd be fine with not exposing it to the user, as well as other methods. However (and maybe this is where my dusty understanding of Python module visibility falls short), AFAIU, those methods aren't exposed atm, e.g.:
def render(
template: Union[jinja2.Template, str], **values: Optional[Dict[str, Any]]
) -> str:
r"""
Examples
--------
Outlines follow Jinja2's syntax
>>> import outlines
>>> outline = outlines.render("I like {{food}} and {{sport}}", food="tomatoes", sport="tennis")
I like tomatoes and tennis
... snipped"""
% python3
Python 3.10.0 | packaged by conda-forge | (default, Nov 20 2021, 02:27:15) [Clang 11.1.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import outlines
>>> outlines.render("I like {{food}} and {{sport}}", food="tomatoes", sport="tennis")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'outlines' has no attribute 'render'
So, I guess it's the documentation that's rather misleading, right?
EDIT: Never mind, the correct import was outlines.prompts.render
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about keeping the function, displaying a clear deprecation warning, and only removing it in a future release? d318ac2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with either way: removing now or later, this version of deprecation seems better for something we don't want to support 👍
Though, I'd suggest to make sure that we're refactoring all this old docs and tests of render
somehow meaningfully into new template_from_str
since we're doing all that there anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that template_from_str
should be user-facing. Why would the user use it, and how could that be more ergonomic than Prompt.from_str
? What about renaming them to _template_from_str
and _template_from_file
, so it's more obvious that they are internals? Where is the documentation that we should not forget to update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was more thinking about integration of these docs/tests of render
into tests of template_from_str
, jjust to preserve these assumptions, but not to make template_from_*
user-facing. I agree, that these are internal, we can make it explicit, yes.
I was also searching for docs, seems they're generated based on docstrings: https://dottxt-ai.github.io/outlines/latest/api/prompts/
You can already use any templating engine you want with Outlines, by rendering the template and passing the returned string to a generator: import outlines
model = outlines.models.openai("gpt-4o")
# Input
examples = [
{"question": "What is the capital of France?", "answer": "Paris"},
{"question": "What is 2 + 2?", "answer": "4"},
]
question = "What is the Earth's diameter?"
# Using Jinja2 ...
import jinja2
env = jinja2.Environment(loader=jinja2.FileSystemLoader("."))
template = env.get_template("prompt.jinja2")
rendered = template.render(examples=examples, question=question)
generator = outlines.generate.text(model)
results = generator(rendered)
# ... or, Using Mako ...
import Mako
template = mako.template.Template(filename="prompt.mako")
rendered = template.render(examples=examples, question=question)
generator = outlines.generate.text(model)
results = generator(rendered) The role of the The confusion may come from poor naming on my part; this object should probably be called |
@@ -135,7 +158,6 @@ def test_prompt_kwargs(): | |||
def test_kwarg_tpl(var, other_var="other"): | |||
"""{{var}} and {{other_var}}""" | |||
|
|||
assert test_kwarg_tpl.template == "{{var}} and {{other_var}}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not keep these two .template
checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because template
is no longer a str
but a jinja2.template
, I could try adapting them instead of suppressing them, however, they seemed a bit redundant to me, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, probably type checks will be enough, though I would suggest adding assertions to verify .signature
: check it to be present from decorator call / from_str
and None otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a few more assert
statements to check that the signature
. It's not None
only when the Prompt
is constructed using the decorator (otherwise, it can't infer the signature argument since there is no function).
Thank you very much @rlouf for the clarification. I'm not sure about the naming either, but if it's extensible, that already sounds like a great design! Apologies for sharing ideas maybe a bit too quickly as they pop from my mind, I probably should have checked the documentation first! |
46e150d
to
b0677b7
Compare
b0677b7
to
8b27da3
Compare
No worries! Questions are always welcome |
5aad80d
to
d318ac2
Compare
(Do not merge this PR yet, I will rewrite the commit history a bit once we've settled on deprecating or removing |
5a3771a
to
2b87ecb
Compare
2b87ecb
to
32587ee
Compare
Fix #1345: For now, I'm just trying to make sure I'm editing the right part of the codebase. I haven't managed to run the tests yet…
EDIT: I opened a separate issue about the falling tests 🙂
I noticed that the project use
pytest
, so I assumed that runningpip install ".[test]"
would install all the dependencies needed to run the full test suite. However, it seems that's not the case (it's currently complaining that I don't havevllm
). Should I open an issue (I can add the missing dependency under thetest
section of thepyproject.toml
)?#1356