Merge pull request #162 from acon96/release/v0.3.2
Release v0.3.2
acon96 authored Jun 8, 2024
2 parents f407e53 + 8f51188 commit 9f7aa19
Showing 11 changed files with 193 additions and 74 deletions.
8 changes: 7 additions & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
@@ -7,7 +7,13 @@ assignees: ''

---

***Please do not report issues with the model generating incorrect output. This includes any instance where the model responds with `Failed to run: ...` or outputs badly formatted responses. If you are having trouble getting the correct output from the model, please open a Discussion thread instead.***
<!--
Please do not report issues with the model generating incorrect output. This includes any instance where the model responds with `Failed to run: ...` or outputs badly formatted responses. If you are having trouble getting the correct output from the model, please open a Discussion thread instead.
If you recently updated Home Assistant to a newly released version, please indicate that in your report.
-->

**Describe the bug**
A clear and concise description of what the bug is.
1 change: 1 addition & 0 deletions README.md
@@ -132,6 +132,7 @@ In order to facilitate running the project entirely on the system where Home Ass
## Version History
| Version | Description |
|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| v0.3.2 | Fix for exposed script entities causing errors, fix missing GBNF error, trim whitespace from model output |
| v0.3.1 | Adds basic area support in prompting, Fix for broken requirements, fix for issue with formatted tools, fix custom API not registering on startup properly |
| v0.3 | Adds support for Home Assistant LLM APIs, improved model prompting and tool formatting options, and automatic detection of GGUF quantization levels on HuggingFace |
| v0.2.17 | Disable native llama.cpp wheel optimizations, add Command R prompt format |
23 changes: 16 additions & 7 deletions custom_components/llama_conversation/agent.py
@@ -394,10 +394,12 @@ async def async_process(

try:
tool_response = await llm_api.async_call_tool(tool_input)
_LOGGER.debug("Tool response: %s", tool_response)
except (HomeAssistantError, vol.Invalid) as e:
tool_response = {"error": type(e).__name__}
if str(e):
tool_response["error_text"] = str(e)
_LOGGER.debug("Tool response: %s", tool_response)

intent_response = intent.IntentResponse(language=user_input.language)
intent_response.async_set_error(
@@ -408,8 +410,6 @@ async def async_process(
response=intent_response, conversation_id=conversation_id
)

_LOGGER.debug("Tool response: %s", tool_response)

# handle models that generate a function call and wait for the result before providing a response
if self.entry.options.get(CONF_TOOL_MULTI_TURN_CHAT, DEFAULT_TOOL_MULTI_TURN_CHAT):
conversation.append({"role": "tool", "message": json.dumps(tool_response)})
@@ -436,7 +436,7 @@ async def async_process(

# generate intent response to Home Assistant
intent_response = intent.IntentResponse(language=user_input.language)
intent_response.async_set_speech(to_say)
intent_response.async_set_speech(to_say.strip())
return ConversationResult(
response=intent_response, conversation_id=conversation_id
)
@@ -672,7 +672,8 @@ def expose_attributes(attributes) -> list[str]:
"state": state,
"attributes": exposed_attributes,
"area_name": attributes.get("area_name"),
"area_id": attributes.get("area_id")
"area_id": attributes.get("area_id"),
"is_alias": False
})
if "aliases" in attributes:
for alias in attributes["aliases"]:
@@ -683,17 +684,25 @@ def expose_attributes(attributes) -> list[str]:
"state": state,
"attributes": exposed_attributes,
"area_name": attributes.get("area_name"),
"area_id": attributes.get("area_id")
"area_id": attributes.get("area_id"),
"is_alias": True
})

if llm_api:
if llm_api.api.id == HOME_LLM_API_ID:
service_dict = self.hass.services.async_services()
all_services = []
scripts_added = False
for domain in domains:
# scripts show up as individual services
if domain == "script":
all_services.extend(["script.reload()", "script.turn_on()", "script.turn_off()", "script.toggle()"])
if domain == "script" and not scripts_added:
all_services.extend([
("script.reload", vol.Schema({}), ""),
("script.turn_on", vol.Schema({}), ""),
("script.turn_off", vol.Schema({}), ""),
("script.toggle", vol.Schema({}), ""),
])
scripts_added = True
continue

for name, service in service_dict.get(domain, {}).items():
34 changes: 25 additions & 9 deletions custom_components/llama_conversation/config_flow.py
@@ -636,18 +636,34 @@ async def async_step_model_parameters(
schema = vol.Schema(local_llama_config_option_schema(self.hass, selected_default_options, backend_type))

if user_input:
if not user_input.get(CONF_REFRESH_SYSTEM_PROMPT) and user_input.get(CONF_PROMPT_CACHING_ENABLED):
errors["base"] = "sys_refresh_caching_enabled"

if user_input.get(CONF_USE_GBNF_GRAMMAR):
filename = user_input.get(CONF_GBNF_GRAMMAR_FILE, DEFAULT_GBNF_GRAMMAR_FILE)
if not os.path.isfile(os.path.join(os.path.dirname(__file__), filename)):
errors["base"] = "missing_gbnf_file"
description_placeholders["filename"] = filename

if user_input.get(CONF_USE_IN_CONTEXT_LEARNING_EXAMPLES):
filename = user_input.get(CONF_IN_CONTEXT_EXAMPLES_FILE, DEFAULT_IN_CONTEXT_EXAMPLES_FILE)
if not os.path.isfile(os.path.join(os.path.dirname(__file__), filename)):
errors["base"] = "missing_icl_file"
description_placeholders["filename"] = filename

if user_input[CONF_LLM_HASS_API] == "none":
user_input.pop(CONF_LLM_HASS_API)

try:
# validate input
schema(user_input)

self.options = user_input
return await self.async_step_finish()
except Exception as ex:
_LOGGER.exception("An unknown error has occurred!")
errors["base"] = "unknown"
if len(errors) == 0:
try:
# validate input
schema(user_input)

self.options = user_input
return await self.async_step_finish()
except Exception as ex:
_LOGGER.exception("An unknown error has occurred!")
errors["base"] = "unknown"

return self.async_show_form(
step_id="model_parameters", data_schema=schema, errors=errors, description_placeholders=description_placeholders,
2 changes: 1 addition & 1 deletion custom_components/llama_conversation/const.py
@@ -323,5 +323,5 @@
}
}

INTEGRATION_VERSION = "0.3.1"
INTEGRATION_VERSION = "0.3.2"
EMBEDDED_LLAMA_CPP_PYTHON_VERSION = "0.2.77"
2 changes: 1 addition & 1 deletion custom_components/llama_conversation/manifest.json
@@ -1,7 +1,7 @@
{
"domain": "llama_conversation",
"name": "Local LLM Conversation",
"version": "0.3.1",
"version": "0.3.2",
"codeowners": ["@acon96"],
"config_flow": true,
"dependencies": ["conversation"],
29 changes: 29 additions & 0 deletions custom_components/llama_conversation/output.gbnf
@@ -0,0 +1,29 @@
root ::= (tosay "\n")+ functioncalls?

tosay ::= [0-9a-zA-Z #%.?!]*
functioncalls ::=
"```homeassistant\n" (object ws)* "```"

value ::= object | array | string | number | ("true" | "false" | "null") ws
object ::=
"{" ws (
string ":" ws value
("," ws string ":" ws value)*
)? "}" ws

array ::=
"[" ws (
value
("," ws value)*
)? "]" ws

string ::=
"\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
)* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws

# Optional space: by convention, applied in this grammar after literal chars when allowed
ws ::= ([ \t\n] ws)?
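
The grammar above restricts model output to one or more plain-text lines, optionally followed by a fenced `homeassistant` block of JSON tool calls. As a rough illustration (not part of this commit), here is how a GBNF file like this can be applied with llama-cpp-python, the embedded backend pinned elsewhere in this release; the model path, prompt, and sampling settings are placeholders:

```python
# Hedged sketch: load the GBNF grammar and use it to constrain generation with
# llama-cpp-python. Paths, prompt, and sampling settings are illustrative only.
from llama_cpp import Llama, LlamaGrammar

grammar = LlamaGrammar.from_file(
    "custom_components/llama_conversation/output.gbnf"
)

llm = Llama(model_path="/path/to/model.gguf")  # hypothetical GGUF model file

result = llm.create_completion(
    prompt="Turn on the kitchen light.",
    grammar=grammar,   # sampling is restricted to strings the grammar accepts
    max_tokens=256,
)

# A grammar-conforming completion looks roughly like:
#   Turning on the kitchen light.
#   ```homeassistant
#   {"service": "light.turn_on", "target_device": "light.kitchen"}
#   ```
print(result["choices"][0]["text"])
```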
9 changes: 6 additions & 3 deletions custom_components/llama_conversation/translations/en.json
@@ -9,7 +9,10 @@
"missing_model_file": "The provided file does not exist.",
"other_existing_local": "Another model is already loaded locally. Please unload it or configure a remote model.",
"unknown": "Unexpected error",
"pip_wheel_error": "Pip returned an error while installing the wheel! Please check the Home Assistant logs for more details."
"pip_wheel_error": "Pip returned an error while installing the wheel! Please check the Home Assistant logs for more details.",
"sys_refresh_caching_enabled": "System prompt refresh must be enabled for prompt caching to work!",
"missing_gbnf_file": "The GBNF file was not found: {filename}",
"missing_icl_file": "The in context learning example CSV file was not found: {filename}"
},
"progress": {
"download": "Please wait while the model is being downloaded from HuggingFace. This can take a few minutes.",
@@ -157,8 +160,8 @@
},
"error": {
"sys_refresh_caching_enabled": "System prompt refresh must be enabled for prompt caching to work!",
"missing_gbnf_file": "The GBNF file was not found: '{filename}'",
"missing_icl_file": "The in context learning example CSV file was not found: '{filename}'"
"missing_gbnf_file": "The GBNF file was not found: {filename}",
"missing_icl_file": "The in context learning example CSV file was not found: {filename}"
}
},
"selector": {
40 changes: 40 additions & 0 deletions docs/Setup.md
@@ -14,6 +14,11 @@
* [Step 1: Downloading and serving the Model](#step-1-downloading-and-serving-the-model)
* [Step 2: Connect to the Ollama API](#step-2-connect-to-the-ollama-api)
* [Step 3: Model Configuration](#step-3-model-configuration-1)
* [Path 3: Using Llama-3-8B-Instruct with LM Studio](#path-3-using-llama-3-8b-instruct-with-lm-studio)
* [Overview](#overview-2)
* [Step 1: Downloading and serving the Model](#step-1-downloading-and-serving-the-model-1)
* [Step 2: Connect to the LM Studio API](#step-2-connect-to-the-lm-studio-api)
* [Step 3: Model Configuration](#step-3-model-configuration-2)
* [Configuring the Integration as a Conversation Agent](#configuring-the-integration-as-a-conversation-agent)
* [Finished!](#finished)

@@ -103,6 +108,41 @@ Once the desired API has been selected, scroll to the bottom and click `Submit`.

> NOTE: The key settings in this case are that our prompt references the `{{ response_examples }}` variable and the `Enable in context learning (ICL) examples` option is turned on.
## Path 3: Using Llama-3-8B-Instruct with LM Studio
### Overview
Another model you can use if you have a GPU is Meta's Llama-3-8B-Instruct model. This path assumes you have a machine with a GPU that already has [LM Studio](https://lmstudio.ai/) installed on it. It uses in-context learning examples to prompt the model to produce the output that we expect.

### Step 1: Downloading and serving the Model
Llama 3 8B can be downloaded and set up on the serving machine using LM Studio as follows:
1. Search for `lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF` in the main interface.
2. Select and download the version of the model that is recommended for your VRAM configuration.
3. Select the 'Local Server' tab on the left side of the application.
4. Load the model by selecting it from the bar in the top middle of the screen. The server should start automatically when the model finishes loading.
5. Take note of the port that the server is running on.

### Step 2: Connect to the LM Studio API

1. In Home Assistant: navigate to `Settings > Devices and Services`
2. Select the `+ Add Integration` button in the bottom right corner
3. Search for, and select `Local LLM Conversation`
4. Select `Generic OpenAI Compatible API` from the dropdown and click `Submit`
5. Set up the connection to the API:
    - **IP Address**: enter the IP address of the machine hosting LM Studio
- **Port**: enter the port that was listed in LM Studio
- **Use HTTPS**: unchecked
- **Model Name**: This can be any value, as LM Studio uses the currently loaded model for all incoming requests.
- **API Key**: leave blank
6. Click `Submit`
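
If the connection fails at this point, it can help to confirm that the LM Studio server is reachable from the Home Assistant host before retrying. The sketch below (not part of the integration) queries LM Studio's OpenAI-compatible endpoint; the IP address and port are examples and must match the values noted in Step 1:

```python
# Optional connectivity check against LM Studio's OpenAI-compatible API.
# Replace host and port with the values shown in LM Studio's Local Server tab.
import requests

base_url = "http://192.168.1.50:1234"  # example host:port

resp = requests.get(f"{base_url}/v1/models", timeout=5)
resp.raise_for_status()
print(resp.json())  # should list the model currently loaded in LM Studio
```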

### Step 3: Model Configuration
This step allows you to configure how the model is "prompted". See [here](./Model%20Prompting.md) for more information on how that works.

For now, defaults for the model should have been populated. If you would like the model to be able to control devices, then you must select the `Assist` API.

Once the desired API has been selected, scroll to the bottom and click `Submit`.

> NOTE: The key settings in this case are that our prompt references the `{{ response_examples }}` variable and the `Enable in context learning (ICL) examples` option is turned on.
## Configuring the Integration as a Conversation Agent
Now that the integration is configured and providing a conversation agent, we need to configure Home Assistant to use our conversation agent instead of the built-in intent recognition system.
