diff --git a/README.md b/README.md
index 4de8918..aaa7018 100644
--- a/README.md
+++ b/README.md
@@ -62,7 +62,8 @@ script; [waycorner][waycorner-icxes].
* [Tip: auto-pausing at the end of subtitle](#tip-auto-pausing-at-the-end-of-subtitle)
* [Alternatives for anime](#alternatives-for-anime)
* [Manga text extraction](#manga-text-extraction)
- * [Setting up “Manga OCR”](#setting-up-manga-ocr)
+ * [Setting up “Manga OCR” Online](#setting-up-manga-ocr-online)
+ * [Setting up “Manga OCR” (Local)](#setting-up-manga-ocr-local)
* [Setting up OCR.space](#setting-up-ocrspace)
* [Setting up Tesseract OCR](#setting-up-tesseract-ocr)
* [Setting up extra OCR dependencies](#setting-up-extra-ocr-dependencies)
@@ -209,64 +210,69 @@ source material:
### Anime (and other video) text extraction
-Extracting chunks from video is supported through a provided script for the
-**[mpv]** video player.
+Extracting text chunks from video is supported by default through integration
+with the **[mpv]** video player. *Primary* video subtitles from mpv are treated
+by Kamite as incoming chunks. If *secondary* subtitles are present, they are
+treated as **chunk translations**.
-The Kamite mpv script can be found in the `extra/mpv` directory within the
-release package.
+To enable the connection between Kamite and mpv, the latter must be launched
+with the following exact value for the `input-ipc-server` parameter:
-> **Note:** The script requires either *D-Bus* (the `dbus-send` command must be
-globally available) or *curl* for communication with Kamite.
-
-To load the script into mpv, either: 1) copy it to the `scripts`
-subdirectory of the mpv configuration directory (on Linux usually
-`~/.config/mpv/scripts`) and launch mpv as usual, or 2) pass the script’s path
-as the value of the `--script` parameter when launching mpv.
+```sh
+mpv file.mkv --input-ipc-server="/tmp/kamite-mpvsocket"
+```
-> See also: [mpv reference: Script location][mpv-ref-script-location].
+Alternatively, the line
-The Kamite mpv script sends the *primary* video subtitles as chunks to Kamite.
-If *secondary* subtitles are present, it sends those as **chunk translations**.
+```sh
+input-ipc-server=/tmp/kamite-mpvsocket
+```
-To run mpv with an external subtitle file, use the `--sub-file` launch
-parameter. It can be repeated for multiple files. To assign subtitles as
-*primary* (assumed by Kamite to be in Japanese) and *secondary* (assumed to be a
-translation), respectively, use the `--sid` and `--secondary-sid` mpv launch
-parameters. Which subtitle ids to specify can be glanced by pressing
-F9 in mpv while the video file is open.
+can be put into the [mpv config file][mpv-ref-config].
-> See also: [mpv reference: Subtitle options][mpv-ref-sub-options].
+In the former case, mpv will be open for communication with Kamite only when
+launched with the specified parameter. In the latter, it will always be open.
-Beyond the above, Kamite offers further integration with mpv, currently
-amounting to displaying and controlling the play/pause status, as well as basic
-seeking. For this, the mpv JSON IPC communication mechanism is used.
+> For more on the communication mechanism used, see the
+[mpv reference for JSON IPC][mpv-ref-json-ipc].
-To make Kamite automatically connect to a running instance of mpv, the latter
-must be run with the exact following parameter:
+To run mpv with an external subtitle file, use the `--sub-file` launch
+parameter (it can be repeated for multiple subtitle files). To assign subtitle
+tracks as *primary* (assumed by Kamite to be the Japanese subtitles) and
+*secondary* (assumed to be the translations), use the `--sid` and
+`--secondary-sid` mpv launch parameters, respectively. To find out which
+subtitle IDs to specify, press F9 in mpv while the video file is open and the
+subtitles are loaded.
-```sh
-mpv file.mkv --input-ipc-server="/tmp/kamite-mpvsocket"
-```
+Note that subtitles hidden within mpv will still be recognized by Kamite.
-> See also: [mpv reference: JSON IPC][mpv-ref-json-ipc]
+> See also: [mpv reference: Subtitle options][mpv-ref-sub-options].
Below is an excerpt from an example bash script used to quickly launch an anime
episode in mpv in such a way that it is immediately set up to work with Kamite.
```sh
-mpv /path/to/video/**$1*.mkv \ # Episode no. passed as an argument to the script
+mpv "/path/to/video/"*""*"E$1"*".mkv" \ # Episode no. passed as an argument to the script
--input-ipc-server="/tmp/kamite-mpvsocket" \
- --profile=jpsub \ # Custom profile that sets subtitle font and size, etc. See https://mpv.io/manual/stable/#profiles
--sub-file="/path/to/external/subtitles/$1.srt" \
--sid=2 \ # ID of the Japanese subtitles provided externally
--secondary-sid=1 \ # ID of the English subtitles embedded in the video file (to be used as translations)
--secondary-sub-visibility=no \
- --save-position-on-quit
+ --save-position-on-quit \
+ --profile=jpsub # An optional custom profile that can set a special subtitle font and size, etc. It must be defined separately in the mpv config file: see https://mpv.io/manual/stable/#profiles
```
-[mpv-ref-script-location]: https://mpv.io/manual/stable/#script-location
+Kamite can be useful even when viewing media without Japanese subtitles, for
+example as an area where heard words and phrases can be typed in and looked up.
+
+When viewing media with translated subtitles only, Kamite can be instructed to
+treat them as translations for unknown chunks and display them as such, by
+enabling “Translation-only mode” in the Settings tab or by launching with the
+config key `chunk.translationOnlyMode` set to `true`.
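+
+As a minimal sketch, assuming the layout of the `chunk` block shown in the
+[Config](#config) section, the corresponding config file entry could look like
+this:
+
+```sh
+chunk {
+  translationOnlyMode = true
+}
+```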
+
+[mpv-ref-config]: https://mpv.io/manual/stable/#configuration-files
[mpv-ref-json-ipc]: https://mpv.io/manual/stable/#json-ipc
[mpv-ref-sub-options]: https://mpv.io/manual/stable/#subtitles
@@ -351,158 +357,129 @@ Additional tips:
### Manga text extraction
-Kamite integrates with three alternative OCR (Optical Character Recognition)
+Kamite integrates with four alternative OCR (Optical Character Recognition)
providers to enable the extraction of text from manga pages displayed on screen.
The available OCR engines are:
-* [“Manga OCR”][manga-ocr]
-* [OCR.space]
-* [Tesseract OCR][tesseract]
+* [“Manga OCR”][manga-ocr] Online ([a Hugging Face Space by Gryan
+ Galario][manga-ocr-hf-gg])
+* [“Manga OCR”][manga-ocr] (Local)
+* [OCR.space] (Online)
+* [Tesseract OCR][tesseract] (Local)
-**“Manga OCR” is the recommended choice** as it gives superior results for manga
-and does not require sending data to a third party. However, compared with the
-other options, it is also storage- and resource-intensive as well as less simple
-to set up.
+**“Manga OCR” in either variant is the recommended choice** as it gives superior
+results for manga. The online version is extremely simple to set up, but
+requires sending screenshots of portions of your screen to a third party. The
+local version, on the other hand, requires a more involved setup and extra
+system resources.
[manga-ocr]: https://github.com/kha-white/manga-ocr
[tesseract]: https://github.com/tesseract-ocr/tesseract
-**By default, OCR is disabled.** To enable it, set the [config](#config) key
-`ocr.engine` to one of: `mangaocr`, `ocrspace`, or `tesseract` and go through:
-1\) the corresponding engine setup procedure, and 2\) the setup procedure for
-extra platform dependencies, both described below.
-
-* [Setting up “Manga OCR”](#setting-up-manga-ocr)
-* [Setting up OCR.space](#setting-up-ocrspace)
-* [Setting up Tesseract OCR](#setting-up-tesseract-ocr)
-
-* [Setting up extra OCR dependencies](#setting-up-extra-ocr-dependencies)
+**By default, OCR is disabled.** The necessary setup steps are:
-#### Setting up “Manga OCR”
+1. Set the [config](#config) key
+`ocr.engine` to one of: `mangaocr_online`, `mangaocr`, `ocrspace`, or
+`tesseract`.
-> Requires Python; version 3.10 (the latest) *is* supported.
+1. Set up the selected engine:
-**Note:** “Manga OCR” will use up to 2.5 GB of storage space. While initializing,
-it will use up to 1 GB of additional memory over what Kamite normally uses.
+ * [Setting up “Manga OCR” Online](#setting-up-manga-ocr-online)
+ * [Setting up “Manga OCR” (Local)](#setting-up-manga-ocr-local)
+ * [Setting up OCR.space](#setting-up-ocrspace)
+ * [Setting up Tesseract OCR](#setting-up-tesseract-ocr)
-##### Basic option: Global installation
+1. (Linux/Xorg and wlroots platforms only)
+[Set up extra OCR dependencies](#setting-up-extra-ocr-dependencies)
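+
+For the first step, a config file entry along these lines should suffice (a
+sketch only; `mangaocr_online` stands in for whichever engine is chosen):
+
+```sh
+ocr {
+  engine = mangaocr_online
+}
+```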
-> Note that this method will make it harder to reclaim *all* the disk space
-when uninstalling “Manga OCR”, although more than 90% of it could be reclaimed by
-simply running `pip3 uninstall manga-ocr torch` and cleaning the
-`~/.cache/huggingface/transformers/` directory.\
-> If you want to install “Manga OCR” with all its dependencies into a separate
-environment for an easy complete removal, see the *Advanced option* just below.
+#### Setting up “Manga OCR” Online
-1. Get the [pip] package installer and then run:
+> **Note:** The “Manga OCR” Online engine depends on a third-party online
+service ([a Hugging Face Space by Gryan Galario][manga-ocr-hf-gg]), so using it
+involves sending screenshots of portions of your screen to a third party.
+Here is [the stated privacy policy of Hugging Face][huggingface-privacy-policy].
- ```sh
- pip3 install manga-ocr
- ```
-
-1. Run the program manually to verify that it works
-
- ```sh
- manga_ocr
- ```
-
- “Manga OCR” will now download its model. Wait for an output line such as
- `manga_ocr.ocr:__init__:29 - OCR ready`. Once it is displayed, “Manga OCR”
- is ready for use with Kamite. Ignore the error `NotImplementedError: Reading
- images from clipboard…`, as it is irrelevant for Kamite’s use of “Manga
- OCR”.
-
-[pip]: https://pip.pypa.io/en/stable/installation/
-
-##### Advanced option: Custom installation (Poetry)
+The online API used by the “Manga OCR” Online engine is freely accessible and
+consequently *does not* require any setup.
-Here is an example of how to manually install “Manga OCR” into its own [python
-virtual environment][python-venv]. This particular example will use the [Poetry
-dependency manager][python-poetry], but this is not the only way of achieving
-this result.
+Remember to [set up extra OCR dependencies](#setting-up-extra-ocr-dependencies)
+and to launch Kamite with the config key `ocr.engine` set to `mangaocr_online`.
-[python-venv]: https://docs.python.org/3/tutorial/venv.html
-[python-poetry]: https://python-poetry.org/docs/
+[huggingface-privacy-policy]: https://huggingface.co/privacy
-1. Clone the “Manga OCR” repository
+#### Setting up “Manga OCR” (Local)
- ```sh
- git clone "https://github.com/kha-white/manga-ocr.git"
- ```
+**Note:** “Manga OCR” will use up to 2.5 GB of disk space. During launch, it
+will use up to 1 GB of additional memory.
-1. Create a Poetry project
+##### Recommended option: installation using pipx
- ```sh
- cd manga-ocr
- poetry init -n
- ```
+1. Install [Python][installing-python] and [pip]
-1. Register “Manga OCR”’s dependencies with the Poetry project
+1. Install [pipx] and run
```sh
- cat requirements.txt | xargs poetry add -vvv
+ pipx install manga-ocr
```
- The dependencies will be downloaded now. This could take some time.
+Kamite will now be able to use “Manga OCR”. On the first launch of Kamite with
+`ocr.engine` set to `mangaocr`, “Manga OCR” will take some time to download its
+model (around 450 MB). If there are issues, try running the `manga_ocr`
+executable installed by pipx and examining its output.
-1. Verify the installation
+###### Uninstallation
- While in the project directory, run:
+1. Run
```sh
- poetry run python -m manga_ocr
+ pipx uninstall manga-ocr
```
- “Manga OCR” will now download its model. Wait for an output line such as
- `manga_ocr.ocr:__init__:29 - OCR ready`. Once it is displayed, “Manga OCR”
- is ready for use with Kamite. Ignore the error `NotImplementedError: Reading
- images from clipboard…`, as it is irrelevant for Kamite’s use of “Manga
- OCR”.
+1. Delete the ~450 MB leftover model file in
+`~/.cache/huggingface/transformers/`.
-1. Tell Kamite how to launch “Manga OCR”
+###### Troubleshooting “pipx "Manga OCR" installation absent…”
- A launcher script must be created that: 1) prepares the Python environment
- containing the “Manga OCR” installation, and 2) inside that environment
- launches a “Manga OCR” wrapper script provided by Kamite. The launcher script
- must be named `mangaocr.sh` and placed directly in Kamite’s config directory
- (next to the `config.hocon` file). The following is an example of such
- script for a Poetry project:
+If pipx did not install to the default path expected by Kamite, you will have to
+specify the path manually in the [config file](#config):
- ```sh
- #!/usr/bin/env bash
- PROJECT_PATH="/path/to/cloned/manga-ocr/"
- cd $PROJECT_PATH || exit
- PYTHONPATH=$PYTHONPATH:$PROJECT_PATH poetry run python "$1"
- ```
-
-Remember to launch Kamite with the config key `ocr.engine` set to `mangaocr`.
-
-***
+```sh
+ocr {
+ mangaocr {
+ pythonPath = "/home//.local/pipx/venvs/manga-ocr/bin/python"
+ }
+}
+```
-To completely reclaim your disk space from “Manga OCR” in this scenario:
+The above path is the default, which you will need to modify according to the
+output you get from running
-1. Delete the Poetry project’s virtual environment
+```sh
+pipx list
+```
- While in the project directory, run:
+[installing-python]: https://realpython.com/installing-python/
+[pip]: https://pip.pypa.io/en/stable/installation/
+[pipx]: https://pypa.github.io/pipx/
- ```sh
- poetry env remove python
- ```
+##### Custom installation
-1. Delete the project itself
+If you install “Manga OCR” by means other than pipx, you will need to manually
+specify a path to a Python executable (or a wrapper) that runs within an
+environment where the `manga_ocr` module is available. For example, if it is
+installed globally and the system Python executable is on `PATH` under the name
+`python`, then the appropriate configuration is simply:
- ```sh
- cd ..
- rm -rf manga-ocr
- ```
-
-1. Clear Poetry package cache
-
- ```sh
- poetry cache clear pypi --all
- ```
+```sh
+ocr {
+ mangaocr {
+ pythonPath = python
+ }
+}
+```
-1. Find the ~450 MB file in `~/.cache/huggingface/transformers/` and delete it
+**Uninstallation note**: There will be a ~450 MB leftover model file in
+`~/.cache/huggingface/transformers/`.
#### Setting up OCR.space
@@ -550,7 +527,12 @@ Remember to launch Kamite with the config key `ocr.engine` set to `ocrspace`.
`tessdata` directory (usually `/usr/[local/]share/tessdata/` or
`/usr/share/tesseract-ocr//tessdata`).
-Remember to launch Kamite with the config key `ocr.engine` set to `tesseract`.
+By default, Tesseract is expected to be available on `PATH` under the executable
+name `tesseract`. If this is not the case, the [config](#config) key
+`ocr.tesseract.path` needs to be set to the executable’s path.
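+
+For example (the path below is hypothetical and must be replaced with the
+actual location of the executable):
+
+```sh
+ocr {
+  tesseract {
+    # Hypothetical location; adjust to your system
+    path = "/opt/tesseract/bin/tesseract"
+  }
+}
+```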
+
+Once the setup is completed, you can launch Kamite with the config key
+`ocr.engine` set to `tesseract`.
#### Setting up extra OCR dependencies
@@ -571,6 +553,9 @@ tasks. You need to install them on your own.
Used for selecting a screen region or point.
grim
Used for taking screenshots for OCR.
+ wlrctl
+ (Optional) Used to trigger a mouse click for OCR Auto Block Instant
+ mode.
#### OCR usage
@@ -613,7 +598,7 @@ block.
Select a point within a block of text; Kamite will try to infer the extent of
the block and then OCR the resulting area.
-*This should be good enough for 90% of typical manga text blocks, but the
+*This should be good enough for more than 90% of typical manga text blocks, but the
block detection algorithm has a lot of room for improvement.*
**Note for Linux/Xorg users:** On Xorg, the point selection mechanism cannot be
@@ -864,8 +849,8 @@ screen corners but also edges)
direct lookup with pop-up dictionaries.
Mangareader
An in-browser manga reader with built-in support for OCR-ing selected
- regions using an online API backed by “Manga OCR”.
-
+ regions using an online API backed by “Manga OCR”. Can be used in tandem with
+ Kamite with the help of the Clipboard Inserter browser extension.
Cloe
OCRs a screen selection to clipboard using “Manga OCR”.
Poricom
@@ -942,10 +927,9 @@ Textractor for games. Some other alternatives are:
Text can be pasted from clipboard by pressing Ctrl + V in
Kamite’s browser tab.
-
+The Kamite browser client can automatically pick up clipboard text with the
+Clipboard Inserter browser extension ([Firefox][clipboard-inserter-ff],
+[Chrome][clipboard-inserter-chrome]), assuming default extension settings.
[clipboard-inserter-ff]: https://addons.mozilla.org/en-US/firefox/addon/clipboard-inserter/
[clipboard-inserter-chrome]: https://chrome.google.com/webstore/detail/clipboard-inserter/deahejllghicakhplliloeheabddjajm
@@ -1203,7 +1187,7 @@ Seek +1 seconds.
#### Linux/Xorg
-> Note: The following does not work in Linux/wlroots.
+> Note: The following does not work on Linux/wlroots.
Below is an excerpt from a [config file](#config) illustrating how to set up
global keyboard shortcuts and what actions are available for binding.
@@ -1213,7 +1197,8 @@ keybindings {
global {
ocr {
manualBlock = …
- autoBlock = …
+ autoBlock = … # Instant detection under mouse cursor
+ autoBlockSelect = … # Must click to select a point
}
}
}
@@ -1308,6 +1293,11 @@ chunk {
# Whether to flash backgrounds of chunk texts in the client's interface on
# certain occasions
flash = true
+
+ # Whether to treat incoming chunks as translations and create a new chunk for
+ # each translation. Useful when watching media with just the translation
+ # subtitles
+ translationOnlyMode = false
}
commands {
@@ -1348,7 +1338,8 @@ keybindings {
# A key combination to assign to the command. See the "Keyboard shortcuts"
# section of the Readme for the format specification.
manualBlock = …
- autoBlock = …
+ autoBlock = … # Instant detection under mouse cursor
+ autoBlockSelect = … # Must click to select a point
}
}
}
@@ -1373,12 +1364,25 @@ lookup {
}
ocr {
- # Which OCR engine to use: none, tesseract, mangaocr
+ # The OCR engine to use: none, tesseract, mangaocr, mangaocr_online, ocrspace
engine = none
# (Directory path) Watch the specified directory for new/modified images and
# OCR them automatically
watchDir = …
+ tesseract {
+ # (File path) The path to Tesseract’s executable
+ path = "tesseract"
+ }
+
+ mangaocr {
+ # (File path) A path to a Python executable that provides access to the
+ # `manga_ocr` module. If absent, a system-dependent default is used which
+ # assumes that manga-ocr was installed through pipx into the default
+ # location
+ pythonPath = …
+ }
+
# A *list* of OCR regions, for each of which a region recognition command
# button will be displayed in the command palette. See the "OCR region"
# section of the Readme for details
@@ -1532,7 +1536,7 @@ Available commands are distinguished by *command kind*, which is made up of two
segments: *command group* and *command name*. For example, kind `ocr_region`
corresponds to the group `ocr` and the name `region`.
-Commands have zero or more required parameters.
+The command parameters are required unless a default value is specified.
### Sending commands
@@ -1568,16 +1572,20 @@ block, Kamite OCRs the area as is. For Tesseract, the *vertical* text model is
used by default.
**`manual-block-vertical`**\
-(Tesseract only) Like `manual-block`, but explicitly uses the vertical text model.
+(Tesseract only) Like `manual-block`, but explicitly uses the vertical text
+model.
**`manual-block-horizontal`**\
-(Tesseract only) Like `manual-block`, but explicitly uses the horizontal text model.
+(Tesseract only) Like `manual-block`, but explicitly uses the horizontal text
+model.
-**`auto-block`**\
-User is prompted to select a screen point within a source text block, Kamite
-attempts to infer the extent of the block and OCRs the resulting area.
+**`auto-block`** `(mode: ["select" | "instant"] = "instant")`\
+Kamite assumes the mouse cursor is inside a source text block, attempts to infer
+the extent of the block, and OCRs the resulting area. The `mode` parameter
+specifies whether to prompt the user to click a point or to instantly take the
+current cursor position.
-**`region`** `(x, y, width, height: number, autoNarrow: boolean)`\
+**`region`** `(x, y, width, height: number; autoNarrow: bool)`\
Kamite OCRs the provided screen area either as is (if `autoNarrow` is `false`), or
after applying an algorithm designed to narrow the area to just text (if `autoNarrow`
is `true`). **Note:** This is an experimental future, it might function poorly
@@ -1632,8 +1640,12 @@ Kamite never saves your data to disk.
Kamite never sends your data through the network, with the following exceptions:
-* When `ocr.engine` is set to `ocrspace`, screenshots of portions of the user’s
- screen are sent to [OCR.space] for text recognition.
+* When `ocr.engine` is set to `mangaocr_online`, screenshots of portions of your
+ screen are sent to a [Hugging Face Space][manga-ocr-hf-gg] for text
+ recognition.
+
+* When `ocr.engine` is set to `ocrspace`, screenshots of portions of your screen
+ are sent to [OCR.space] for text recognition.
## Development
@@ -1789,4 +1801,5 @@ the original license notices.
[Yomichan]: https://foosoft.net/projects/yomichan/
[Gomics-v]: https://github.com/fauu/gomicsv
[Sway]: https://swaywm.org/
+[manga-ocr-hf-gg]: https://huggingface.co/spaces/gryan-galario/manga-ocr-demo
[OCR.space]: https://ocr.space/