correct ratio determination for noise estimation #53

Open
wants to merge 11 commits into
base: pdf-metadata-tooling
from

@rmast rmast commented Jun 25, 2022

I solved issue #52 myself.

@MerlijnWajer
Collaborator

Thanks -- I will review this tonight or tomorrow at the latest; I'm mostly on the road today.

@rmast
Author

rmast commented Jun 26, 2022

The second commit is for solving this error:
#55 (comment)

@MerlijnWajer
Collaborator

btw, I think I fixed this in 3c20a46 - can you confirm?

@rmast
Author

rmast commented Nov 30, 2022

> btw, I think I fixed this in 3c20a46 - can you confirm?

I haven't set everything up again or retested it, but I read back through the issues to see what we were trying to solve.
In #52, specifically #52 (comment), there is an inline patch to mrc.py for the inversion that I don't see reflected. So I can imagine that not every inversion case is handled correctly.

The issue with the doubled text (Array) is caused by a segmentation bug in Tesseract, which I tried to crack during my summer holiday. However, the Tesseract project has too little testing capacity and core knowledge to allow the core changes needed to repair this segmentation, which led the superior EasyOCR segmentation to emerge.

At the end of my summer holiday this year, I tried to build a completely new inversion based on the EasyOCR segmentation, with an algorithm that compares the inner color and the outer color of the found segments to decide whether to invert. Unfortunately, I didn't have the time to mold it into a working product.
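
The inner-versus-outer color comparison could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: the function name `should_invert`, the `(x0, y0, x1, y1)` box layout, and the `margin` parameter are hypothetical, the image is assumed to be a 2-D grayscale NumPy array, and the box is assumed to come from a text detector such as EasyOCR and to lie within the image.

```python
import numpy as np


def should_invert(gray, box, margin=3):
    """Guess whether a text segment is light-on-dark and needs inversion.

    gray   -- 2-D grayscale image array (hypothetical input format)
    box    -- (x0, y0, x1, y1) segment bounds, assumed inside the image
    margin -- width in pixels of the surrounding ring used as "outer" color
    """
    x0, y0, x1, y1 = box
    h, w = gray.shape

    # Expand the box by `margin` pixels, clipped to the image bounds.
    ox0, oy0 = max(x0 - margin, 0), max(y0 - margin, 0)
    ox1, oy1 = min(x1 + margin, w), min(y1 + margin, h)

    inner = gray[y0:y1, x0:x1].astype(np.float64)
    outer = gray[oy0:oy1, ox0:ox1].astype(np.float64)

    # The ring is the expanded box minus the segment itself.
    ring_count = outer.size - inner.size
    if inner.size == 0 or ring_count == 0:
        return False  # nothing to compare against, leave as-is

    ring_mean = (outer.sum() - inner.sum()) / ring_count

    # Dark text on a light page makes the segment darker than its ring;
    # a brighter segment suggests light text on a dark background.
    return bool(inner.mean() > ring_mean)
```

Comparing plain means is the simplest possible choice here; a median or a per-channel comparison on the color image might be more robust, but that would need testing on real scans.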

@rmast
Author

rmast commented Dec 30, 2022

This Christmas holiday my attention has been drawn to the new AI programming capabilities of OpenAI Codex, riding the ChatGPT hype. As I'm really bad at Cython programming, I'm trying to let Codex make my code for a new context-sensitive inverter consistent and improve it. I wonder whether there is another approach for interpreting and segmenting documents at a more intelligent level: https://x-decoder-vl.github.io/


2 participants