Belgian ID Cards: Issue with Document Numbers Exceeding 9 Characters in TD1 #4
Comments
Hi @5an1ty! It seems that Belgium has 3 types of ID cards. The first one complies with the ICAO specs (Belgian citizens). The other two (kids and foreigners) work as you have explained, so it seems that Belgium is "twisting" the ICAO specifications.

According to ICAO 9303-5 (TD1) 4.2.2.1, the document number occupies MRZ character positions 6 to 14 of line 1 and the document number hash occupies position 15, so this case is outside the scope of mrz. However, these problems can usually be solved simply by overriding some property (see issue #3), in this case the document_number_hash property. A possible solution could be:

    #!/usr/bin/python3
    # -*- coding: utf-8 -*-
    from mrz.checker.td1 import *
    from mrz.base.functions import hash_is_ok


    class TD1BELCodeChecker(TD1CodeChecker):
        @property
        def document_number_hash(self) -> bool:
            """Return True if the hash of the document number is validated, False otherwise."""
            doc_number_fin = self._optional_data.rstrip("<")
            # A "<" in the hash position means the document number continues in
            # optional data: extra characters first, then the real hash.
            # The extra check avoids an IndexError when optional data is empty.
            if self._document_number_hash == "<" and doc_number_fin:
                self._document_number = self._document_number + "<" + doc_number_fin[:-1]
                self._document_number_hash = doc_number_fin[-1]
            return self._report("document number hash",
                                hash_is_ok(self._document_number, self._document_number_hash))

Usage:

    from mrz.checker.td1_belgian import TD1BELCodeChecker

    # CASE 1
    code_citizens = ("IDBEL590330101085020100200<<<<\n"
                     "8502016F0901015BEL<<<<<<<<<<<8\n"
                     "VAN<DER<VELDEN<<GREET<HILDE<<<")

    # CASE 2
    mrz_code_kids = ("IDBEL000610035<7017<<<<<<<<<<<\n"
                     "0002015F0910190BEL000201002003\n"
                     "MAES<<SOPHIE<ANN<G<<<<<<<<<<<<")

    td1_check_citz = TD1BELCodeChecker(code_citizens)
    print("CASE 1:%s" % td1_check_citz)

    td1_check_kids = TD1BELCodeChecker(mrz_code_kids)
    print("CASE 2:%s" % td1_check_kids)

    # CASE 3: Let's change document number hash
    mrz_code_kids = ("IDBEL000610035<7010<<<<<<<<<<<\n"
                     "0002015F0910190BEL000201002003\n"
                     "MAES<<SOPHIE<ANN<G<<<<<<<<<<<<")

    td1_check_kids = TD1BELCodeChecker(mrz_code_kids)
    print("CASE 3:%s" % td1_check_kids)
    print("FALSES CASE 3:")
    print(td1_check_kids.report_falses)

Output:
This solution is valid for the 3 types of Belgian ID cards. It's a very quick solution, so I'm sure it can be improved. For example, if you want to report children's and foreigners' ID cards as a warning:

    @property
    def document_number_hash1(self) -> bool:
        """Return True if the hash of the document number is validated, False otherwise."""
        ok = True
        if self._document_number_hash == "<":
            doc_number_fin = self._optional_data.rstrip("<")
            self._document_number = self._document_number + "<" + doc_number_fin[:-1]
            self._document_number_hash = doc_number_fin[-1]
            # kind=1 reports the entry as a warning
            self._report("Possible Kids or Foreigners ID Card", kind=1)
            ok = not self._compute_warnings
        return self._report("document number hash",
                            ok and hash_is_ok(self._document_number, self._document_number_hash))

Output:
I hope I've helped. Regards. PS: I'm thinking that maybe it could be a good idea to create a folder to store all these special cases outside of ICAO specs |
Hi, thank you for replying! It helps me a lot! However your explanation is not fully correct. I have verified 3 full Belgian eID cards (not kids or foreigners) and they also don't follow the actual ICAO 9303-5 (TD1) 4.2.2.1 spec. They have the same exception as the kids and foreigners cards like you describe above. I guess it's mostly newer cards that have a high enough document number. It would be nice indeed to also support special cases and have them in another folder. By the way: |
Hi again! I understand. I'm from Spain and we also have 2 types of cards. In the old cards the national identification number is in the document_number field; in the new cards that number is assigned to the optional_data field and the document_number field holds the number of the physical support of the card (a real mess!).

I don't know if it's what you're looking for, but the library has several methods to report the result. For example, continuing with the previous example:

    # CASE 3: Let's change document number hash
    mrz_code = ("IDBEL000610035<7010<<<<<<<<<<<\n"
                "0002015F0910190BEL000201002003\n"
                "MAES<<SOPHIE<ANN<G<<<<<<<<<<<<")

    td1_check = TD1BELCodeChecker(mrz_code)
    print("CASE 3:%s" % td1_check)

    print("\nList of tuples with all the fields analyzed:")
    print(td1_check.report)

    if not td1_check:
        print("\nList of tuples (same as above but only returns Falses):")
        print(td1_check.report_falses)

    print("\nList with errors:")  # I've never liked it (it's possible that I can change or eliminate it)
    print(td1_check.report_errors)

    print("\nList with warnings:")  # same as above
    print(td1_check.report_warnings)

    for field, result in td1_check.report:
        print(field.title().ljust(30, "."), result)

Output:
|
Hi, what's up! This issue was solved with a "special case". It's possible to check Belgian ID cards with this class, but I think there is nothing to generate their MRZ code. I'm very busy right now. However, let me study this issue again, and when I have a little free time I will try to find a solution. BR |
Advance: Hi again @imanenter. Although the problem is not solved, I know how the Belgian ID card 'mechanism' works. Taking your picture and two from above:

    from mrz.generator.td1 import TD1CodeGenerator

    # 000590448 301
    print(TD1CodeGenerator("ID",               # Document type
                           "Belgium",          # Country
                           "000590448",        # Document number
                           "850101",           # Birth date
                           "F",                # Genre
                           "170203",           # Expiry date
                           "Belgium",          # Nationality
                           "Le Meunier",       # Surname
                           "Jennifer Anne",    # Given name(s)
                           "3016",             # Optional data 1
                           "85010100200"))     # Optional data 2

    # 000610035 7017
    print(TD1CodeGenerator("ID",               # Document type
                           "Belgium",          # Country
                           "000610035",        # Document number
                           "000201",           # Birth date
                           "F",                # Genre
                           "091019",           # Expiry date
                           "Belgium",          # Nationality
                           "Maes",             # Surname
                           "Sophie Ann G",     # Given name(s)
                           "7017",             # Optional data 1
                           "00020100200"))     # Optional data 2

    # B10032650 08
    print(TD1CodeGenerator("ID",               # Document type
                           "BEL",              # Country
                           "B10032650",        # Document number
                           "821020",           # Birth date
                           "F",                # Genre
                           "060131",           # Expiry date
                           "New Zealand",      # Nationality
                           "Flores",           # Surname
                           "Gema Caroline J",  # Given name(s)
                           "08",               # Optional data 1
                           "82102008472"))     # Optional data 2

I got this output:
The result is (almost) correct. As you can see, it has only been necessary to set […]. It would only be necessary to disable […]. All of this takes a long time. I will continue working on it when I next have some free time. Regards |
hi Arg0s1080 |
Hi again @imanenter. Another weekend 😊... There are many ways to solve the problem. I chose the way that I think is most correct. There are still some small details to polish, but it is functional.

Draft: taking your picture and two from above:

    from mrz.special_cases.generator.belgium_id_card import TD1BELCodeGenerator

    # 000590448 301
    print(TD1BELCodeGenerator("ID",               # Document type
                              "Belgium",          # Country
                              "000590448 301",    # Document number
                              "850101",           # Birth date
                              "F",                # Genre
                              "170203",           # Expiry date
                              "Belgium",          # Nationality
                              "Le Meunier",       # Surname
                              "Jennifer Anne",    # Given name(s)
                              "",                 # Optional data 1: This field is null. I still have to think what to do with it
                              "85010100200"))     # Optional data 2
    print()

    # 000610035 701 7
    print(TD1BELCodeGenerator("ID",               # Document type
                              "Belgium",          # Country
                              "000610035 701",    # Document number
                              "000201",           # Birth date
                              "F",                # Genre
                              "091019",           # Expiry date
                              "Belgium",          # Nationality
                              "Maes",             # Surname
                              "Sophie Ann G",     # Given name(s)
                              "blahblah",         # Optional data 1. Cancelled
                              "00020100200"))     # Optional data 2
    print()

    # B10032650 0 8
    print(TD1BELCodeGenerator("ID",               # Document type
                              "BEL",              # Country
                              "B100326500",       # Document number
                              "821020",           # Birth date
                              "F",                # Genre
                              "060131",           # Expiry date
                              "New Zealand",      # Nationality
                              "Flores",           # Surname
                              "Gema Caroline J",  # Given name(s)
                              "",                 # Optional data 1. Cancelled
                              "82102008472"))     # Optional data 2

Output:
Result is (totally) correct As you can see above,
The hash is calculated automatically. I also want to include the ability to add the hash manually:
BR |
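For readers following along, here is a minimal standalone sketch of how line 1 could be assembled for a document number longer than 9 characters. It is not the code of TD1BELCodeGenerator; the names check_digit and td1_line1 are hypothetical, and it assumes the layout discussed in this thread: the first 9 characters, a "<" in the hash position, then the remaining characters followed by the hash (7-3-1 weighting, with "<" counting as 0).

```python
def check_digit(field: str) -> str:
    """Check digit with the repeating 7-3-1 weights ('<' = 0, A = 10 ... Z = 35)."""
    values = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    total = 0
    for i, ch in enumerate(field):
        value = 0 if ch == "<" else values.index(ch)  # ValueError on invalid chars
        total += value * (7, 3, 1)[i % 3]
    return str(total % 10)


def td1_line1(doc_type: str, country_code: str, document_number: str) -> str:
    """Build TD1 line 1 (hypothetical helper; optional data 1 is left empty)."""
    number = document_number.replace(" ", "").upper()
    if len(number) <= 9:
        # Regular case: 9-character field followed by its hash.
        principal = number.ljust(9, "<")
        body = principal + check_digit(principal)
    else:
        # Long case: '<' replaces the hash, the rest spills into optional data.
        principal, overflow = number[:9], number[9:]
        body = principal + "<" + overflow + check_digit(principal + "<" + overflow)
    line = doc_type.ljust(2, "<") + country_code + body
    if len(line) > 30:
        raise ValueError("document number too long for a TD1 line")
    return line.ljust(30, "<")


print(td1_line1("ID", "BEL", "000610035 701"))
print(td1_line1("ID", "BEL", "B100326500"))
```

With the three document numbers used above, this reproduces the hashes written in the comments (6, 7 and 8).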
wooooowww thank u vvvveeeeery much, its cool i really appreciate it mannn |
@Arg0s1080 Is there any other country besides Belgium that does not follow the TD1 format? |
Hi there, @vamshi-7. The problem with the TD1 format is that countries use it for national ID cards, driver's licenses or other non-international documents, so it's very probable that many countries do not strictly comply with the ICAO specs. That's why there are usually fewer problems with passports and visas. Someone long ago reported a problem with German ID cards and a special case was created, but surely Belgium and Germany are not the only countries that "break or twist" the specs. Why do you ask? BR |
Hey @Arg0s1080, firstly, thank you for the reply. I am a student at uni-koblenz, currently working as an intern. My research is on extracting text from travel documents. So far, from my limited experience, all TD3 documents follow the specs properly except Germany's. I am confused about the TD1 type after seeing these Belgian cards. Do other countries only break or twist the specs for the first line of the MRZ? As far as I can see, apart from Germany, other countries are not twisting the specs for the second and third lines. Please correct me if I am wrong. Moreover, apart from Google, are there any other open sources for obtaining such an image dataset? BR |
Hi again, I'm glad and I hope everything goes well for you!! In reality problems should not exist. The specs are unobjectionable (strict enough and flexible enough). Problems usually appear when "national data" is moved to […]. I highly doubt that you will find a good dataset to train a neural network or massively test a project. Keep in mind that it is private and very sensitive data (that's why this project has been in beta for years). I know there have been students who have used mrz.generator to train a NN, so I guess they didn't find a better option. Why do you say that Germany does not maintain proper ICAO specs? Is it because of its country code ("D": only one letter) or for another reason? BR |
Hi, yes, It's difficult to find the data even to test the algorithm, especially for Belgium cards. Thanks and BR. |
Given these two TD1 MRZ values:
And then another one,
When scanned via OCR, it can read either |
According to the latest edition of ICAO 9303, Part 5, there is an explanation of how to compute the DV (check digit) when the document number exceeds the original field size: |
Hi there, thank you for making this library!
I have an issue with TD1, specifically scanning Belgian ID cards. If the document_number_hash digit is "<" the document will not verify.
I have checked this with 3 different Belgian ID cards and they all have "<" on index 14 of line 0.
After a ton of googling and reading specs I found an issue with the way you check document_number_hash...
Normally a document number starts at index 5 and ends at index 13 (0-based), but sometimes a document number exceeds the size of its slot and the optional field is used. Let's take a look at this example:
IDBEL123456789<1233<<<<<<<<<<<
In this case the document number check is "<". In a scenario like that we need to look at the optional data (1233). So when the document number check is "<", we need to look at the last non-empty value: 3. This is the actual hash number. After that we simply verify the hash of:
Document Number: 123456789<123
Hash: 3
And this should verify as True using your verify function.
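
The steps above can be sketched without the library. This is only an illustration under the assumptions described in this issue (a "<" at index 14 means the number continues in the optional data, whose last non-filler character is the hash); char_value, check_digit and document_number_and_check are hypothetical names, not part of mrz.

```python
from typing import Tuple


def char_value(ch: str) -> int:
    """Numeric value of an MRZ character: digits as-is, A=10 ... Z=35, '<'=0."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError("invalid MRZ character: %r" % ch)


def check_digit(field: str) -> str:
    """Sum of the values times the repeating weights 7, 3, 1, modulo 10."""
    weights = (7, 3, 1)
    return str(sum(char_value(c) * weights[i % 3] for i, c in enumerate(field)) % 10)


def document_number_and_check(line1: str) -> Tuple[str, str]:
    """Return (document number, hash) from TD1 line 1, handling long numbers."""
    number, check, optional = line1[5:14], line1[14], line1[15:30]
    if check == "<":
        extra = optional.rstrip("<")
        if not extra:
            raise ValueError("hash is '<' but optional data is empty")
        # Everything before the last non-filler character extends the number.
        number, check = number + "<" + extra[:-1], extra[-1]
    return number, check


number, check = document_number_and_check("IDBEL123456789<1233<<<<<<<<<<<")
print(number, check, check_digit(number) == check)
```

For the example line this yields the document number 123456789<123 with hash 3, and the computed check digit matches.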