Roadmap
https://github.com/orgs/cbdb-project/projects/3/views/2?pane=issue&itemId=72264379

LAST ISSUE: For biographical addresses, remove 府 州 縣 路 for len(str) > 2

1 Extract the Han persons
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/1%20Extract%20the%20Han%20persons/extract_the_han_persons.ipynb

2 Compare with the CBDB data
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/2%20Compare%20with%20the%20CBDB%20data/compare_with_cbdb_data.ipynb

3 Keep the essential columns for new persons
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/3%20Keep%20the%20essential%20columns%20for%20new%20persons/keep-the-essential-columns.ipynb

4 Clean the variances in person names
Finding the variances
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/4%20Clean%20the%20variences%20in%20person%20names/clean-the-variences-in-person-names.ipynb
Converting the variances
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/4%20Clean%20the%20variences%20in%20person%20names/step-two-apply-the-instruction-to_data.ipynb

5 Create the data for coding
create-data-for-addr-coding
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/5%20Create%20the%20data%20for%20coding/create-data-for-addr-coding.ipynb
create-data-for-office-coding
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/5%20Create%20the%20data%20for%20coding/create-data-for-office-coding-1-extract-info.ipynb
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/5%20Create%20the%20data%20for%20coding/create-data-for-office-coding-2-create-data.ipynb

6 Extract the cooked data
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/6%20Extract%20the%20cooked%20data/assign-labels-to-data.ipynb

7 Create CBDB data
https://github.com/cbdb-project/Jinshenlu-20240725-Han-persons-not-in-CBDB_Hongsu/blob/main/7%20Create%20CBDB%20data/create-cbdb-data.ipynb
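
Sketch for the open address issue

The open issue noted under Roadmap (remove 府 州 縣 路 from biographical addresses for len(str) > 2) might be handled roughly as below. This is only a minimal sketch and is not taken from the notebooks: the function name `strip_admin_suffix` and the sample addresses are hypothetical, and it assumes the character is removed only when it is the final character of the address, which the note itself does not specify.

```python
# Administrative-unit characters named in the roadmap note.
ADMIN_SUFFIXES = ("府", "州", "縣", "路")


def strip_admin_suffix(addr: str) -> str:
    """Drop one trailing 府/州/縣/路 when the address is longer than 2 characters.

    Assumption: the character is treated as a suffix (last character only).
    Addresses of length 2 or less are returned unchanged, so names such as
    徐州 or 鄞縣 keep their final character.
    """
    addr = addr.strip()
    if len(addr) > 2 and addr.endswith(ADMIN_SUFFIXES):
        return addr[:-1]
    return addr


# Hypothetical sample values, for illustration only.
examples = ["杭州府", "大名府", "徐州", "鄞縣"]
cleaned = [strip_admin_suffix(a) for a in examples]
```

The length check is what keeps two-character place names intact; without it, 徐州 would be reduced to a single character that no longer identifies the place.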