Significant Performance Improvement in Markdown Conversion Process #95

Open
lucas-almeida026 opened this issue Jul 16, 2023 · 4 comments

Comments

@lucas-almeida026

Hi Souvik Kar Mahapatra and contributors,

Firstly, I'd like to thank you for your work on the notion-to-md library. It's been incredibly useful for personal and professional projects.

While using your library, I noticed some opportunities for performance optimizations and implemented a different architecture that led to significant improvements in the markdown conversion process. However, this new architecture diverges substantially from the original and does not support all the features of the current version, such as custom transformers.

Despite these differences, I wanted to share the work with you because I believe some elements could potentially benefit the notion-to-md library and its users.

The new approach involves three main steps:

  1. We request all blocks from a page in a single operation, storing them in a 2D array (layers of blocks).
  2. Reconstruct the block tree using parent and blockId properties.
  3. Finally, traverse the tree recursively and convert all blocks into a valid markdown string (gist with the code here).
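
For readers following along, steps 2 and 3 could be sketched roughly as below. This is not the code from the gist: the `FlatBlock` shape, the `parentId` field name, and the `buildTree` and `toMarkdown` helpers are hypothetical simplifications, and only three block types are handled. It assumes the layers are ordered so that every parent appears in an earlier layer than its children.

```typescript
// Hypothetical, simplified block shape; real Notion blocks carry far more data.
interface FlatBlock {
  id: string;
  parentId: string; // the page id for top-level blocks (assumed field name)
  type: "paragraph" | "heading_1" | "bulleted_list_item";
  text: string;
}

interface TreeNode extends FlatBlock {
  children: TreeNode[];
}

// Step 2: rebuild the tree from the 2D array (layer 0 = top-level blocks).
function buildTree(layers: FlatBlock[][], pageId: string): TreeNode[] {
  const byId = new Map<string, TreeNode>();
  const roots: TreeNode[] = [];
  for (const layer of layers) {
    for (const block of layer) {
      const node: TreeNode = { ...block, children: [] };
      byId.set(node.id, node);
      if (block.parentId === pageId) {
        roots.push(node);
      } else {
        // Parents sit in an earlier layer, so they are already in the map.
        // Blocks whose parent is unknown are silently dropped in this sketch.
        byId.get(block.parentId)?.children.push(node);
      }
    }
  }
  return roots;
}

// Step 3: walk the tree recursively, indenting children by two spaces per level.
function toMarkdown(nodes: TreeNode[], depth = 0): string {
  const indent = "  ".repeat(depth);
  const lines: string[] = [];
  for (const node of nodes) {
    switch (node.type) {
      case "heading_1":
        lines.push(`${indent}# ${node.text}`);
        break;
      case "bulleted_list_item":
        lines.push(`${indent}- ${node.text}`);
        break;
      default:
        lines.push(`${indent}${node.text}`);
    }
    if (node.children.length > 0) {
      lines.push(toMarkdown(node.children, depth + 1));
    }
  }
  return lines.join("\n");
}
```

Because the tree is rebuilt in memory from the layers already fetched, the conversion step itself needs no further network calls.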

I ran some local tests to measure the improvement. I'm aware that this kind of testing may be imprecise and biased, but I believe it can serve as a starting point.

In this table you can see some key statistics from these tests. The overall result is approximately 40% faster execution for the entire process described in the steps above.

I understand that the approach taken is probably not suitable for direct integration into the library due to its architectural differences and reduced feature set. However, I thought that sharing it may result in more improvements for the future.

Please let me know if you're interested in exploring these improvements further. I would be happy to share more details or collaborate in integrating some of these optimizations into the library.

Best,
Lucas Almeida

@souvikinator
Owner

Hi Lucas,
Thanks for the effort and contribution, I really appreciate it. I'll go through the code, and then we can discuss what can be done and how we can make the features compatible with the new changes.

@souvikinator
Owner

@lucas-almeida026 apologies for the super late response. I'm working on version 4, which focuses on fixing this issue specifically, and I would like to have your feedback on it.
Here is the discussion link: #112

@lucas-almeida026
Author

> @lucas-almeida026 apologies for the super late response. I'm working on the version 4 and it focuses on fixing this issue specifically, would like to have your feedback on it. here is the discussion link: #112

@souvikinator no worries. Of course I'd like to contribute; I'll take a look as soon as I can.

@souvikinator
Owner

Thanks, appreciate it :)
Looking forward to working on v4.
