Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/url.py: Strengthen URL escaping. #85

Merged
merged 1 commit into from
Jul 15, 2022

Conversation

refeed
Copy link
Member

@refeed refeed commented Jul 6, 2022

Use the same implementation of how zulip sanitizes its stream name and
URL into a safe URL.

Closes #35

@refeed refeed force-pushed the strenghten_url_escaping branch from 1a39756 to 94ad1ce Compare July 6, 2022 07:38
@refeed
Copy link
Member Author

refeed commented Jul 6, 2022

If anyone asked, I chose the A- prefix because A for "archive" well it is arbitrary though can be anything but dot char

@refeed
Copy link
Member Author

refeed commented Jul 6, 2022

I think this may also closes #51

@rht
Copy link
Contributor

rht commented Jul 6, 2022

Having a different URL scheme from the way the webapp does it would complicate the mapping from the URLS from logged out view to the ones in this PR.

Since we are not using Jekyll anymore, I think we shouldn't add a case for it either.
The Lean community probably still uses an older version with Jekyll (see https://leanprover-community.github.io/archive/), but I think at this point, it was mainly for the nice UI theming. I'd assume they'd happily switch to the current zulip-archive if it has nice UI.

@refeed refeed force-pushed the strenghten_url_escaping branch 2 times, most recently from 8b57ace to 347a1fe Compare July 6, 2022 08:46
@refeed
Copy link
Member Author

refeed commented Jul 6, 2022

I see, just updated the PR again!

lib/url.py Show resolved Hide resolved
@refeed refeed force-pushed the strenghten_url_escaping branch 3 times, most recently from c9e9a34 to f72db9d Compare July 7, 2022 04:49
return (
urllib.parse.quote(topic_name, safe="~()*!.'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "~()*!.'" seems to be arbitrary. It's from this commit: f8bea8a

@rht
Copy link
Contributor

rht commented Jul 7, 2022

LGTM. @timabbott FYI

@refeed refeed force-pushed the strenghten_url_escaping branch from f72db9d to 1ca7b36 Compare July 14, 2022 07:46
Use the same implementation of how zulip sanitizes its stream name and
URL into a safe URL.

Closes zulip#35
@refeed refeed force-pushed the strenghten_url_escaping branch from 1ca7b36 to 0d03636 Compare July 14, 2022 07:51
@timabbott timabbott merged commit 78da21b into zulip:master Jul 15, 2022
@timabbott
Copy link
Member

Merged, thanks @refeed!

@eric-wieser
Copy link
Contributor

eric-wieser commented Jan 16, 2023

Isn't this going to break any links into existing archives that use the old encoding scheme? It certainly breaks python archive.py -i, as the old json cache uses the old style filename escaping.

I hope Zulip is ok with me re-downloading the entire history of https://leanprover.zulipchat.com/ with the new URL scheme!

@rht
Copy link
Contributor

rht commented Jan 16, 2023

That is indeed a problem. The Isabelle Zulip had reported this in the past: https://chat.zulip.org/#narrow/stream/127-integrations/topic/zulip-archive.3A.20FileNotFoundError.

@eric-wieser
Copy link
Contributor

eric-wieser commented Jan 16, 2023

Generating redirect HTML pages at all the old URLs would be a reasonable compromise; having to rebuild the json cache is only annoying once, but broken links to the archive are annoying indefinitely.

@rht
Copy link
Contributor

rht commented Jan 17, 2023

What if there is a new topic URL that happens to collide with an old URL belonging to a different topic? Or maybe there are not going to be any collisions at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Strengthen URL escaping
4 participants