Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid URL error when domain contains multiple slashes at the end #12722

Open
1 task
screenfluent opened this issue Dec 11, 2024 · 10 comments · May be fixed by #12733
Open
1 task

Invalid URL error when domain contains multiple slashes at the end #12722

screenfluent opened this issue Dec 11, 2024 · 10 comments · May be fixed by #12733
Labels
- P3: minor bug An edge case that only affects very specific usage (priority) feat: dev Related to `astro dev` CLI command (scope) feat: routing Related to Astro routing (scope) needs response Issue needs response from OP

Comments

@screenfluent
Copy link

screenfluent commented Dec 11, 2024

Astro Info

Astro                    v5.0.5
Node                     v18.20.3
System                   Linux (x64)
Package Manager          unknown
Output                   static
Adapter                  none
Integrations             none

If this issue only occurs in one browser, which browser is a problem?

No response

Describe the Bug

Astro throws 500 Internal Server Error with "Invalid URL" when main domain contains double or more slashes at the end. This happens in development server, production build, StackBlitz. This is not a browser issue. The problem occurs at server level and can be reproduced with curl.

What's the expected result?

  • Redirect to normalized version

Instead the server crashes with 500 error.

Link to Minimal Reproducible Example

https://stackblitz.com/edit/github-726xsb6l-xoqxvvkc

Participation

  • I am willing to submit a pull request for this issue.
@github-actions github-actions bot added the needs triage Issue needs to be triaged label Dec 11, 2024
@screenfluent screenfluent changed the title Invalid URL error when pathname contains multiple slashes // Invalid URL error when domain contains multiple slashes at the end Dec 11, 2024
@ascorbic
Copy link
Contributor

I can only reproduce this in one scenario: when running astro dev, and requesting http://localhost:1234//. I can see the place where this is failing. I can't reproduce it for other scenarios though: it works if there's anything else in the path (e.g. http://localhost:1234/something//) and I can't reproduce it in a production build. Do you have reproductions for the other cases?

@ascorbic ascorbic added feat: dev Related to `astro dev` CLI command (scope) feat: routing Related to Astro routing (scope) - P3: minor bug An edge case that only affects very specific usage (priority) needs response Issue needs response from OP and removed needs triage Issue needs to be triaged labels Dec 12, 2024
@screenfluent
Copy link
Author

I discovered it when deployed to Cloudflare Pages and accidentally put two // at the end and to my surprise, the page loaded keeping them. So I added more and they stayed too. I spent quite a good amount of time fighting with CF deployment and ultimately did localhost testing to see the invalid URL error.

I would love to see Astro paying attention to slashes, as someone who built their first website in 1998 and cut their teeth on SEO. I know that to this day (crazy!) Google can't resolve the issue with slashes at the end of URLs and they're treated as two different competing addresses that harm SEO. And users link in various surprising and often incorrect ways.

@ascorbic
Copy link
Contributor

So to be clear, there are two issues? Locally, there's the invalid URL problem, and on Cloudflare it's that multiple slashes are ignored and treated the same as a single slash?

@screenfluent
Copy link
Author

Hmmm I'm not sure if it's one issue that behaves differently depending on the environment or two separate ones. I'll deploy basic clean version to CF again to see if the behavior is the same.

@ascorbic
Copy link
Contributor

The code with the invalid URL is in the dev server code, so if it also happens elsewhere then it's two issues.

@ascorbic ascorbic linked a pull request Dec 13, 2024 that will close this issue
@screenfluent
Copy link
Author

screenfluent commented Dec 14, 2024

After extensive testing I can confirm this is a framework-level issue that needs attention. I've tested various scenarios with comparison to Next.js and here's what I found:

  1. Next.js (proper handling)
    $ curl -I -L https://nextjs.org////
    HTTP/2 308
    location: /
    ... redirects to clean URL

  2. Astro (problematic handling)
    $ curl -I -L https://astro.build////
    HTTP/2 200
    ... serves content directly without normalization

I've tried various approaches to fix this on Cloudflare Pages:

  • Using trailingSlash: 'never' in astro.config.mjs
  • Adding middleware to handle redirects
  • Using _redirects file with various rules
  • Testing both static and SSR modes with Cloudflare adapter

None of these approaches properly handled multiple slashes.

Test cases that should all redirect to their canonical form (currently Astro preserves all these variations):

domain.com//
domain.com///
domain.com/path//
domain.com///path
domain.com/path///path

After more testing I found that the issue occurs in two specific cases:

Case 1: Multiple slashes at root path
$ curl -I -L https://astro.build/////
HTTP/2 200
... serves content directly without normalization

Case 2: Multiple slashes before subpath
$ curl -I -L https://astro.build////blog/
HTTP/2 200
... also serves without normalization

Case 3: Multiple slashes after subpath (works correctly)
$ curl -I -L https://astro.build/blog///
HTTP/2 301
location: /blog/
... correctly normalizes path

Compare with Next.js (always normalizes regardless of pattern)
$ curl -I -L https://nextjs.org/////
$ curl -I -L https://nextjs.org////blog/
$ curl -I -L https://nextjs.org///blog///
$ curl -I -L https://nextjs.org/////blog/////
HTTP/2 308
location: /blog
... always normalizes to canonical form

For a framework that puts such emphasis on Core Web Vitals and SEO I believe this should be handled at the framework level similar to how Next.js does it. Let me know if you need any help with testing different approaches to fix this. Thanks!

@screenfluent
Copy link
Author

Is there anything else needed @ascorbic?

@ematipico
Copy link
Member

@screenfluent there's a PR open already, we're essentially looking for a way to fix the issue. Please bear with us

@ascorbic
Copy link
Contributor

This is something that's probably going to need to be implemented in an adapter-by-adapter way, because the built-in redirects don't support the syntax to implement this. I have a PR to handle it in dev, but we'll need to probably do separate fixes for Cloudflare etc.

@screenfluent
Copy link
Author

Many thanks guys! I've found so much joy since I started learning Astro recently. It literally transported me 20+ years back to when I was building HTML sites. Just when everything was looking great with 5.0, I discovered one of the worst technical SEO nightmare. And since some time past was worried that nothing will happen. Anyway, I hope you'll figure it out. For the dev server, it doesn't matter to me at all. I just learned the hard way that Google can be unforgiving when links I have no control over got some typos.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
- P3: minor bug An edge case that only affects very specific usage (priority) feat: dev Related to `astro dev` CLI command (scope) feat: routing Related to Astro routing (scope) needs response Issue needs response from OP
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants