Handle case mismatches when looking up env vars in the Config snapshot #11824
…s when var is not in self.env
…unicode uppercase instead of ascii
r? @weihanglo (rustbot has picked a reviewer for you, use r? to override)
r? @ehuss
Do `get_env_str()` and `env_has_key()` also need to be updated?
src/cargo/util/config/mod.rs
Outdated
// Only keep entries where both the key and value are valid UTF-8
.filter_map(|(k, v)| Some((k.to_str()?, v.to_str()?)))
.map(|(k, _)| (k.to_uppercase().replace("-", "_"), k.to_owned()))
Can you say why these lines changed? I would not expect any changes to `normalized_env`.
Though if you want to clean this up, it seems like it would be nicer to use `.keys()` instead of `.iter()`.
Sorry about that, I meant to add a comment about this but forgot. I realized when I was looking back at #11727 that we might have accidentally, and subtly, changed the behavior of `normalized_env` there.
Before #11727, the `normalized_env` (called `upper_case_env` there) was generated from `self.env`, which held pairs where both the key and value were required to be valid UTF-8. In #11727, `upper_case_env` started using all the keys that were valid UTF-8 (regardless of whether the value was also UTF-8), so in principle more keys could be included.
Since I noticed it here, I put back the check that both the env key and its value are valid UTF-8 before using the key in `normalized_env`. I don't really have a sense of whether this small change is important. If you think that it's probably not, then I'd be glad to use `.keys()` instead, which I'd agree is more readable.
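To make the difference between the two filters concrete, here is a minimal sketch, not cargo's actual code. The function names `normalized_strict`, `normalized_keys_only`, `normalize`, and `non_utf8` are hypothetical, and the environment is modeled as a plain `HashMap<OsString, OsString>`. It assumes a Unix or Windows target so that a non-UTF-8 value can be constructed.

```rust
use std::collections::HashMap;
use std::ffi::OsString;

/// Hypothetical helper: builds an `OsString` that is not valid UTF-8.
#[cfg(unix)]
fn non_utf8() -> OsString {
    use std::os::unix::ffi::OsStringExt;
    OsString::from_vec(vec![0xff])
}

/// Hypothetical helper: a lone surrogate is not valid Unicode on Windows.
#[cfg(windows)]
fn non_utf8() -> OsString {
    use std::os::windows::ffi::OsStringExt;
    OsString::from_wide(&[0xD800])
}

/// Normalizes a key as in the diff: uppercase, dashes become underscores.
fn normalize(key: &str) -> String {
    key.to_uppercase().replace('-', "_")
}

/// Keeps a key only if both the key and its value are valid UTF-8.
fn normalized_strict(env: &HashMap<OsString, OsString>) -> HashMap<String, String> {
    env.iter()
        .filter_map(|(k, v)| Some((k.to_str()?, v.to_str()?)))
        .map(|(k, _)| (normalize(k), k.to_owned()))
        .collect()
}

/// Keeps every key that is valid UTF-8, regardless of its value.
fn normalized_keys_only(env: &HashMap<OsString, OsString>) -> HashMap<String, String> {
    env.keys()
        .filter_map(|k| k.to_str())
        .map(|k| (normalize(k), k.to_owned()))
        .collect()
}

fn main() {
    let mut env = HashMap::new();
    env.insert(OsString::from("cargo-build-jobs"), OsString::from("4"));
    env.insert(OsString::from("CARGO_HOME"), non_utf8());

    let strict = normalized_strict(&env);
    let loose = normalized_keys_only(&env);

    // Both variants normalize the dashed, lower-case key the same way.
    assert_eq!(
        strict.get("CARGO_BUILD_JOBS").map(String::as_str),
        Some("cargo-build-jobs")
    );
    // Only the keys-only variant keeps the key whose value is not UTF-8.
    assert!(!strict.contains_key("CARGO_HOME"));
    assert!(loose.contains_key("CARGO_HOME"));
}
```

The two variants differ only for keys whose value is not valid UTF-8, which is the case discussed above.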
I think they don't need to be updated -- here's my reasoning: Those two methods were added for convenience when On the other hand, adding Hopefully this makes sense, please let me know if anything is unclear or if I'm thinking about this wrong.
Looks reasonable to me!
Would it make sense to combine the three env hashmaps (`env`, `case_insensitive_env`, and `normalized_env`) into a single struct, say `Envs`? The struct would expose only a limited API to `Config`, and all access to env would have to go through it.
That would make it clearer what is accessible and what is not. We could also put that struct in its own module (`config/mod.rs` has grown too big 😆).
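A rough sketch of what such a wrapper might look like follows. This is an illustration under assumptions, not the struct that was merged: the constructor `from_map`, the `case_insensitive` flag (standing in for "on Windows only"), and the accessor `normalized_key` are hypothetical names chosen so the example runs on any platform.

```rust
use std::collections::HashMap;
use std::ffi::{OsStr, OsString};

/// Hypothetical wrapper around an environment snapshot.
pub struct Env {
    /// The snapshot itself, with keys exactly as the OS reported them.
    env: HashMap<OsString, OsString>,
    /// Normalized key (uppercase, `-` to `_`) mapped to the original key.
    normalized_env: HashMap<String, String>,
    /// Uppercased key mapped to the original key, e.g. "PATH" => "Path".
    case_insensitive_env: HashMap<String, String>,
    /// Assumption: stands in for the Windows-only behavior in the PR.
    case_insensitive: bool,
}

impl Env {
    pub fn from_map(env: HashMap<OsString, OsString>, case_insensitive: bool) -> Self {
        let case_insensitive_env: HashMap<String, String> = env
            .keys()
            .filter_map(|k| k.to_str())
            .map(|k| (k.to_uppercase(), k.to_owned()))
            .collect();
        let normalized_env: HashMap<String, String> = env
            .iter()
            // Only keep entries where both the key and value are valid UTF-8.
            .filter_map(|(k, v)| Some((k.to_str()?, v.to_str()?)))
            .map(|(k, _)| (k.to_uppercase().replace('-', "_"), k.to_owned()))
            .collect();
        Env { env, normalized_env, case_insensitive_env, case_insensitive }
    }

    /// Exact lookup first; on a miss, retry via the uppercased key.
    pub fn get_env_os(&self, key: &str) -> Option<&OsStr> {
        if let Some(v) = self.env.get(OsStr::new(key)) {
            return Some(v.as_os_str());
        }
        if self.case_insensitive {
            let original = self.case_insensitive_env.get(&key.to_uppercase())?;
            return self.env.get(OsStr::new(original)).map(|v| v.as_os_str());
        }
        None
    }

    /// Returns the original spelling of a normalized key, if any.
    pub fn normalized_key(&self, normalized: &str) -> Option<&str> {
        self.normalized_env.get(normalized).map(String::as_str)
    }
}

fn main() {
    let mut raw = HashMap::new();
    raw.insert(OsString::from("Path"), OsString::from("C:\\bin"));

    let env = Env::from_map(raw.clone(), true);
    assert_eq!(env.get_env_os("Path"), Some(OsStr::new("C:\\bin")));
    assert_eq!(env.get_env_os("PATH"), Some(OsStr::new("C:\\bin")));
    assert_eq!(env.normalized_key("PATH"), Some("Path"));

    // Without the fallback, only the exact spelling matches.
    let strict = Env::from_map(raw, false);
    assert_eq!(strict.get_env_os("PATH"), None);
}
```

Keeping the three maps private means callers cannot bypass the fallback logic by reading a map directly.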
Sure! I agree that would be clearer, I'll give that a shot now.
Hi @weihanglo, I've implemented your suggestion in the most recent commit. Please take a look when you have a chance and let me know if you have any suggestions 😃
Looks great. Thank you for extracting them as a struct! For other issues let's wait for Eric's feedback :)
Sounds great, thanks for taking a look!
I read through this again. Seems good to merge.
Regarding the UTF-8 validity of env values (#11824 (comment)), this just goes back to the behavior from before #11727, so it should be no problem at all, I guess. We can lift the restriction later if needed.
@ehuss, do you have any other thought on this?
Thanks! I like wrapping things in the `Env` struct.
src/cargo/util/config/environment.rs
Outdated
.collect();
let normalized_env = env
.iter()
// Only keep entries where both the key and value are valid UTF-8
Can this comment also mention why it is skipping over non-utf-8 values? Perhaps something like "Because the config env vars only support utf-8, this needs to be kept in sync with that, otherwise the normalized map warning could incorrectly warn about entries that can't be read by the config system." Or something shorter if you can word it better.
/// This is intended for use in private methods of `Config`,
/// and does not check for env key case mismatch.
Could this include a comment explaining why it is case-sensitive?
I was looking over the history, and it looks like this was an unintended regression in 1.28 via #5552. I think I somehow forgot about Windows case sensitivity at that time, and have been confused on a few occasions (like #9169) due to the fact that `Command` used to force keys to be uppercase on Windows (fixed in 1.55 via rust-lang/rust#85270). When launching `cargo` from within cargo's testsuite, it used `Command`, which ended up mucking with the environment keys.
I don't think this needs to be fixed here in this PR. I don't think anyone has complained about this since then, so maybe it's best just to leave it.
So, a comment could explain something like:
This is case-sensitive on Windows (even though Windows is usually case-insensitive) due to an unintended regression in 1.28 (via #5552). This should only affect keys used for cargo's config-system env variables (`CARGO_` prefixed ones) which are currently all uppercase. We may want to consider rectifying it if users report issues. One thing that adds a wrinkle here is the unstable advanced-env option which requires case-sensitive keys.

Do not use this for any other purposes. Use [`Env::get_env_os`] or [`Env::get_env`] instead, which properly handle case insensitivity on Windows.
ae52cb2 to 7af55d8
Thanks for the helpful feedback and context! I added those comments you suggested in the latest commit. I also added a bit more documentation on
Thank you for the write-up! We have really wanted to document every design rationale to help ourselves and contributors. BTW, CI is currently blocked. We are waiting for the next nightly release.
Talked to ehuss and we both agree it is pretty good to merge! @bors r+
Thanks! It is much appreciated. @bors r+
💡 This pull request was already approved, no need to approve it again.
Handle case mismatches when looking up env vars in the Config snapshot

### What does this PR try to resolve?

Fixes #11814.

Windows environment variables are case-insensitive, which causes problems when looking them up in the `Config` env snapshot.

This PR adds another member (`case_insensitive_env`) in `Config` that maps upper-cased keys to their original values in the env (for example, `"PATH" => "Path"`). If lookup in `self.env` fails, this PR converts the key to upper case and looks it up in `self.case_insensitive_env` to obtain the correct key name if it exists (on Windows only).

### How should we test and review this PR?

Please see the new tests in `testsuite/config.rs` and `testsuite/cargo_command.rs`.

### Additional information

Currently, this uses `str::to_uppercase` to uppercase the keys. This requires key to be valid UTF-8, and may disagree with how the OS uppercases things (see the link in [this comment](#11814 (comment)) for details).
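The caveat about `str::to_uppercase` can be illustrated with a small sketch. The helper `upper_key` is a hypothetical name, not part of cargo. It shows the two properties mentioned above: the key must be valid UTF-8 before it can be uppercased at all, and Unicode-aware uppercasing can produce a different string than a simpler, ASCII-only mapping would.

```rust
use std::ffi::OsStr;

/// Hypothetical helper: uppercases an env key with `str::to_uppercase`.
/// Returns `None` when the key is not valid UTF-8, since such a key
/// cannot be converted to `&str` in the first place.
fn upper_key(key: &OsStr) -> Option<String> {
    key.to_str().map(str::to_uppercase)
}

fn main() {
    // For plain ASCII keys the two uppercasing methods agree.
    assert_eq!(upper_key(OsStr::new("Path")).as_deref(), Some("PATH"));
    assert_eq!("Path".to_ascii_uppercase(), "PATH");

    // For non-ASCII keys they can differ: Unicode uppercasing expands
    // the German sharp s to "SS", while ASCII uppercasing leaves it alone.
    let key = "straße";
    assert_eq!(key.to_uppercase(), "STRASSE");
    assert_eq!(key.to_ascii_uppercase(), "STRAßE");
}
```

Any uppercasing rule that maps characters one-to-one cannot reproduce the length-changing result above, which is one way two case-folding schemes can disagree about whether two keys are the same.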
☀️ Test successful - checks-actions
…nglo [beta-1.69] cargo beta backports

3 commits in 9880b408a3af50c08fab3dbf4aa2a972df71e951..7b18c85808a6b45ec8364bf730617b6f13e0f9f8
2023-02-28 19:39:39 +0000 to 2023-03-17 12:29:33 +0000
- [beta-1.69] backport rust-lang/cargo#11824 (rust-lang/cargo#11863)
- [beta-1.69] backport rust-lang/cargo#11820 (rust-lang/cargo#11823)
- chore: Backport rust-lang/cargo#11630 to `1.69.0` (rust-lang/cargo#11806)

r? `@ghost`
Update cargo

11 commits in 4a3c588b1f0a8e2dc8dd8789dbf3b6a71b02ed49..15d090969743630bff549a1b068bcaa8174e5ee3
2023-03-14 14:05:36 +0000 to 2023-03-21 17:54:28 +0000
- docs(contrib): Move higher level resolver docs into doc comments (rust-lang/cargo#11870)
- docs(contrib): Pull impl info out of architecture (rust-lang/cargo#11869)
- Update curl-sys (rust-lang/cargo#11871)
- Poll loop fixes (rust-lang/cargo#11624)
- clippy: warn `disallowed_methods` for `std::env::var` and friends (rust-lang/cargo#11828)
- Add --ignore-rust-version flag to cargo install (rust-lang/cargo#11859)
- Handle case mismatches when looking up env vars in the Config snapshot (rust-lang/cargo#11824)
- align semantics of generated vcs ignore files (rust-lang/cargo#11855)
- Add more information to wait-for-publish (rust-lang/cargo#11713)
- docs: Address warnings (rust-lang/cargo#11856)
- docs(contrib): Create a file overview in the nightly docs (rust-lang/cargo#11850)