Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use tzinfo instead of str to identify the TimeZone in ZonedDateTime #65

Open
bxparks opened this issue Feb 22, 2024 · 3 comments
Open
Labels
discussion Discussion is needed before proceeding

Comments

@bxparks
Copy link

bxparks commented Feb 22, 2024

The ZonedDateTime class uses a str to identify the timezone. Please consider using the standard tzinfo class instead. Using tzinfo allows alternative timezone libraries to be used instead of the one baked into whatever (presumably zoneinfo).

I have my own timezone Python library. It provides 2 different implementations of tzinfo. Both are drop-in replacements for zoneinfo, pytz, or any other Python timezone libraries. Given the current implementation of ZonedDateTime, I cannot use my own timezone library with whatever.

Why do I want to use my own? Because zoneinfo has bugs in obscure edge cases. (pytz has even more bugs, as well as the year 2038 problem). My libraries don't.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Feb 23, 2024

Interesting! You'll agree though that the 'customized tzinfo class' use case is niche. My general approach to these kind of issues are to allow them if the API doesn't become too complex for the 'average' user.

What immediately comes to mind is to allow tz= argument of ZonedDateTime to be str | tzinfo. However, this will affect the API:

  • the .tz attribute is no longer guaranteed to output a IANA timezone name.
  • serializing to/from a canonical string format is no longer possible.

Thes are pretty big downsides IMHO, as established identifiers are key for interoperability.

Another potential solution is to provide a CustomDateTime class (perhaps in a separate whenever.advanced package), which has a tzinfo= argument. This class is an AwareDateTime subclass but then doesn't have a canonical string representation.

Because zoneinfo has bugs in obscure edge cases.

Pardon my ignorance, but: shouldn't these be fixed there then? Are these bugs with the Python library, or the actual IANA tz db?

Alternatively, isn't it possible to modify the zoneinfo files on your system yourself if you know better? Then the ZoneInfo package would automatically adjust to your needs.

@ariebovenberg ariebovenberg added the discussion Discussion is needed before proceeding label Feb 23, 2024
@bxparks
Copy link
Author

bxparks commented Feb 24, 2024

I'll address your questions about the zoneinfo library in a separate post. It's complicated.

I think the least desirable solution is a CustomDateTime. Your library already has too many subclasses of AwareDateTime. :-)

I agree that using a non-IANA timezone identifier may be niche, but I think the need or desire to use a different timezone library may be more common than you might think:

  1. Backwards compatibility An app may want to upgrade from datetime to whenever, but keep using pytz, for backwards compatibility with previous versions of the app. It may want to migrate to zoneinfo eventually, but do the migration in phases, first datetime->whenever then to pytz->zoneinfo.
  2. Cross compatibility A system may be composed of multiple Python apps, servlets, modules, etc. One part of the system may be on pytz and need to remain there for some time. A different part of the system may need to continue to use pytz for cross compatibility.
  3. Deterministic Reproducibility An app may need to pin a specific version of zoneinfo or pytz for integration testing and validation.
  4. Better timezone library A new timezone library that is faster, smaller, and more accurate than zoneinfo may be created in the future.
  5. zoneinfo is not available In resource-constrained, bare-metal (no OS) embedded environments, the zoneinfo library may be too large and not available. Smaller timezone libraries may need to be used. (In theory, my timezone libraries should fit into something on the order of 32-96 kB, instead of the multi-megabytes required by zoneinfo).

Proposed Solution

Here is my solution that I think is compatible with your current constraints on ZonedDateTime. My solution would continue use tz: str parameter in ZonedDateTime. But instead of hardcoding the dependency to zoneinfo, we add a layer of indirection (... "all CS problems can be solved with a layer of indirection"...), whose purpose is to provide an instance of tzinfo from its str:

(Sorry for any dumb syntax errors below, I haven't done much Python coding in several years...)

import zoneinfo
from typing_extensions import Protocol
[...]

class TzProvider(Protocol):
    def gettz(name: str) -> tzinfo:
         ...

class ZoneInfoProvider(TzProvider):
    def gettz(name: str) -> tzinfo:
        return zoneinfo.ZoneInfo(name)

Then at the whenever module level, we can provide hooks to replace the default TzProvider:

_tz_provider = ZoneInfoProvider()

def set_tz_provider(provider: TzProvider) -> None:
    _tz_provider = provider

def get_tz_provider() -> TzProvider:
    """Probably useful only for testing purposes"""
    return _tz_provider

Interesting Consequences

I realize that a pluggable TzProvider infrastructure is getting into Enterprise Java territory, but sometimes, Enterprise Java uses certain techniques because it actually solves real problems.

A pluggable TzProvider achieves the decoupling of the DateTime related classes and the TimeZone related classes, as I mentioned in #30, when I was speculating that the reason that LocalDateTime/SystemDateTime was needed was because there is too much coupling between "date/time" concepts and "timezone" concepts.

One interesting consequence is that this solution is that it allows whenever to support POSIX-style timezone rules naturally:

from whenever import ZonedDateTime
from whenever import TzProvider
from whenever import set_tz_provider

class PosixProvider(TzProvider):
    def gettz(name: str):
        # ...

posix_provider = PosixProvider()
set_tz_provider(posix_provider)

tz = ZonedDateTime(tz="NZST-12NZDT,M9.5.0,M4.1.0/3")

Implications for Pickling

I don't actually understand the requirements of Python pickling, so I don't know what is required of the ZonedDateTime class to support this pickling thing. Seems to me that it would be up to the specific TzProvider instance to decide whether it can handle a special, non-IANA, name identifier.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Feb 25, 2024

Lots to think about.

My impression is that most issues are solved by:

  • referring users to the inherent flexibility of zoneinfo itself as a tz provider
  • allowing ZoneInfo instances to be passed directly to ZonedDateTime

zoneinfo as a tz provider

As far as I can see, zoneinfo already supports almost everything you'd need from a tz provider:

  • Want to save space? Ship your system with fewer or smaller zoneinfo files
  • Want to pin a specific tz db version? Set TZPATH or using the tzdata PyPI package
  • Want to add your own timezone? Add it to the zoneinfo files, or use ZoneInfo.from_file()
  • zoneinfo not in your slimmed-down Python version? Use backports.zoneinfo.

What is zoneinfo itself it too large for your system?

I'm not familiar with using Python in super-constrained environments, but the zoneinfo library itself is only about 1k lines. And there's also a compiled C version that is probably smaller.

I'd feel safe saying that if 1k lines of code make a difference, then users are probably better off sticking to the standard library.

What about other timezone libraries?

I'm not yet convinced of the need to support alternate libraries. The IANA tz db (accessible through zoneinfo) is the overwhelming worldwide standard. AFAIK there is currently no reason to choose an alternative:

  • In case you're on dateutil.tz or pytz, you can/should probably migrate to zoneinfo anyway.
  • A hypothetical better future standard/library shouldn't clutter the API. Let's cross the bridge when we come to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Discussion is needed before proceeding
Projects
None yet
Development

No branches or pull requests

2 participants