Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(multipart): add max_length to secure_filename() #2421

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions falcon/media/multipart.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add a newsfragment for this improvement.

Original file line number Diff line number Diff line change
Expand Up @@ -592,6 +592,15 @@ class MultipartParseOptions:
If the body part headers size exceeds this value, an instance of
:class:`.MultipartParseError` will be raised.
"""
max_filename_length: Optional[int]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new length is not passed to the actual invocation of secure_filename() when referencing the corresponding part property.

Also add tests (make sure to parametrize on both WSGI/ASGI) for actually modifying this parsing option, and verifying that it is respected when referencing the said property.

"""The maximum length of filenames in multipart forms (default ``None``).

If the filename exceeds this value, it will be truncated while preserving
the file extension. Defaults to ``None`` for backward compatibility.

.. note::
Starting from Falcon 5.0, this will default to 255 characters.
Copy link
Member

@vytas7 vytas7 Dec 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest a smaller default value. On some systems, e.g., Windows, the max possible path length is around 255-260, so likely we will want to make the file name part of the path shorter than that.

Maybe 64 could be a sane default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

64 may be a bit small as default, but 96/128 is likely ok.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think 64 ought to be enough for the basename, since secure_filename() is mangling the name in any case, so there is no guarantee to get exactly the same string as submitted in the form anyway. But 96 works for me as well.

"""
media_handlers: Handlers
"""A dict-like object for configuring the media-types to handle.

Expand Down
13 changes: 11 additions & 2 deletions falcon/util/misc.py
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@
import re
from typing import Any, Callable, Dict, List, Mapping, Optional, Tuple, Union
import unicodedata
import os

from falcon import status_codes
from falcon.constants import PYPY
Expand Down Expand Up @@ -336,7 +337,7 @@
return args


def secure_filename(filename: str) -> str:
def secure_filename(filename: str, max_length: int = None) -> str:
"""Sanitize the provided `filename` to contain only ASCII characters.

Only ASCII alphanumerals, ``'.'``, ``'-'`` and ``'_'`` are allowed for
Expand All @@ -357,6 +358,8 @@
Args:
filename (str): Arbitrary filename input from the request, such as a
multipart form filename field.
max_length (int, optional): Maximum length of the sanitized filename,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (int, optional) looks a bit unconventional. I think you can simply remove this type from here altogether, and let Sphinx pick it up from the type annotation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to pass the max_length as a parameter in the BodyPart.secure_filename function or set it as an instance or class variable and reference it from there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want us to actually use the new option in BodyPart.secure_filename(), i.e.:

    @property
    def secure_filename(self) -> str:
        """The sanitized version of `filename` using only the most common ASCII
        <...>
        """  # noqa: D205
        try:
            return misc.secure_filename(self.filename or '')
        except ValueError as ex:
            raise MultipartParseError(description=str(ex)) from ex

⬇️

    @property
    def secure_filename(self) -> str:
        """The sanitized version of `filename` using only the most common ASCII
        <...>
        """  # noqa: D205
        try:
            return misc.secure_filename(self.filename or '', max_length=self._parse_options.max_secure_filename_length)
        except ValueError as ex:
            raise MultipartParseError(description=str(ex)) from ex

(untested)

including the file extension. If None, no truncation is applied.

Returns:
str: The sanitized filename.
Expand All @@ -373,7 +376,13 @@
filename = unicodedata.normalize('NFKD', filename)
if filename.startswith('.'):
filename = filename.replace('.', '_', 1)
return _UNSAFE_CHARS.sub('_', filename)
filename = _UNSAFE_CHARS.sub('_', filename)

if max_length is not None and len(filename) > max_length:
name, ext = os.path.splitext(filename)
filename = name[:max_length - len(ext)] + ext

Check warning on line 383 in falcon/util/misc.py

View check run for this annotation

Codecov / codecov/patch

falcon/util/misc.py#L382-L383

Added lines #L382 - L383 were not covered by tests
CaselIT marked this conversation as resolved.
Show resolved Hide resolved

return filename
vytas7 marked this conversation as resolved.
Show resolved Hide resolved


@_lru_cache_for_simple_logic(maxsize=64)
Expand Down
Loading