Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(multipart): add max_length to secure_filename() #2421

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions falcon/media/multipart.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add a newsfragment for this improvement.

Original file line number Diff line number Diff line change
Expand Up @@ -592,6 +592,15 @@ class MultipartParseOptions:
If the body part headers size exceeds this value, an instance of
:class:`.MultipartParseError` will be raised.
"""
max_filename_length: Optional[int]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new length is not passed to the actual invocation of secure_filename() when referencing the corresponding part property.

Also add tests (make sure to parametrize on both WSGI/ASGI) for actually modifying this parsing option, and verifying that it is respected when referencing the said property.

"""The maximum length of filenames in multipart forms (default ``None``).

If the filename exceeds this value, it will be truncated while preserving
the file extension. Defaults to ``None`` for backward compatibility.

.. note::
Starting from Falcon 5.0, this will default to 255 characters.
Copy link
Member

@vytas7 vytas7 Dec 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest a smaller default value. On some systems, e.g., Windows, the max possible path length is around 255-260, so likely we will want to make the file name part of the path shorter than that.

Maybe 64 could be a sane default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

64 may be a bit small as default, but 96/128 is likely ok.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think 64 ought to be enough for the basename, since secure_filename() is mangling the name in any case, so there is no guarantee to get exactly the same string as submitted in the form anyway. But 96 works for me as well.

"""
media_handlers: Handlers
"""A dict-like object for configuring the media-types to handle.

Expand Down
16 changes: 14 additions & 2 deletions falcon/util/misc.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@
import functools
import http
import inspect
import os
import re
from typing import Any, Callable, Dict, List, Mapping, Optional, Tuple, Union
import unicodedata
Expand Down Expand Up @@ -336,7 +337,7 @@ def get_argnames(func: Callable[..., Any]) -> List[str]:
return args


def secure_filename(filename: str) -> str:
def secure_filename(filename: str, max_length: Optional[int] = None) -> str:
"""Sanitize the provided `filename` to contain only ASCII characters.

Only ASCII alphanumerals, ``'.'``, ``'-'`` and ``'_'`` are allowed for
Expand All @@ -357,6 +358,8 @@ def secure_filename(filename: str) -> str:
Args:
filename (str): Arbitrary filename input from the request, such as a
multipart form filename field.
max_length: Maximum length of the sanitized filename,
including the file extension. If None, no truncation is applied.

Returns:
str: The sanitized filename.
Expand All @@ -373,7 +376,16 @@ def secure_filename(filename: str) -> str:
filename = unicodedata.normalize('NFKD', filename)
if filename.startswith('.'):
filename = filename.replace('.', '_', 1)
return _UNSAFE_CHARS.sub('_', filename)
filename = _UNSAFE_CHARS.sub('_', filename)

if max_length is not None and len(filename) > max_length:
name, ext = os.path.splitext(filename)
if max_length - len(ext) >= 1:
filename = name[: max_length - len(ext)] + ext
else:
filename = filename[:max_length]

return filename
vytas7 marked this conversation as resolved.
Show resolved Hide resolved


@_lru_cache_for_simple_logic(maxsize=64)
Expand Down
11 changes: 11 additions & 0 deletions tests/test_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -628,6 +628,17 @@ def test_misc_isascii(self):
with pytest.warns(deprecation.DeprecatedWarning):
assert misc.isascii('foobar')

@pytest.mark.parametrize(
'filename,max_length,expected',
[
('averylongfilename.txt', 17, 'averylongfile.txt'),
('short.txt', 10, 'short.txt'),
('file.withlongextension', 9, 'file.with'),
],
)
def test_secure_filename_max_length(self, filename, max_length, expected):
assert misc.secure_filename(filename, max_length=max_length) == expected


@pytest.mark.parametrize(
'protocol,method',
Expand Down
Loading