Unicode IDNA

The ada-url/idna library is a C++ library implementing the to_ascii and to_unicode functions from Unicode Technical Standard #46 (Unicode IDNA Compatibility Processing). It supports a wide range of systems and is suitable for URL parsing.

Our IDNA library is used by the Node.js runtime and by ClickHouse.

According to our benchmarks, it can be faster than ICU.

Requirements

  • A recent C++ compiler supporting C++20. We test GCC 12 or better, LLVM 12 or better and Microsoft Visual Studio 2022.

Usage

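// Non-empty UTF-8 string; it must already be percent-decoded.
// A plain literal is used here because, in C++20, a u8"" literal has type
// const char8_t[] and does not convert to std::string_view.
// This assumes the source file is saved and compiled as UTF-8.
std::string_view input = "meßagefactory.ca";
std::string idna_ascii = ada::idna::to_ascii(input);
if (idna_ascii.empty()) {
    // There was an error.
}
std::cout << idna_ascii << std::endl;
// outputs 'xn--meagefactory-m9a.ca' if the input is "meßagefactory.ca"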
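
The 'xn--' label in the output above is a Punycode encoding (RFC 3492) of the non-ASCII label. To show where 'meagefactory-m9a' comes from, here is a minimal sketch of the RFC 3492 encoding step for a single label. It is not the library's implementation: the namespace sketch and the function punycode_encode are hypothetical names used only for illustration, the label is passed as code points rather than UTF-8, and the overflow checks that RFC 3492 requires are omitted. A full to_ascii also maps, normalizes and validates the input before this step.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <string_view>

// Illustrative sketch only: a plain RFC 3492 Punycode encoder for one label.
// These names are hypothetical and are not part of the ada::idna API.
namespace sketch {

// Parameter values fixed by RFC 3492, section 5.
constexpr std::uint32_t kBase = 36, kTMin = 1, kTMax = 26;
constexpr std::uint32_t kSkew = 38, kDamp = 700;
constexpr std::uint32_t kInitialBias = 72, kInitialN = 128;

// Maps a digit value in [0, 35] to 'a'..'z' (0..25) or '0'..'9' (26..35).
constexpr char encode_digit(std::uint32_t d) {
  return d < 26 ? static_cast<char>('a' + d)
                : static_cast<char>('0' + (d - 26));
}

// Bias adaptation function from RFC 3492, section 6.1.
constexpr std::uint32_t adapt(std::uint32_t delta, std::uint32_t num_points,
                              bool first_time) {
  delta = first_time ? delta / kDamp : delta / 2;
  delta += delta / num_points;
  std::uint32_t k = 0;
  while (delta > ((kBase - kTMin) * kTMax) / 2) {
    delta /= kBase - kTMin;
    k += kBase;
  }
  return k + ((kBase - kTMin + 1) * delta) / (delta + kSkew);
}

// Encodes one label, given as Unicode code points, without the "xn--" prefix.
// Overflow detection is omitted, so this is only meant for short labels.
// An all-ASCII label comes back with a trailing '-', as in RFC 3492.
inline std::string punycode_encode(std::u32string_view label) {
  std::string out;
  // Copy the basic (ASCII) code points first, in their original order.
  for (char32_t c : label) {
    if (c < 0x80) out.push_back(static_cast<char>(c));
  }
  const std::size_t basic_count = out.size();
  std::size_t handled = basic_count;
  if (basic_count > 0) out.push_back('-');

  std::uint32_t n = kInitialN, delta = 0, bias = kInitialBias;
  while (handled < label.size()) {
    // Find the smallest code point that has not been handled yet.
    std::uint32_t m = std::numeric_limits<std::uint32_t>::max();
    for (char32_t c : label) {
      if (c >= n && c < m) m = c;
    }
    delta += (m - n) * static_cast<std::uint32_t>(handled + 1);
    n = m;
    for (char32_t c : label) {
      if (c < n) ++delta;
      if (c == n) {
        // Emit delta as a variable-length integer in base 36.
        std::uint32_t q = delta;
        for (std::uint32_t k = kBase;; k += kBase) {
          const std::uint32_t t =
              k <= bias ? kTMin : (k >= bias + kTMax ? kTMax : k - bias);
          if (q < t) break;
          out.push_back(encode_digit(t + (q - t) % (kBase - t)));
          q = (q - t) / (kBase - t);
        }
        out.push_back(encode_digit(q));
        bias = adapt(delta, static_cast<std::uint32_t>(handled + 1),
                     handled == basic_count);
        delta = 0;
        ++handled;
      }
    }
    ++delta;
    ++n;
  }
  return out;
}

}  // namespace sketch
```

Calling sketch::punycode_encode(U"me\u00DFagefactory") yields "meagefactory-m9a"; prefixing "xn--" and appending the ASCII label ".ca" unchanged gives the domain shown in the usage example.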

Benchmarks

You may build a benchmarking tool with the library as follows under macOS and Linux:

cmake -D ADA_IDNA_BENCHMARKS=ON -B build
cmake --build build
./build/benchmarks/to_ascii

The commands for users of Visual Studio are slightly different.

Sample result (LLVM 14, Apple M1 Max processor):

---------------------------------------------------------------------
Benchmark           Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------
Ada              1504 ns         1504 ns       440984 speed=48.5371M/s time/byte=20.6028ns time/domain=250.667ns url/s=3.98935M/s
Icu              1898 ns         1897 ns       369967 speed=38.4721M/s time/byte=25.9928ns time/url=316.246ns url/s=3.16209M/s

In this run, Ada converts a domain in about 251 ns and ICU in about 316 ns, so Ada's throughput is roughly 26% higher.

License

This code is made available under the Apache License 2.0 as well as the MIT license.

Our tests include third-party code and data. The benchmarking code also includes third-party code; it is provided for research purposes only and is not part of the library.
