Text processing library

Curated hub for C++ text processing boundaries: text_encoding, Unicode/encoding identity, and bridges to strings, locale, regex, and formatting.

The C++ text area is intentionally narrow today. Its core standard-library surface is std::text_encoding (C++26), which gives programs a vocabulary for naming encodings and for querying the ordinary literal encoding and the environment encoding. This hub exists to route you quickly to that surface and to the adjacent libraries that actually own storage, parsing, matching, or locale behavior.

Use this page when the question is about encoding identity or about where text-related concerns belong. Leave owning and non-owning text containers to the strings library, localization and cultural formatting to locale, pattern matching to regex, and output formatting to format.

# Start Here

Identify an encoding at runtime or by environment

Start with std::text_encoding when you need to ask what encoding a literal, locale, or environment is using, or when you need a standard name or an IANA MIBenum-style identifier for that encoding.
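
As a minimal sketch of that starting point, assuming a C++26 standard library that ships `<text_encoding>` (detected here through the `__cpp_lib_text_encoding` feature-test macro), the following reads the ordinary literal encoding and the environment encoding. The `EncodingInfo` struct and both helper names are hypothetical and exist only for illustration; on older toolchains the helpers return an empty optional.

```cpp
#if __has_include(<version>)
#include <version>  // may define __cpp_lib_text_encoding
#endif
#include <optional>
#include <string>
#if defined(__cpp_lib_text_encoding)
#include <text_encoding>
#endif

// Hypothetical summary type used only in this example.
struct EncodingInfo {
    std::string name;  // primary name reported by the library, e.g. "UTF-8"
    long mib;          // numeric value of the std::text_encoding::id enumerator
};

// Ordinary literal encoding, or nullopt if std::text_encoding is unavailable.
std::optional<EncodingInfo> literal_encoding_info() {
#if defined(__cpp_lib_text_encoding)
    // literal() is consteval, so the result is known at compile time.
    constexpr std::text_encoding enc = std::text_encoding::literal();
    return EncodingInfo{enc.name(), static_cast<long>(enc.mib())};
#else
    return std::nullopt;
#endif
}

// Environment encoding, queried at run time; nullopt if unavailable.
std::optional<EncodingInfo> environment_encoding_info() {
#if defined(__cpp_lib_text_encoding)
    const std::text_encoding enc = std::text_encoding::environment();
    return EncodingInfo{enc.name(), static_cast<long>(enc.mib())};
#else
    return std::nullopt;
#endif
}
```

The two results can legitimately differ: the literal encoding is fixed when the program is compiled, while the environment encoding depends on where the program runs.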

Understand language-level encodings and character sets

Use the language route when the issue is source/execution character sets, literal prefixes, or core-language encoding rules rather than a library object.

Choose a text storage or view model

Go to the strings hub when the job is storing, slicing, owning, viewing, or converting text data rather than identifying encodings.

Handle locale-sensitive text behavior

Use locale facilities when formatting, collation, classification, or conversions depend on cultural rules instead of just encoding identity.
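
To illustrate the difference from encoding identity, here is a small sketch of a locale-dependent conversion. The helper name `to_upper_in` is hypothetical; it relies on the locale-aware `std::toupper` overload from `<locale>` and works one `char` at a time, so it is only meaningful for single-byte text.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: uppercase `text` using the ctype facet of `loc`.
// The outcome depends on the locale's rules, not on an encoding identity.
std::string to_upper_in(std::string text, const std::locale& loc) {
    for (char& c : text) {
        c = std::toupper(c, loc);  // locale-aware overload from <locale>
    }
    return text;
}
```

Called with `std::locale::classic()`, this maps only the ASCII letters; other locales may apply different rules to the same bytes.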

# Quick Map

| If you need to... | Start with | Why |
| --- | --- | --- |
| Name or compare a text encoding | text_encoding | It is the standard-library type dedicated to encoding identity and comparison. |
| Ask about the ordinary literal encoding or the environment encoding | literal and environment | These entry points expose the predefined encoding categories that matter most when crossing between source text and runtime behavior. |
| Get stable names, ids, or aliases for an encoding | name, id, mib, aliases | These pages cover the metadata surfaces used to describe or compare encodings programmatically. |
| Work with actual text storage, lifetime, or slicing | Strings library | The text hub does not own containers or views; that belongs to the strings library. |
| Match, search, or replace text patterns | Regular expressions library | Regex owns pattern matching; text encoding identity is only adjacent context there. |
| Format text for output | format | Formatting chooses presentation; the text hub mainly clarifies encoding boundaries and naming. |

# Text Encoding Surfaces

| Surface | Primary destinations | Use it for |
| --- | --- | --- |
| Core encoding object | text_encoding, hash, comparison | Representing and comparing encoding identities in code. |
| Environment and literal categories | environment, literal, environment_is | Checking which encodings are associated with runtime or literal contexts. |
| Metadata and naming | name, comp-name, id, mib | Obtaining standardized names and identifier forms for an encoding, and (through the exposition-only comp-name) seeing how encoding names are compared. |
| Alias views | aliases, aliases_view | Working with alternative naming forms associated with one encoding. |
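
The comparison and alias surfaces listed above can be sketched as follows, again assuming a C++26 library with `<text_encoding>`. The helper names `same_encoding` and `literal_alias_count` are hypothetical; both return an empty optional when the feature-test macro is not defined.

```cpp
#if __has_include(<version>)
#include <version>  // may define __cpp_lib_text_encoding
#endif
#include <cstddef>
#include <optional>
#if defined(__cpp_lib_text_encoding)
#include <text_encoding>
#endif

// True if two spellings name the same encoding; nullopt without std::text_encoding.
// Assumption: each name is short (at most text_encoding::max_name_length characters)
// and uses only basic characters, as the constructor requires.
std::optional<bool> same_encoding(const char* a, const char* b) {
#if defined(__cpp_lib_text_encoding)
    return std::text_encoding(a) == std::text_encoding(b);
#else
    (void)a;
    (void)b;
    return std::nullopt;
#endif
}

// Number of aliases the library reports for the ordinary literal encoding.
std::optional<std::size_t> literal_alias_count() {
#if defined(__cpp_lib_text_encoding)
    constexpr std::text_encoding enc = std::text_encoding::literal();
    std::size_t count = 0;
    for (const char* alias : enc.aliases()) {  // aliases() returns an aliases_view
        (void)alias;
        ++count;
    }
    return count;
#else
    return std::nullopt;
#endif
}
```

Name matching is deliberately loose about case and punctuation, so spellings such as "utf8" and "UTF-8" are expected to identify the same encoding.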

# Text Vs. String Vs. Locale Vs. Regex

| If your real question is about... | Go here | Why |
| --- | --- | --- |
| Encoding identity or environment/literal encoding categories | text_encoding | This is the one place in the standard library text area dedicated to encoding identity itself. |
| Owning text, string views, conversions, or character traits | Strings library | String storage and manipulation live in the strings library, not in `/cpp/text/`. |
| Localization, collation, facets, and cultural formatting behavior | locale | Locale owns culture-sensitive text behavior beyond raw encoding identity. |
| Pattern matching and replacement | Regular expressions library | Regex owns searching, matching, captures, and tokenization. |
| Literal prefixes, source/execution encodings, or character-set rules in the language | Character sets and string literals | Those are core-language topics rather than library navigation topics. |

# Standard Evolution

| Standard | Navigation note |
| --- | --- |
| Before C++26 | Most text-navigation questions were really redirected into string, locale, or language-level encoding pages because there was no dedicated standard-library text hub surface. |
| C++26 | text_encoding turns encoding identity into an explicit library destination, making `/cpp/text/` a real route instead of just an adjacency label. |

# Practical Routes

I need the encoding object itself

Start here for the core API, comparisons, and metadata attached to a concrete encoding identity.

I am really asking about character sets

Go here when the problem is source code, execution encodings, or literal interpretation rules.
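
A short sketch of why literal prefixes are a language-level concern: the same character occupies a different number of code units depending on the prefix. The helper `code_units` is hypothetical. The `u8`, `u`, and `U` prefixes are defined to produce UTF-8, UTF-16, and UTF-32 respectively, whereas the size of an unprefixed literal depends on the implementation's ordinary literal encoding and is therefore not asserted here.

```cpp
#include <cstddef>

// Number of code units in a string literal, excluding the terminating null.
template <class CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}

// U+00E9 (e with acute accent) lies inside the Basic Multilingual Plane.
static_assert(code_units(u8"\u00E9") == 2);  // UTF-8: two code units
static_assert(code_units(u"\u00E9") == 1);   // UTF-16: one code unit
static_assert(code_units(U"\u00E9") == 1);   // UTF-32: one code unit

// U+1F600 lies outside the Basic Multilingual Plane.
static_assert(code_units(u8"\U0001F600") == 4);  // UTF-8: four code units
static_assert(code_units(u"\U0001F600") == 2);   // UTF-16: surrogate pair
static_assert(code_units(U"\U0001F600") == 1);   // UTF-32: one code unit
```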

I need text containers or views

Use the strings hub for text ownership, slicing, conversions, and character-traits-oriented APIs.
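
For contrast with encoding identity, here is a minimal sketch of the ownership and slicing questions that belong to the strings library. The helper `key_of` and the "key:value" input shape are hypothetical, chosen only to show a non-owning slice.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: the part of `text` before the first ':' as a
// non-owning view. If there is no ':', the whole input is returned.
// The view is only valid while the underlying characters are alive.
std::string_view key_of(std::string_view text) {
    return text.substr(0, text.find(':'));
}

// Hypothetical helper: the same slice, copied into an owning std::string
// so that it can outlive the original buffer.
std::string owned_key_of(std::string_view text) {
    return std::string(key_of(text));
}
```

Neither function knows or cares which encoding the bytes are in; they only manage storage and lifetime.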

I need search or matching

If the task is pattern matching instead of encoding identity, regex is the correct home.
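
As a sketch of the kind of task that belongs to regex, the following extracts an encoding label from a line of text. The helper `charset_param` and the `charset=` input format are hypothetical, and the pattern only recognizes a simple, illustrative set of characters in the label.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: return the value following "charset=" in `line`,
// or an empty string if the line contains no such parameter.
std::string charset_param(const std::string& line) {
    static const std::regex pattern(R"(charset=([A-Za-z0-9_.:-]+))");
    std::smatch match;
    if (std::regex_search(line, match, pattern)) {
        return match[1].str();  // first capture group: the label itself
    }
    return std::string{};
}
```

The extracted label is just text; turning it into an encoding identity is a separate step that belongs to std::text_encoding.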

# Boundary Lines

| This hub covers | This hub does not try to cover |
| --- | --- |
| Encoding identity, encoding metadata, and where text-related navigation should branch next. | General string manipulation, locale facet design, I/O formatting APIs, or regex algorithm details. |