Text processing library
Curated hub for C++ text processing boundaries: text_encoding, Unicode/encoding identity, and bridges to strings, locale, regex, and formatting.
The C++ text area is intentionally narrow today. Its core standard-library surface is std::text_encoding (C++26, header `<text_encoding>`), which gives programs a vocabulary for identifying character encodings, including the ordinary literal encoding and the encoding of the execution environment. This hub exists to route you quickly to that surface and to the adjacent libraries that actually own storage, parsing, matching, or locale behavior.
# Start Here
Identify an encoding at runtime or by environment
Start with std::text_encoding when you need to ask what encoding a literal, locale, or environment is using, or when you need a registered name or MIB-style numeric identifier for that encoding.
Understand language-level encodings and character sets
Use the language route when the issue is source/execution character sets, literal prefixes, or core-language encoding rules rather than a library object.
Choose a text storage or view model
Go to the strings hub when the job is storing, slicing, owning, viewing, or converting text data rather than identifying encodings.
Handle locale-sensitive text behavior
Use locale facilities when formatting, collation, classification, or conversions depend on cultural rules instead of just encoding identity.
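
For the first route above, identifying an encoding, a minimal sketch might look like the following. It assumes a C++26 toolchain that ships `<text_encoding>`; the helper names `literal_encoding_name` and `environment_encoding_name` are hypothetical, and the fallback strings are illustrative placeholders for toolchains without the header.

```cpp
#include <string>
#if __has_include(<text_encoding>)
#include <text_encoding>
#endif

// Hypothetical helper: name of the ordinary literal encoding.
// literal() is evaluated at compile time.
std::string literal_encoding_name() {
#ifdef __cpp_lib_text_encoding
    constexpr std::text_encoding enc = std::text_encoding::literal();
    const std::string name = enc.name();
    if (name.empty()) return "unknown";
    return name;
#else
    return "unavailable";  // assumption: no <text_encoding> on this toolchain
#endif
}

// Hypothetical helper: name of the execution environment's encoding.
// environment() is a runtime query.
std::string environment_encoding_name() {
#ifdef __cpp_lib_text_encoding
    const std::text_encoding enc = std::text_encoding::environment();
    const std::string name = enc.name();
    if (name.empty()) return "unknown";
    return name;
#else
    return "unavailable";  // assumption: no <text_encoding> on this toolchain
#endif
}
```

The two answers can differ: the literal encoding is fixed when the program is compiled, while the environment encoding depends on where the program runs.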
# Quick Map
| If you need to... | Start with | Why |
|---|---|---|
| Name or compare a text encoding | text_encoding | It is the standard-library type dedicated to encoding identity and comparison. |
| Ask about the ordinary literal encoding or the environment encoding | literal and environment | These entry points expose the predefined encodings that matter most when crossing between source text and runtime behavior. |
| Get stable names, ids, or aliases for an encoding | name, id, mib, aliases | These pages cover the metadata surfaces used to describe or compare encodings programmatically. |
| Work with actual text storage, lifetime, or slicing | Strings library | The text hub does not own containers or views; that belongs to the strings library. |
| Match, search, or replace text patterns | Regular expressions library | Regex owns pattern matching; text encoding identity is only adjacent context there. |
| Format text for output | format | Formatting chooses presentation; the text hub mainly clarifies encoding boundaries and naming. |
# Text Encoding Surfaces
| Surface | Primary destinations | Use it for |
|---|---|---|
| Core encoding object | text_encoding, hash, comparison | Representing and comparing encoding identities in code. |
| Environment and literal categories | environment, literal, environment_is | Checking which encodings are associated with runtime or literal contexts. |
| Metadata and naming | name, comp-name, id, mib | Obtaining registered names, comparing names under the alias-matching rules (the exposition-only comp-name), and reading identifier forms for an encoding. |
| Alias views | aliases, aliases_view | Working with alternative naming forms associated with one encoding. |
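
The surfaces in this table can be exercised together. The sketch below assumes a C++26 toolchain with `<text_encoding>` and compiles to nothing otherwise; `names_utf8`, `alias_names`, and `encoding_hash` are hypothetical helper names chosen for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <vector>
#if __has_include(<text_encoding>)
#include <text_encoding>
#endif

#ifdef __cpp_lib_text_encoding
// Hypothetical helper: does a user-supplied label such as "utf8" name UTF-8?
bool names_utf8(std::string_view label) {
    // The string_view constructor requires at most max_name_length
    // characters and no embedded NUL, so reject such labels first.
    if (label.size() > std::text_encoding::max_name_length) return false;
    if (label.find('\0') != std::string_view::npos) return false;
    return std::text_encoding(label) == std::text_encoding::id::UTF8;
}

// Hypothetical helper: copy the names exposed by the aliases view.
std::vector<std::string> alias_names(const std::text_encoding& enc) {
    std::vector<std::string> out;
    for (const char* alias : enc.aliases()) out.emplace_back(alias);
    return out;
}

// Hypothetical helper: hash value, as used by unordered containers.
std::size_t encoding_hash(const std::text_encoding& enc) {
    return std::hash<std::text_encoding>{}(enc);
}
#endif
```

Comparison works on identity rather than spelling, so `names_utf8("utf8")` and `names_utf8("UTF-8")` are expected to agree.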
# Text Vs. String Vs. Locale Vs. Regex
| If your real question is about... | Go here | Why |
|---|---|---|
| Encoding identity or environment/literal encoding categories | text_encoding | This is the one place in the standard library text area dedicated to encoding identity itself. |
| Owning text, string views, conversions, or character traits | Strings library | String storage and manipulation live in the strings library, not in `/cpp/text/`. |
| Localization, collation, facets, and cultural formatting behavior | locale | Locale owns culture-sensitive text behavior beyond raw encoding identity. |
| Pattern matching and replacement | Regular expressions library | Regex owns searching, matching, captures, and tokenization. |
| Literal prefixes, source/execution encodings, or character-set rules in the language | Character sets and string literals | Those are core-language topics rather than library navigation topics. |
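
To illustrate the split between storage and identity, here is a small sketch. The strings side holds bytes without knowing their encoding; the text side answers which encoding is in effect. Both function names are hypothetical, the counting routine assumes well-formed UTF-8 input, and the `false` fallback is an illustrative choice for toolchains without `<text_encoding>`.

```cpp
#include <cstddef>
#include <string_view>
#if __has_include(<text_encoding>)
#include <text_encoding>
#endif

// Strings side: a string_view carries bytes and a length, nothing more.
// Counts code points by skipping UTF-8 continuation bytes (10xxxxxx).
// Assumption: the caller already knows the bytes are well-formed UTF-8.
std::size_t count_utf8_code_points(std::string_view bytes) {
    std::size_t count = 0;
    for (unsigned char b : bytes) {
        if ((b & 0xC0) != 0x80) ++count;
    }
    return count;
}

// Text side: encoding identity decides whether that assumption holds
// for ordinary string literals in this translation unit.
bool ordinary_literals_are_utf8() {
#ifdef __cpp_lib_text_encoding
    return std::text_encoding::literal() == std::text_encoding::id::UTF8;
#else
    return false;  // cannot be determined without <text_encoding>
#endif
}
```

A caller might check `ordinary_literals_are_utf8()` once and only then apply `count_utf8_code_points` to literal data.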
# Standard Evolution
| Standard | Navigation note |
|---|---|
| Before C++26 | Most text-navigation questions led to the string, locale, or language-level encoding pages, because the standard library had no dedicated type for encoding identity. |
| C++26 | text_encoding turns encoding identity into an explicit library destination, making `/cpp/text/` a real route instead of just an adjacency label. |
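
Code that must build both before and after C++26 can test for the feature rather than assume it. A minimal sketch, using the library feature-test macro `__cpp_lib_text_encoding`; the function name `has_text_encoding` is hypothetical:

```cpp
#if __has_include(<version>)
#include <version>  // defines library feature-test macros where supported
#endif

// Hypothetical helper: true when the standard library reports
// std::text_encoding support (macro value 202306L or later).
constexpr bool has_text_encoding() {
#if defined(__cpp_lib_text_encoding) && __cpp_lib_text_encoding >= 202306L
    return true;
#else
    return false;
#endif
}
```

Because the function is `constexpr`, its result can also drive `if constexpr` branches or a `static_assert`.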
# Practical Routes
I need the encoding object itself
Start here for the core API, comparisons, and metadata attached to a concrete encoding identity.
I am really asking about character sets
Go here when the problem is source code, execution encodings, or literal interpretation rules.
I need text containers or views
Use the strings hub for text ownership, slicing, conversions, and character-traits-oriented APIs.
I need search or matching
If the task is pattern matching instead of encoding identity, regex is the correct home.
# Boundary Lines
| This hub covers | This hub does not try to cover |
|---|---|
| Encoding identity, encoding metadata, and where text-related navigation should branch next. | General string manipulation, locale facet design, I/O formatting APIs, or regex algorithm details. |