Lesson 3: Type extensibility
Objective: Describe the usefulness of adding user-defined types to C++
C++ Type Extensibility (User-Defined Types and Abstract Data Types Explained)
Type extensibility is the ability to add user-defined types to C++ so that new types are as easy to use as native types. Lesson 2 established the four pillars of OOP (encapsulation, inheritance, polymorphism, and abstraction) and introduced classes as the mechanism for creating user-defined types. This lesson examines why that capability matters: what an Abstract Data Type is, what distinguishes a well-designed class from a poorly designed one, and how the C++ type system allows user-defined types to participate in expressions, operators, and literal syntax much as native types do. The standard library itself, with std::string, std::vector, and std::map, is the most familiar demonstration of type extensibility. Every time a programmer uses + to concatenate strings or [] to index a vector, they are benefiting from user-defined types that read like native types at the point of use.
What Type Extensibility Means in C++
Native Types vs User-Defined Types
C++ provides a set of native fundamental types — int, double, char, bool, and their variants — that the language and hardware support directly. These types participate in arithmetic expressions, can be assigned, compared, passed to functions, and returned from functions. Type extensibility means that programmer-defined types — classes — can participate in all the same contexts with equal syntactic convenience. A Fraction class that supports +, -, *, and / can appear in arithmetic expressions exactly like double. A Matrix class that supports * for matrix multiplication can be multiplied with the same operator syntax as two integers. The user-defined type becomes a first-class citizen of the type system.
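As a rough illustration of the Fraction idea described above, the following sketch shows one way such a class might look. The class name, its member names, and the helper fraction_demo() are hypothetical and chosen only for this example; overflow of the long numerator and denominator is not handled.

```cpp
#include <stdexcept>

// Hypothetical Fraction type for illustration; not a standard library class.
class Fraction {
public:
    // Deliberately non-explicit so that an integer converts to a Fraction.
    Fraction(long n = 0, long d = 1) : num_{n}, den_{d} {
        if (den_ == 0)
            throw std::invalid_argument("Fraction: denominator must be nonzero");
        normalize();
    }

    long numerator() const noexcept { return num_; }
    long denominator() const noexcept { return den_; }

    friend Fraction operator+(const Fraction& a, const Fraction& b) {
        return Fraction(a.num_ * b.den_ + b.num_ * a.den_, a.den_ * b.den_);
    }
    friend Fraction operator-(const Fraction& a, const Fraction& b) {
        return Fraction(a.num_ * b.den_ - b.num_ * a.den_, a.den_ * b.den_);
    }
    friend Fraction operator*(const Fraction& a, const Fraction& b) {
        return Fraction(a.num_ * b.num_, a.den_ * b.den_);
    }
    friend Fraction operator/(const Fraction& a, const Fraction& b) {
        // Dividing by a zero fraction reaches the constructor check and throws.
        return Fraction(a.num_ * b.den_, a.den_ * b.num_);
    }
    friend bool operator==(const Fraction& a, const Fraction& b) noexcept {
        // Both operands are stored in lowest terms, so memberwise comparison works.
        return a.num_ == b.num_ && a.den_ == b.den_;
    }

private:
    static long gcd(long a, long b) noexcept {
        if (a < 0) a = -a;
        if (b < 0) b = -b;
        while (b != 0) { const long t = a % b; a = b; b = t; }
        return a;
    }
    // Invariant: den_ > 0 and the fraction is in lowest terms.
    void normalize() noexcept {
        if (den_ < 0) { num_ = -num_; den_ = -den_; }
        const long g = gcd(num_, den_);  // den_ != 0, so g >= 1
        num_ /= g;
        den_ /= g;
    }

    long num_;
    long den_;
};

// Usage: Fraction values appear in expressions with the usual precedence.
Fraction fraction_demo() {
    const Fraction a{1, 2}, b{1, 3}, c{3, 4};
    return a + b * c;  // 1/2 + (1/3 * 3/4) = 3/4
}
```

Because the operators are ordinary functions found by the compiler during overload resolution, a + b * c is parsed with the same precedence rules that apply to double; only the meaning of each operator is supplied by the class.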
The Standard Library as Type Extensibility in Action
Most of the C++ standard library is implemented as user-defined types written in C++ itself, although a few components (std::initializer_list, for example) rely on compiler support. std::string is a class, not a native type; formally it is an alias for the class template specialization std::basic_string<char>. std::vector is a class template. std::map, std::shared_ptr, and std::optional are likewise class templates, all user-defined. Yet they are used with the same ease as fundamental types:
std::string greeting = "Hello";
std::string name = "World";
std::string message = greeting + ", " + name + "!"; // operator+ on user-defined type
std::vector<int> numbers = {3, 1, 4, 1, 5, 9};
int first = numbers[0]; // operator[] on user-defined type
numbers.push_back(2);
// Both work exactly like native types in expressions, loops, and function calls
This is the practical meaning of type extensibility: the programmer who uses std::string does not need to know that it is a user-defined class rather than a native type. The interface is as natural and easy to use as int.
User-Defined Literals — Types as First-Class Citizens
C++11 introduced user-defined literals, which allow user-defined types to participate in literal syntax, the same syntax used for native type literals like 42, 3.14, and 'A'. By defining a literal operator (a function named operator"" followed by a suffix such as _km), a programmer can write 42_km to construct a Distance object, 3.5_s to construct a Duration, or "hello"_bs to construct a BoundedString. Suffixes that do not begin with an underscore are reserved for the standard library. At the point of use, the user-defined type reads like a native literal:
// User-defined literals for a Distance type
Distance d1 = 42_km;
Distance d2 = 100_m;
Distance total = d1 + d2; // operator+ combines distances correctly
// std::chrono uses this mechanism in the standard library
using namespace std::chrono_literals;
auto timeout = 500ms; // std::chrono::milliseconds(500)
auto delay = 2s; // std::chrono::seconds(2)
The standard library's std::chrono duration literals (ms, s, min, h), available since C++14, are the most widely used example. Type extensibility reaches its fullest expression when a user-defined type is not just usable but reads like a native type wherever it appears.
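The Distance snippet above uses 42_km and 100_m without showing where those suffixes come from. A minimal sketch of how they might be defined follows; the Distance class, its meters() accessor, and the helper trip_total() are assumptions made for this example, not part of any library.

```cpp
// Hypothetical Distance type that stores its value in meters.
class Distance {
public:
    explicit constexpr Distance(double meters) noexcept : meters_{meters} {}

    constexpr double meters() const noexcept { return meters_; }

    friend constexpr Distance operator+(Distance a, Distance b) noexcept {
        return Distance(a.meters_ + b.meters_);
    }

private:
    double meters_;
};

// Literal operators. Integer literals arrive as unsigned long long and
// floating-point literals as long double; those parameter types are fixed
// by the language. The leading underscore marks a user-defined suffix.
constexpr Distance operator""_m(unsigned long long value) noexcept {
    return Distance(static_cast<double>(value));
}
constexpr Distance operator""_km(unsigned long long value) noexcept {
    return Distance(static_cast<double>(value) * 1000.0);
}
constexpr Distance operator""_km(long double value) noexcept {
    return Distance(static_cast<double>(value) * 1000.0);
}

// Usage: the literals combine through operator+ like any other Distance.
constexpr Distance trip_total() {
    return 42_km + 100_m;  // 42000 m + 100 m
}
```

Storing a single unit internally is one possible design choice; it keeps operator+ trivial and pushes all unit conversion into the literal operators.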
Abstract Data Types — The Goal of User-Defined Types
The ADT Definition
An Abstract Data Type (ADT) is an extension to the native types available in C++. It consists of a set of values and a collection of operations that can act on those values. An ADT is a description of the ideal public behavior of the type. A data type is called an abstract data type if the programmers who use the type do not have access to the details of how the values and operations are implemented. The abstraction is in the separation: the ADT defines what the type does, not how it does it. Callers interact with the what; the implementation handles the how.
C++20 Concepts express a closely related idea in the language itself. A concept is a compile-time constraint that names the operations a type must support without saying how those operations are implemented. A concept like std::ranges::range defines what it means to be a range (a type for which std::ranges::begin and std::ranges::end are valid) without constraining the internal representation. Concepts check that the required operations exist; the behavior behind them remains a documented contract rather than something the compiler verifies. In this sense a concept captures the interface side of the ADT idea as a language feature rather than only a design principle.
int as an ADT — The Key Insight
The predefined types such as int are abstract data types. You do not know how the operations such as + and * are implemented for the type int. Even if you did know — even if you understood exactly how the CPU's ALU performs two's complement integer addition — you would not use this information in any C++ program. The interface (+, -, *, /, comparisons) is what you depend on; the implementation (hardware instruction encoding, register allocation) is completely hidden. This is the model that every well-designed user-defined type should aspire to: an interface that is so natural and complete that users never need to think about the implementation. The user-defined type succeeds as an ADT when callers use it the same way they use int — without ever looking behind the interface.
The Interface vs Implementation Separation
The ADT contract has two sides. The interface defines what the type can do — the public member functions, operators, and conversions that callers can invoke. The implementation defines how it does it — the private data members, the algorithms, the memory management strategy. Callers depend only on the interface; the implementation can change completely without affecting a single line of calling code. A string class might initially store characters in a fixed-size array, then be reimplemented to use dynamic allocation, then reimplemented again to use SSO (Small String Optimization) — and none of these changes affect callers, because callers never accessed the internal storage directly. This separation is what makes ADTs maintainable at scale: the interface is a stable contract, and the implementation is a changeable detail.
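The separation described above can be sketched with two interchangeable implementations of the same small interface. Everything here is hypothetical: the namespaces v1 and v2 stand for two successive versions of an imagined Name class, and count_vowels() stands for calling code that was written against the interface only.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

namespace v1 {
// First version: characters live in a fixed-size array inside the object.
// Text longer than kCapacity is truncated, an implementation limit that
// is visible in the public behavior.
class Name {
public:
    explicit Name(const char* text) : len_{0} {
        while (len_ < kCapacity && text[len_] != '\0') {
            buf_[len_] = text[len_];
            ++len_;
        }
    }
    std::size_t length() const noexcept { return len_; }
    char at(std::size_t i) const {
        if (i >= len_) throw std::out_of_range("Name::at");
        return buf_[i];
    }

private:
    static constexpr std::size_t kCapacity = 32;
    char buf_[kCapacity];
    std::size_t len_;
};
}  // namespace v1

namespace v2 {
// Second version: same public interface, storage delegated to std::string.
class Name {
public:
    explicit Name(const char* text) : data_(text) {}
    std::size_t length() const noexcept { return data_.size(); }
    char at(std::size_t i) const { return data_.at(i); }  // throws if out of range

private:
    std::string data_;
};
}  // namespace v2

// Calling code uses only length() and at(), so it compiles unchanged
// against either version.
template <class NameType>
std::size_t count_vowels(const NameType& name) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < name.length(); ++i) {
        switch (name.at(i)) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
                ++count;
                break;
            default:
                break;
        }
    }
    return count;
}
```

In a real code base the two versions would not coexist in namespaces; the class would simply be reimplemented in place, and callers such as count_vowels() would be recompiled without edits.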
What Makes a Class a True ADT
All Data Members Private
The first requirement of a true ADT is that all data members are private. If a data member is public, callers can read and write it directly — bypassing any invariant checking, bypassing any controlled update logic, and creating a dependency on the internal representation that prevents the implementation from changing. A class with public data members is not an ADT; it is a C struct with methods attached. Programmer-defined types, such as structure types and class types, are not automatically ADTs. Unless they are defined and used with care, programmer-defined types can be used in unintuitive ways that make a program difficult to understand and difficult to modify. The best way to avoid these problems is to make sure all the data types you define are ADTs.
The Public Interface Expresses the Abstraction
The public member functions of an ADT should express operations that make sense for the abstraction — not operations that reflect how the implementation works. A stack ADT exposes push(), pop(), top(), empty(), and full(). It does not expose getInternalArrayPointer() or setTopIndex() — those are implementation details that have no meaning at the level of the abstraction. The public interface is the vocabulary of the abstraction; if a function name only makes sense to someone who knows how the class is implemented, it does not belong in the public interface.
The Class Invariant Must Be Maintained
The way to make a class an ADT in C++ is to use classes with private data and carefully designed public member functions — but not every class is an ADT. To make the class an ADT, you must define the class so that every public member function maintains the class invariant — the condition that must be true for every object of the type to be in a valid state. For a stack, the invariant is that the internal index is always within [-1, capacity-1]. For a BoundedString, the invariant is that the stored string never exceeds max_length characters. Every public function that modifies the object must leave it in a state satisfying the invariant; every public function that reads the object can rely on the invariant being true when it is called.
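A minimal sketch of the stack mentioned above shows how each public function can preserve the invariant. The class name IntStack, the choice of int elements, and the use of exceptions to reject invalid operations are assumptions made for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical fixed-capacity stack of int.
// Invariant: -1 <= top_ <= capacity - 1, where top_ == -1 means "empty".
class IntStack {
public:
    explicit IntStack(std::size_t capacity) : items_(capacity), top_{-1} {
        if (capacity == 0)
            throw std::invalid_argument("IntStack: capacity must be positive");
    }

    bool empty() const noexcept { return top_ == -1; }
    bool full() const noexcept {
        return static_cast<std::size_t>(top_ + 1) == items_.size();
    }

    // Refuses to move top_ past capacity - 1.
    void push(int value) {
        if (full()) throw std::overflow_error("IntStack: push on full stack");
        ++top_;
        items_[static_cast<std::size_t>(top_)] = value;
    }

    // Refuses to move top_ below -1.
    void pop() {
        if (empty()) throw std::underflow_error("IntStack: pop on empty stack");
        --top_;
    }

    // Reads rely on the invariant: when not empty, top_ is a valid index.
    int top() const {
        if (empty()) throw std::underflow_error("IntStack: top on empty stack");
        return items_[static_cast<std::size_t>(top_)];
    }

private:
    std::vector<int> items_;  // storage; its size is the capacity
    std::ptrdiff_t top_;      // index of the top element, or -1
};
```

The public interface contains only push(), pop(), top(), empty(), and full(). Nothing exposes items_ or top_, so no caller can place the index outside the permitted range.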
The String ADT — A Concrete Example
What the User of a String Knows
An example of a simple ADT is the string. The user of a string knows that operations such as concatenate or print result in certain public behavior. A concrete implementation of the ADT also has implementation limits — for example, strings might be limited in size — and these limits affect public behavior. The user knows the interface: concatenation with +, length with .length(), character access with [], comparison with ==. These are the operations the abstraction provides.
What the User Does Not Need to Know
The internal or private details of the implementation do not directly affect the user's understanding. For example, a string is frequently implemented as an array. The internal base address of this array and its name should be of no direct consequence to the user. Whether std::string stores short strings directly in the object (SSO) or allocates them on the heap, whether it uses reference counting or value semantics, whether the buffer is null-terminated or length-prefixed — none of this affects how the string is used. The ADT abstracts these details away completely.
A BoundedString ADT in C++23
If we want to create a string type that differs from the string type already available in C++, we can easily do so. For example, we could create a variation on the string type that has a certain length limit and has the ability to print itself out backwards and capitalize every other letter. The following BoundedString class implements this as a true ADT — private data, public interface, invariant enforcement:
#include <string>
#include <stdexcept>
#include <algorithm>
#include <cctype>
#include <iostream>

class BoundedString {
public:
    explicit BoundedString(std::size_t max_length)
        : max_length_{max_length}
    {
        if (max_length_ == 0)
            throw std::invalid_argument("max_length must be positive");
    }

    // Append characters; enforces the length limit (the invariant)
    void append(const std::string& text) {
        if (data_.size() + text.size() > max_length_)
            throw std::length_error("BoundedString: length limit exceeded");
        data_ += text;
    }

    // Print the string backwards, a behavior unique to this ADT
    void print_reversed() const {
        std::string rev{data_.rbegin(), data_.rend()};
        std::cout << rev << '\n';
    }

    // Capitalize every other character (0-indexed: positions 1, 3, 5, ...)
    [[nodiscard]] std::string alternate_caps() const {
        std::string result = data_;
        for (std::size_t i = 1; i < result.size(); i += 2)
            result[i] = static_cast<char>(std::toupper(
                static_cast<unsigned char>(result[i])));
        return result;
    }

    [[nodiscard]] std::size_t length() const noexcept { return data_.size(); }
    [[nodiscard]] std::size_t max_length() const noexcept { return max_length_; }
    [[nodiscard]] bool empty() const noexcept { return data_.empty(); }
    [[nodiscard]] bool full() const noexcept {
        return data_.size() == max_length_;
    }

private:
    std::string data_;        // internal representation, invisible to callers
    std::size_t max_length_;  // the length limit that defines this ADT
};

// Usage: callers use the public interface, never the private data
int main() {
    BoundedString bs{20};
    bs.append("hello world");
    bs.print_reversed();                       // prints: dlrow olleh
    std::cout << bs.alternate_caps() << '\n';  // prints: hElLo wOrLd
    // bs.data_ = "hack";                      // compile error: data_ is private
}
Programmer-Defined Types That Are Not ADTs
The Risk of Unintuitive Access
The contrast between an ADT and a non-ADT is most visible in C's approach to data structures. In C, a struct is a plain data aggregate whose fields are all public, so any function in the program can read or write any field directly:
// C-style: NOT an ADT, because every field is publicly accessible
struct BoundedStr_C {
    char data[21];      // directly accessible: caller can write past the bound
    size_t max_length;  // directly accessible: caller can change the limit
    size_t length;      // directly accessible: caller can set an invalid length
};
// Any caller can corrupt the invariant:
struct BoundedStr_C s;  // the struct keyword is required in C, optional in C++
s.length = 9999;        // no compile error; the invariant is silently broken, and
                        // code that trusts length will read past the end of data
// C++ class as ADT: private data enforces the contract
// BoundedString bs{20};
// bs.data_ = "hack";   // compile error: the invariant cannot be violated this way
The C struct lets callers corrupt the structure's internal state without any diagnostic. With the C++ class, the same direct write to private data is rejected at compile time.
How Access Specifiers Enforce the ADT Contract
Access specifiers — public, private, protected — are the C++ language mechanism for enforcing the ADT contract. Private data members guarantee that the only way to change the object's state is through the public member functions, each of which maintains the invariant. The compiler enforces this at compile time — there is no runtime cost and no possibility of accidental bypass. This is the C++ answer to the question of how to make a class a true ADT: use private for all data members and design the public interface to express the abstraction cleanly and completely. Lessons 9 through 11 of this module examine access specifiers, the ADT interface, and the black box principle in detail.
Type Extensibility and the C++ Type System
Operator Overloading — Types in Expressions
One of the most powerful mechanisms of type extensibility is operator overloading — defining what +, -, *, ==, <<, and other operators mean for a user-defined type. A Fraction class with overloaded arithmetic operators can be used in expressions like a + b * c with the same natural syntax as integers. A Matrix class with overloaded * enables readable linear algebra code. The << operator overloaded for output allows user-defined types to work with std::cout exactly like int and double. Operator overloading is what completes type extensibility — it allows user-defined types to be not just usable but syntactically natural in every context where native types appear.
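To make the stream-output and comparison cases concrete, here is a small sketch using a hypothetical two-dimensional vector type. The names Vec2 and describe_sum() are invented for this example, and the output format "(x, y)" is an arbitrary choice.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical 2D integer vector with private data and overloaded operators.
class Vec2 {
public:
    Vec2(int x, int y) noexcept : x_{x}, y_{y} {}

    friend Vec2 operator+(const Vec2& a, const Vec2& b) noexcept {
        return Vec2(a.x_ + b.x_, a.y_ + b.y_);
    }
    friend bool operator==(const Vec2& a, const Vec2& b) noexcept {
        return a.x_ == b.x_ && a.y_ == b.y_;
    }
    friend bool operator!=(const Vec2& a, const Vec2& b) noexcept {
        return !(a == b);
    }
    // Stream insertion: returns the stream so that calls can be chained.
    friend std::ostream& operator<<(std::ostream& os, const Vec2& v) {
        return os << '(' << v.x_ << ", " << v.y_ << ')';
    }

private:
    int x_;
    int y_;
};

// Usage: Vec2 values are written to a stream in one chained expression,
// exactly as int values would be.
std::string describe_sum(const Vec2& a, const Vec2& b) {
    std::ostringstream out;
    out << a << " + " << b << " = " << (a + b);
    return out.str();
}
```

Because operator<< takes any std::ostream, the same overload works with std::cout, file streams, and string streams without further code.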
std::string as the Canonical ADT
std::string is the canonical example of a successful ADT in C++. Callers use concatenation (+), length (.length()), character access ([]), comparison (==, <), and stream output (<<) without knowing anything about the internal storage. Implementations have changed significantly over time: many adopted SSO, C++11 tightened the requirements so that copy-on-write implementations no longer conform, and C++20 made std::string usable in constexpr contexts. Some of these changes altered the binary layout and required recompilation, but source code that used only the public interface kept working. This is the practical payoff of the ADT discipline: implementations can evolve, be optimized, and be replaced entirely, and code written against the interface is largely unaffected.
In the next lesson, you will learn about encapsulation — how packing data and behavior together in a class creates the foundation for all of the ADT principles covered in this lesson.
