Obtaining type name strings

TL;DR:

Getting the type names as strings can be useful: for debugging purposes, for registering classes in factories, etc.

There are two methods of obtaining those names:

  1. At runtime, using typeid(T).name() and __cxa_demangle().
  2. At compile-time, using std::source_location::current().function_name().

Those are described in more detail below.

Both produce compiler-dependent names. (1) gives the same names on GCC and Clang, but MSVC uses a different format (this includes Clang when in MSVC-compatible mode). (2) gives prettier, but even more compiler-dependent names.

To get names that are more consistent across compilers, you can use my library cppdecl, which applies some heuristics to normalize those names as much as possible.

When reflection gets implemented, we’ll have another way of obtaining compile-time names, as an alternative to (2).

At runtime, using typeid(T).name()

This is the first thing that people see when googling the topic, and in line with C++ traditions it does the wrong thing by default:

#include <iostream>
#include <typeinfo>

int main()
{
    std::cout << typeid(int).name() << '\n';
}

This prints:

  • "i" on GCC and Clang.
  • "int" on MSVC (and on Clang when in MSVC-compatible mode; everywhere in this section about typeid, when I say “MSVC”, I imply also Clang in MSVC-compat mode).

Is i the first letter of the type? No. Let’s look at some other names:

  • MyClass (assuming class MyClass {};) gives "7MyClass" on GCC/Clang, "class MyClass" on MSVC.
  • std::string gives "NSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE" on GCC/Clang and "class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >" on MSVC.

What GCC and Clang print are the mangled names, which aren’t intended to be human-readable, and are used for some internal purposes (such as for type comparisons for dynamic_casts in some cases).

On MSVC, you can use the non-standard typeid(T).raw_name() to get mangled names, which use a different format. Here’s one for std::string: ".?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@".

If you just need some strings e.g. as map keys (not necessarily human-readable), you might be tempted to use mangled names, but in those cases std::type_index is usually a better option.
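To illustrate the std::type_index suggestion, here is a minimal sketch of a map keyed by type. The TypeLabels class and its Set()/Find() members are made up for this example; only std::type_index and typeid are standard.

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

// Hypothetical example class: associates a user-chosen label with a type.
// `std::type_index` is hashable and comparable, so it works as a map key directly.
class TypeLabels
{
    std::unordered_map<std::type_index, std::string> labels;

  public:
    template <typename T>
    void Set(std::string label)
    {
        labels[std::type_index(typeid(T))] = std::move(label);
    }

    // Returns null if no label was registered for `T`.
    template <typename T>
    [[nodiscard]] const std::string *Find() const
    {
        auto it = labels.find(std::type_index(typeid(T)));
        return it == labels.end() ? nullptr : &it->second;
    }
};
```

After labels.Set<int>("integer"), calling labels.Find<int>() returns a pointer to "integer", and no mangled or demangled name is involved at any point.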

Demangling the names

How do you get human-readable names on GCC and Clang? You demangle them.

There are command-line utilities for this:

  • Clang’s llvm-cxxfilt.
  • c++filt from binutils.

Running llvm-cxxfilt -t "NSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE" (or c++filt -t ...) prints std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (std::string is a typedef for this type).

But calling those from your program just to demangle the names isn’t the best idea.

GCC/Clang come with a demangling library. Include <cxxabi.h> and call abi::__cxa_demangle().

Here is a little wrapper class for it:

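#include <cstddef> // std::size_t
#include <cstdlib> // std::free
#include <utility> // std::swap
#ifndef _MSC_VER
#include <cxxabi.h> // abi::__cxa_demangle
#endif

class Demangler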
{
    #ifndef _MSC_VER
    // We keep those here so that `abi::__cxa_demangle()` can reuse its buffer between calls.
    char *buf_ptr = nullptr;
    std::size_t buf_size = 0;
    #endif

  public:
    constexpr Demangler() {}

    #ifndef _MSC_VER
    Demangler(Demangler &&other) noexcept
        : buf_ptr(other.buf_ptr), buf_size(other.buf_size)
    {
        other.buf_ptr = nullptr;
        other.buf_size = 0;
    }

    Demangler &operator=(Demangler other) noexcept
    {
        // Using the copy&swap idiom.
        std::swap(buf_ptr, other.buf_ptr);
        std::swap(buf_size, other.buf_size);
        return *this;
    }

    ~Demangler()
    {
        std::free(buf_ptr); // Does nothing if `buf_ptr` is null.
    }
    #endif

    // Demangles the `typeid(...).name()` if necessary, or returns `name` as is if this platform doesn't mangle the type names.
    // The resulting pointer is either owned by this `Demangler` instance (and is reused on the next call), or is just `name` as is.
    // So for portable use ensure that both the `Demangler` instance and the `name` stay alive as long as you use the resulting pointer.
    // Returns null on demangling failure, but I've never seen that happen.
    [[nodiscard]] const char *operator()(const char *name)
    {
        #ifdef _MSC_VER
        return name;
        #else
        int status = -4; // Some custom error code, in case `abi::__cxa_demangle()` doesn't modify it at all for some reason.
        char *new_ptr = abi::__cxa_demangle(name, buf_ptr, &buf_size, &status);
        if (status != 0 || !new_ptr) // -1 = out of memory, -2 = invalid string, -3 = invalid usage
            return nullptr; // Unable to demangle. Keep the old `buf_ptr`: it isn't freed on failure, so overwriting it would leak.
        buf_ptr = new_ptr;
        return buf_ptr;
        #endif
    }
};

And now you can do this:

Demangler d;
std::cout << d(typeid(int).name()) << '\n'; // Prints `int`.

This class is written to return the string unchanged on MSVC, so you can use it unconditionally on all compilers.
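If you don’t care about reusing the buffer between calls, a simpler one-shot function might be enough. This is only a sketch: DemangleOnce and FreeDeleter are names I made up for it, and unlike the class above it copies the result into a std::string and falls back to the original name on failure instead of returning null.

```cpp
#include <cstdlib>
#include <memory>
#include <string>
#include <typeinfo>
#ifndef _MSC_VER
#include <cxxabi.h>
#endif

// Hypothetical helper: frees the `malloc`-ed buffer returned by `abi::__cxa_demangle()`.
struct FreeDeleter
{
    void operator()(char *ptr) const { std::free(ptr); }
};

// Hypothetical helper: demangles `name` once, returns `name` unchanged if demangling fails or isn't needed.
[[nodiscard]] inline std::string DemangleOnce(const char *name)
{
    #ifdef _MSC_VER
    return name; // Already human-readable.
    #else
    int status = -4;
    // Passing null for the buffer and its size makes the function allocate a fresh buffer.
    std::unique_ptr<char, FreeDeleter> buf(abi::__cxa_demangle(name, nullptr, nullptr, &status));
    if (status != 0 || !buf)
        return name;
    return std::string(buf.get());
    #endif
}
```

DemangleOnce(typeid(int).name()) gives "int" on all three compilers. The cost is one allocation for the demangled buffer plus one for the std::string per call, which is usually fine for debugging output.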

Demangled names on different compilers

The names are somewhat portable across compilers, but there are differences.

First of all, MSVC likes to add struct/class/union/enum prefixes. For example, for class MyClass {}; you get:

  • "MyClass" on GCC/Clang.
  • "class MyClass" on MSVC.

You can easily remove those:

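#include <cstddef>
#include <string>
#include <string_view>

// Replace every occurrence of `a` with `b` in `source`, return the new string. `a` must not be empty.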
[[nodiscard]] std::string Replace(std::string_view source, std::string_view a, std::string_view b)
{
    std::string ret;

    std::size_t cur_pos;
    std::size_t last_pos = 0;

    while ((cur_pos = source.find(a, last_pos)) != std::string::npos)
    {
        ret.append(source, last_pos, cur_pos - last_pos);
        ret += b;
        last_pos = cur_pos + a.size();
    }

    ret.append(source, last_pos);
    return ret;
}

// On MSVC, removes the prefixes `struct`/`class`/`union`/`enum` from the string. Elsewhere returns it unchanged.
[[nodiscard]] std::string AdjustType(std::string_view type)
{
    #ifdef _MSC_VER
    std::string ret = Replace(type, "class ", "");
    ret = Replace(ret, "struct ", "");
    ret = Replace(ret, "union ", "");
    ret = Replace(ret, "enum ", "");
    return ret;
    #else
    // This should be just `std::string(type)`, but Clang is bugged: https://github.com/llvm/llvm-project/issues/161071
    return std::string(type.data(), type.data() + type.size());
    #endif
}

And now finally, std::cout << AdjustType(Demangler{}(typeid(MyClass).name())) << '\n'; prints MyClass on all compilers.

This is good enough to give portable names for simple classes, but there are still differences in more complex cases, e.g. templates. For example, std::string gives:

  • On GCC/Clang: "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >".
  • On MSVC: "std::basic_string<char,std::char_traits<char>,std::allocator<char> >".

As you can see, there are still some differences: __cxx11, and spaces after ,s. And it’s a bit sad that this doesn’t just print "std::string".
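Those two specific differences can be ironed out by hand. The following is a rough sketch, not a general solution: NormalizeSimple and EraseAll are made-up names, and the function only strips "__cxx11::" and the space after each comma, nothing else.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: erases every occurrence of `what` from `str`.
inline void EraseAll(std::string &str, std::string_view what)
{
    if (what.empty())
        return; // Nothing to erase, and this avoids an infinite loop.
    std::size_t pos = 0;
    while ((pos = str.find(what, pos)) != std::string::npos)
        str.erase(pos, what.size());
}

// Hypothetical helper: handles only the two differences mentioned above.
[[nodiscard]] inline std::string NormalizeSimple(std::string_view name)
{
    std::string ret(name.data(), name.size());

    // Remove libstdc++'s inline namespace.
    EraseAll(ret, "__cxx11::");

    // Remove the space after each comma, to match the MSVC style.
    std::size_t pos = 0;
    while ((pos = ret.find(", ", pos)) != std::string::npos)
        ret.erase(pos + 1, 1);

    return ret;
}
```

With this, the GCC/Clang string for std::string from the list above turns into the MSVC one. Every new standard library quirk needs another rule like this, which is why this approach quickly gets out of hand.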

Limitations and unique features of typeid

And lastly, typeid has certain quirks/properties that should be kept in mind:

  • typeid ignores cvref (reference-ness, and top-level constness and volatility) of the type. E.g. typeid(const int &).name() is the same as typeid(int).name().

  • typeid can be used at runtime to determine the true type [1] of a polymorphic [2] object. For example:

    struct A {virtual ~A() = default;}; // Need at least one virtual function to make `A` polymorphic.
    struct B : A {};

    A *p = new B;
    std::cout << typeid(*p).name() << '\n'; // Same as `typeid(B).name()`.

    This is something that only typeid can do, and not the alternative approach that’s described next.

    [1] More correctly called its “dynamic type”.
    [2] A polymorphic class is a class that has at least one virtual function.
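The cvref behavior is easy to check by comparing the std::type_info objects directly. SameTypeid below is a made-up helper for this demonstration. Note that only the reference and the top-level qualifiers are dropped, so a pointer to const is still a different type.

```cpp
#include <typeinfo>

// Hypothetical helper: true if `typeid` can't tell `A` and `B` apart.
template <typename A, typename B>
[[nodiscard]] bool SameTypeid()
{
    return typeid(A) == typeid(B);
}

// SameTypeid<const int &, int>()     -> true, the reference and the top-level `const` are ignored.
// SameTypeid<volatile int &&, int>() -> true, same for `volatile` and rvalue references.
// SameTypeid<const int *, int *>()   -> false, this `const` is not top-level.
```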

At compile-time, using function names

There’s a second, completely different approach to getting the names:

#include <iostream>
#include <source_location>
#include <string_view>

template <typename T>
const char *RawTypeName()
{
    return std::source_location::current().function_name();
}

int main()
{
    std::cout << RawTypeName<int>() << '\n';
}

std::source_location::current().function_name() returns the current function name. Turns out that when used in templates, it happens to return their stringified template arguments as well, which we can extract the types from.

The above prints:

  • On MSVC: const char *__cdecl RawTypeName<int>(void)
  • On GCC: const char* RawTypeName() [with T = int]
  • On Clang: const char *RawTypeName() [T = int] (notably it gives almost the same string when in MSVC-compatible mode too, which is unlike typeid(T).name(), where it imitates MSVC)

As you can see, int is included in all those strings, but in different places.

Note that std::source_location is a relatively new addition to the language (added in C++20). Before we had that, we used to use non-standard compiler extensions, like this:

template <typename T>
const char *RawTypeName()
{
    #ifdef _MSC_VER
    return __FUNCSIG__;
    #else
    return __PRETTY_FUNCTION__;
    #endif
}

This being non-standard isn’t actually a big deal, because strictly speaking .function_name() isn’t guaranteed to include the template argument names either. For both of those, if it works in practice (it does), it’s good enough.

Extracting type names from the strings

How do we extract "int" from "const char *RawTypeName() [T = int]"?

Some libraries hardcode the number of characters that need to be removed from those strings (from both sides) to get just the name, but this is error-prone. I’ve seen at least one library that did this on GCC, Clang, and MSVC, but forgot to test Clang’s behavior in MSVC-compatible mode.

A better solution is to determine the number of characters to remove automatically.

This is simple: Take a type with a known name, such as int. Find this name in RawTypeName<int>(), remember its position. Then use this position to find the names in other strings returned by RawTypeName<...>(). Like this:

class MyClass {};

int main()
{
    std::string_view int_name = RawTypeName<int>();
    std::size_t prefix_len = int_name.rfind("int");
    std::size_t suffix_len = int_name.size() - 3 - prefix_len;

    std::string_view myclass_name = RawTypeName<MyClass>();
    std::cout << myclass_name << '\n'; // Prints `const char* RawTypeName() [with T = MyClass]` on GCC.

    myclass_name.remove_prefix(prefix_len);
    myclass_name.remove_suffix(suffix_len);
    std::cout << myclass_name << '\n'; // Prints just `MyClass` (or `class MyClass` on MSVC).
}

Variance between compilers

The type names reported by this method can vary wildly between compilers, and even on the same compiler they can depend on which C++ standard library implementation is used.

For int and class MyClass {};, it behaves the same as the typeid-based approach, meaning that on MSVC only, the latter reports "class MyClass" (notably Clang in MSVC-compat mode omits class, unlike when using typeid).

But if we take std::string, more differences show up:

  • GCC reports "std::__cxx11::basic_string<char>". It has removed the redundant template arguments (which have their default values anyway).

  • Clang normally reports "std::basic_string<char>", so it has further removed the redundant namespace __cxx11 (because it is inline).

    But it can also report just "std::string" if you’re using libc++ (Clang’s own implementation of the C++ standard library), because it has special annotations on std::string, namely __attribute__((__preferred_name__(...))).

  • MSVC reports "class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >", which is the exact same string as it returns from typeid(std::string).name().

    So it’s recommended to call AdjustType() from earlier on those strings.

Extracting the type names at compile-time

The code snippet above can be evaluated at compile-time, and will probably be optimized by the compiler to e.g. compute prefix_len and suffix_len at compile-time. (You can even make them constexpr, if you also do so to int_name and RawTypeName().)

The problem is that unless you go out of your way, the original long strings like "const char *RawTypeName() [T = int]" will still show up in the compiled binary. Which probably isn’t that big of a deal, but it is uncool.

With some constexpr magic, it’s possible to avoid this entirely:

// This reuses `RawTypeName()`, `Replace()`, and `AdjustType()` from earlier.
// All of them need to be marked `constexpr`.

struct RawTypeNameFormat
{
    std::size_t prefix_len = 0;
    std::size_t suffix_len = 0;
};

constexpr RawTypeNameFormat raw_type_name_format = []{
    std::string_view int_name = RawTypeName<int>();
    RawTypeNameFormat ret;
    ret.prefix_len = int_name.rfind("int");
    ret.suffix_len = int_name.size() - 3 - ret.prefix_len;
    return ret;
}();

template <typename T>
constexpr std::string TypeNameLow()
{
    std::string_view name = RawTypeName<T>();
    name.remove_prefix(raw_type_name_format.prefix_len);
    name.remove_suffix(raw_type_name_format.suffix_len);
    return AdjustType(name);
}

template <typename T>
constexpr auto type_name_array = []{
    constexpr std::size_t n = TypeNameLow<T>().size();
    std::string raw = TypeNameLow<T>(); // The function must be called twice, for obscure C++ reasons, at least at the time of writing this.
    std::array<char, n + 1> ret{};
    for (std::size_t i = 0; i < n; i++)
        ret[i] = raw[i];
    return ret;
}();

template <typename T>
[[nodiscard]] constexpr std::string_view TypeName()
{
    const auto &ret = type_name_array<T>;
    return {ret.data(), ret.data() + ret.size() - 1};
}

// ---

class MyClass {};

int main()
{
    std::cout << TypeName<MyClass>() << '\n';
}

This prints MyClass on all three compilers. I’m sure there is some room to optimize the compilation speed here.

If you look at the assembly output on godbolt, you’ll see that indeed, the only string in the binary is "MyClass", there’s no "const char *RawTypeName() [T = int]".
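If you don’t need AdjustType() (that is, you’re fine with the class/struct prefixes on MSVC), one possible simplification is to copy the characters straight from the std::string_view, without going through a constexpr std::string. This is a sketch under that assumption. All names in it (RawName, name_format, SlicedName, name_storage, ShortTypeName) are made up for this example, and it uses the compiler extensions instead of std::source_location so that it also compiles as C++17.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

template <typename T>
constexpr const char *RawName()
{
    #ifdef _MSC_VER
    return __FUNCSIG__;
    #else
    return __PRETTY_FUNCTION__;
    #endif
}

struct NameFormat
{
    std::size_t prefix_len = 0;
    std::size_t suffix_len = 0;
};

// Same trick as before: locate `int` in the string to learn how much to cut from both sides.
constexpr NameFormat name_format = []{
    std::string_view int_name = RawName<int>();
    NameFormat ret;
    ret.prefix_len = int_name.rfind("int");
    ret.suffix_len = int_name.size() - 3 - ret.prefix_len;
    return ret;
}();

template <typename T>
constexpr std::string_view SlicedName()
{
    std::string_view name = RawName<T>();
    name.remove_prefix(name_format.prefix_len);
    name.remove_suffix(name_format.suffix_len);
    return name;
}

// Copies the slice into its own null-terminated array, so that nothing has to point into the long string at runtime.
template <typename T>
constexpr auto name_storage = []{
    constexpr std::string_view name = SlicedName<T>();
    std::array<char, name.size() + 1> ret{};
    for (std::size_t i = 0; i < name.size(); i++)
        ret[i] = name[i];
    return ret;
}();

template <typename T>
[[nodiscard]] constexpr std::string_view ShortTypeName()
{
    return {name_storage<T>.data(), name_storage<T>.size() - 1};
}

static_assert(ShortTypeName<int>() == "int");
```

Since the size comes from a constexpr std::string_view, the function doesn’t have to be called twice here. I haven’t measured whether this compiles faster, so treat it as a starting point rather than a drop-in replacement.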

The cppdecl library

As mentioned at the beginning of the post, I’ve made a library called cppdecl to simplify obtaining type names.

It wraps either typeid(T).name() or the .function_name()-based approach, and then applies a large collection of rewrite rules to the result, to make the names as consistent as possible across compilers. Like in the example above, all those transformations are performed at compile-time.

Reflection

And lastly…

C++26 finally got a feature called reflection, which among other things should make obtaining type names easier.

At the time of writing this, it’s not implemented in any of the major compilers yet. All we have is an experimental Clang fork that supports it.

Here’s a small test program that you can run on that fork: (godbolt link)

#include <iostream>
#include <meta>
#include <string_view>
#include <string>

template <typename T>
[[nodiscard]] consteval std::string_view TypeName()
{
    return std::meta::display_string_of(^^T);
}

int main()
{
    std::cout << TypeName<std::string>() << '\n'; // basic_string<char, char_traits<char>, allocator<char>>
    std::cout << std::meta::display_string_of(^^std::string) << '\n'; // string
}

The standard doesn’t specify the exact names that should be returned, but we can already see that:

  1. You can pass typedefs without expanding them, which wasn’t possible before.
  2. The namespaces are omitted from the name.

(2) looks concerning, but I expect that either this will change, or there will be an alternative way of getting those namespaces.

Thanks for reading!