Windows' API sets — analysis

by thatbakamono on 2025-07-02

§ Introduction

Microsoft introduced the concept of API sets with the release of Windows 7 and expanded their implementation in Windows 8. This initiative aimed to modularize the operating system, making it more flexible and portable. The effort was driven by Microsoft's interest in supporting a broader range of platforms beyond traditional PCs using the same operating system, or at least mostly the same one, since some implementation details would inevitably have to differ between platforms.
Here's what the official documentation says about them:
"All versions of Windows share a common base of operating system (OS) components that's called the core OS (in some contexts this common base is also called OneCore). In core OS components, Win32 APIs are organized into functional groups called API sets.
The purpose of an API set is to provide an architectural separation from the host DLL in which a given Win32 API is implemented, and the functional contract to which the API belongs. The decoupling that API sets provide between implementation and contracts offers many engineering advantages for developers. In particular, using API sets in your code can improve compatibility with Windows devices."
That sounds pretty cool and useful so far, but there's an issue with a rather scary name: backwards compatibility. Microsoft needed to introduce a way to support this new concept without breaking decades of existing software.
In other words, they needed to:
  • Allow linking against umbrella libraries (e.g. OneCore.lib), which internally reference api-ms-* DLLs via a mechanism we'll discover later.
  • Support linking against original libraries like kernel32.dll, so that software written before the API set era continues to work seamlessly.
At the same time, they wanted the flexibility to implement functions wherever needed. Hence the mechanism we'll discover below was developed.

§ Analysis

Let's start by taking a look at the good ol' kernel32.dll in IDA — that's probably gonna tell us something.
We can see lots of functions with the Stub suffix and very few without — that's definitely suspicious. What's going on here?
Let's pick one at random and find out. I'll go with GetLastError, for no particular reason other than that I like it. It's a simple and commonly used function.
We can see that it's exported as GetLastError without the Stub suffix — definitely for compatibility — and that it does nothing besides calling __imp_GetLastError. The name didn't lie; it really is just a stub.
What's __imp_GetLastError though? We haven't seen that function yet, so let's quickly take a peek.
It seems to be part of the imports from api-ms-win-core-errorhandling-l1-1-0.dll. There's no actual implementation of GetLastError in the kernel32.dll though. Seems like it's one of those functions they wanted to move somewhere else.
Well, what are we waiting for? Let's take a look at the api-ms-win-core-errorhandling-l1-1-0, but first we need to find it.
Well... Where is it? As it turns out, this particular one is located in C:\Windows\System32\downlevel, which is where most of those api-ms-* libraries are found on PCs, though it's not a strict rule: some of them, especially the less common or feature-locked ones, may be located elsewhere.
Also, as a side note, the DLL search order still applies here. When an application requests a DLL, Windows follows a specific search order to locate it. This includes checking the application's directory first, then the system directories, and finally the directories listed in the PATH environment variable. You can see this being used below.
Anyway, back to the main topic — the first thing we'll see after opening the api-ms-win-core-errorhandling-l1-1-0 is that there are no symbols whatsoever. That's a little bit concerning, if I'm being honest.
Oh, but there are exports! Let’s have a quick look at them.
Uh oh. There's something called a forwarder, and all of them seem to be pointing back at kernel32! That's infinite recursion, isn't it? What the heck is going on here?
It turns out those libraries are a little bit special. They are not supposed to be treated like regular libraries, and you'll never resolve a function by following their forwarders. Instead, there's a mapping from those empty API set libraries to the actual libraries with real implementations.
This mapping is created at boot time from C:\Windows\System32\apisetschema.dll, plus any additional schemas registered under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\ApiSetSchemaExtensions.
Each of them has a section called .apiset, which defines how those API set libraries are supposed to be translated to the actual libraries.
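For anyone who wants to poke at a schema file directly, here is a minimal sketch of locating a named section such as .apiset in a PE file that has already been read into memory. It relies only on the standard PE layout (e_lfanew at 0x3C, a 24-byte signature plus file header, 40-byte section headers). FindSection and SectionLocation are made-up names for this illustration, and the sketch does not interpret the section's contents.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>
#include <vector>

// Where a section's bytes live inside the file on disk.
struct SectionLocation {
    std::uint32_t raw_offset;  // PointerToRawData
    std::uint32_t raw_size;    // SizeOfRawData
};

// Little-endian reads from the file buffer.
static std::uint32_t ReadU16(const std::vector<std::uint8_t>& d, std::size_t at) {
    return static_cast<std::uint32_t>(d[at]) |
           static_cast<std::uint32_t>(d[at + 1]) << 8;
}
static std::uint32_t ReadU32(const std::vector<std::uint8_t>& d, std::size_t at) {
    return ReadU16(d, at) | ReadU16(d, at + 2) << 16;
}

// Hypothetical helper: walks the PE section table and returns the location
// of the section called `name` (for example ".apiset"), if there is one.
std::optional<SectionLocation> FindSection(const std::vector<std::uint8_t>& image,
                                           std::string_view name) {
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return std::nullopt;
    const std::size_t pe = ReadU32(image, 0x3C);  // e_lfanew
    if (pe > image.size() || image.size() - pe < 24 ||
        std::memcmp(&image[pe], "PE\0\0", 4) != 0)
        return std::nullopt;

    const std::size_t section_count = ReadU16(image, pe + 6);   // NumberOfSections
    const std::size_t optional_size = ReadU16(image, pe + 20);  // SizeOfOptionalHeader
    std::size_t section = pe + 24 + optional_size;              // first section header

    for (std::size_t i = 0; i < section_count; ++i, section += 40) {
        if (section + 40 > image.size())
            return std::nullopt;
        char raw_name[9] = {};  // section names are 8 bytes, NUL-padded
        std::memcpy(raw_name, &image[section], 8);
        if (name == raw_name)
            return SectionLocation{ReadU32(image, section + 20),
                                   ReadU32(image, section + 16)};
    }
    return std::nullopt;
}
```

Calling FindSection(bytes, ".apiset") on the contents of a schema DLL should give the file offset and size of the raw schema data, assuming the file is a well-formed PE image.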
After creation, the mapping resides in the PEB (process environment block). Let's take a look at what the documentation says about it, if there is any.
That's... a lot of reserved fields and nothing that resembles the mappings we are looking for. The documentation will unfortunately not help us here. We'll have to dig deeper. Let's take a look at what WinDbg says.
There's an ApiSetMap at offset 0x68 (which corresponds to Reserved9[0] in the documented structure on x64). That sounds exactly like what we're looking for. Unfortunately, there are no symbols for it anymore. They used to be present around Windows 7 but were removed later. This actually happened quite a few times across different parts of the OS; for example, Microsoft Defender had publicly available symbols until around 2018. As of right now, the structure looks more or less like this:
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
 
struct ApiSetHeader {
    DWORD Version;
    DWORD Size;
    DWORD Flags;
    DWORD Count;
    DWORD NamespaceEntriesOffset;
    DWORD HashEntriesOffset;
    DWORD HashMultiplier;
};
 
struct ApiSetHashEntry {
    DWORD Hash;
    DWORD IndexInEntriesNamespace;
};
 
struct ApiSetNamespaceEntry {
    DWORD Flags;
    DWORD NameOffset;
    DWORD SizeOfApiSetName;
    DWORD SizeOfApiSetNameEx; // size in bytes of the name up to, but not including, the last hyphen (the hashed part)
    DWORD ValueEntriesArrayOffset;
    DWORD HostsNumber;
};
 
struct ApiSetValueEntry {
    DWORD Flags;
    DWORD NameOffset;
    DWORD NameLength;
    DWORD ValueOffset;
    DWORD ValueLength;
};
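Going back to the 0x68 offset for a moment: it can be double-checked against the documented PEB without a debugger. The sketch below mirrors the PEB declaration from winternl.h as it appears in the public documentation, with pointer-sized members written as 64-bit integers and the alignment padding spelled out. DocumentedPeb64 and the Padding fields are names made up for this illustration, and only the members up to Reserved9 are included.

```cpp
#include <cstddef>
#include <cstdint>

// Mirror of the documented PEB as laid out on x64. Pointers are written as
// std::uint64_t and padding is explicit, so the offsets do not depend on the
// machine that compiles this.
struct DocumentedPeb64 {
    std::uint8_t  Reserved1[2];
    std::uint8_t  BeingDebugged;
    std::uint8_t  Reserved2[1];
    std::uint8_t  Padding0[4];         // alignment padding before the first pointer
    std::uint64_t Reserved3[2];
    std::uint64_t Ldr;
    std::uint64_t ProcessParameters;
    std::uint64_t Reserved4[3];
    std::uint64_t AtlThunkSListPtr;
    std::uint64_t Reserved5;
    std::uint32_t Reserved6;
    std::uint32_t Padding1;            // alignment padding after the ULONG
    std::uint64_t Reserved7;
    std::uint32_t Reserved8;
    std::uint32_t AtlThunkSListPtr32;
    std::uint64_t Reserved9[45];       // Reserved9[0] is where ApiSetMap lives
};

static_assert(offsetof(DocumentedPeb64, Reserved9) == 0x68,
              "Reserved9[0] should line up with the ApiSetMap offset");
```

If the assertion compiles, the documented layout and the debugger output agree on where the pointer sits, at least for 64-bit processes.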
ApiSetHeader is what's found at PEB->ApiSetMap. We can see it has two offsets — one pointing to hash entries and the other to namespace entries. The hash entries are what we’ll be interested in first, since they lead to the namespace entries, as we can see by looking at the structure.
Since there are no names anywhere to be seen, to resolve a library we'll first need to compute its hash, which is done as follows:
#include <wctype.h>
 
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
 
// Hashes the first `length` characters of `string`, ignoring case.
static DWORD CalculateHash(PCWSTR string, const DWORD length, const DWORD hashMultiplier) {
    DWORD hash = 0;
 
    for (DWORD i = 0; i < length; i++) {
        // Multiply-and-add over the lowercased characters.
        hash *= hashMultiplier;
        hash += towlower(string[i]);
    }
 
    return hash;
}
where string is the library name without the extension and without the last version component (so for example api-ms-win-security-base-ansi-l1-1, not api-ms-win-security-base-ansi-l1-1-0 nor api-ms-win-security-base-ansi-l1-1-0.dll), length is the number of characters in that trimmed name, and hashMultiplier is simply the value of ApiSetHeader->HashMultiplier.
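To make the naming rule concrete, here is a small portable sketch of the name trimming together with the same hash. HashedPart and HashApiSetName are hypothetical helper names, std::uint32_t stands in for DWORD, and any multiplier passed in by hand is only a placeholder: the real value has to be read from ApiSetHeader->HashMultiplier.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical helper: expects a full name such as
// "api-ms-win-core-io-l1-1-1.dll" and cuts off the extension (if any) and
// the last "-N" version component, leaving the part that gets hashed.
constexpr std::wstring_view HashedPart(std::wstring_view name) {
    if (const auto dot = name.rfind(L'.'); dot != std::wstring_view::npos)
        name = name.substr(0, dot);
    if (const auto hyphen = name.rfind(L'-'); hyphen != std::wstring_view::npos)
        name = name.substr(0, hyphen);
    return name;
}

// Same multiply-and-add hash as above. Only ASCII letters are lowercased,
// which is assumed to be enough for API set names.
constexpr std::uint32_t HashApiSetName(std::wstring_view name,
                                       std::uint32_t multiplier) {
    std::uint32_t hash = 0;
    for (wchar_t c : name) {
        if (c >= L'A' && c <= L'Z')
            c = static_cast<wchar_t>(c - L'A' + L'a');
        hash = hash * multiplier + static_cast<std::uint32_t>(c);
    }
    return hash;
}
```

With these, HashApiSetName(HashedPart(L"api-ms-win-core-errorhandling-l1-1-0.dll"), multiplier) gives the value to look for among the hash entries, provided multiplier comes from the header.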
After we've computed the hash, we have essentially everything we need to perform the lookup. The lookup consists of finding the correct ApiSetHashEntry by searching the hash entries for a matching hash. This entry type doesn't contain anything directly useful on its own, but it includes an index into the namespace entries, and that's where things get a little more interesting.
Each namespace entry includes a Name (stored as NameOffset and SizeOfApiSetName), which is the stub library name without the extension. While this isn't helpful for the lookup itself, it can be useful for debugging, and we'll use it very soon. The most important fields are ValueEntriesArrayOffset and HostsNumber. The first tells us where the array of ApiSetValueEntry begins (relative to ApiSetHeader), and the second tells us how many entries are in that array. They're referred to as hosts because they represent the different possible implementations of the library.
As a side note: the majority of libraries have only a single "host", but some have several:
api-ms-win-core-io-l1-1-1 -> kernel32.dll
api-ms-win-core-io-l1-1-1 -> kernelbase.dll
api-ms-win-core-processthreads-l1-1-7 -> kernel32.dll
api-ms-win-core-processthreads-l1-1-7 -> kernelbase.dll
api-ms-win-core-appinit-l1-1-0 -> kernel32.dll
api-ms-win-core-appinit-l1-1-0 -> kernelbase.dll
api-ms-win-core-util-l1-1-1 -> kernel32.dll
api-ms-win-core-util-l1-1-1 -> kernelbase.dll
api-ms-win-core-processsecurity-l1-1-0 -> kernel32.dll
api-ms-win-core-processsecurity-l1-1-0 -> kernelbase.dll
ext-ms-win-kernel32-errorhandling-l1-1-0 -> kernel32.dll
ext-ms-win-kernel32-errorhandling-l1-1-0 -> faultrep.dll
ApiSetHashEntry entries are sorted by hash in ascending order, which allows implementations to perform binary search on them.
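Since the hash entries are sorted, the lookup described above can be sketched as a plain binary search. The structures are repeated with std::uint32_t in place of DWORD so the snippet stands on its own, and FindNamespaceEntry and HashName are hypothetical helper names. One simplification to keep in mind: the sketch trusts a matching hash and does not compare the stored name afterwards, so it would not notice a hash collision.

```cpp
#include <cstdint>
#include <string_view>

// Same layouts as the structures above, with DWORD written as std::uint32_t.
struct ApiSetHeader {
    std::uint32_t Version, Size, Flags, Count;
    std::uint32_t NamespaceEntriesOffset, HashEntriesOffset, HashMultiplier;
};
struct ApiSetHashEntry {
    std::uint32_t Hash, IndexInEntriesNamespace;
};
struct ApiSetNamespaceEntry {
    std::uint32_t Flags, NameOffset, SizeOfApiSetName, SizeOfApiSetNameEx;
    std::uint32_t ValueEntriesArrayOffset, HostsNumber;
};

// Multiply-and-add hash over the ASCII-lowercased UTF-16 name.
constexpr std::uint32_t HashName(std::u16string_view name, std::uint32_t multiplier) {
    std::uint32_t hash = 0;
    for (char16_t c : name) {
        if (c >= u'A' && c <= u'Z')
            c = static_cast<char16_t>(c - u'A' + u'a');
        hash = hash * multiplier + c;
    }
    return hash;
}

// Hypothetical helper: returns the namespace entry whose hash matches
// `trimmed_name` (no extension, no last version component), or nullptr.
inline const ApiSetNamespaceEntry* FindNamespaceEntry(const ApiSetHeader* header,
                                                      std::u16string_view trimmed_name) {
    // All offsets in the map are relative to the start of the header.
    const auto* base = reinterpret_cast<const unsigned char*>(header);
    const auto* hashes =
        reinterpret_cast<const ApiSetHashEntry*>(base + header->HashEntriesOffset);
    const auto* namespaces =
        reinterpret_cast<const ApiSetNamespaceEntry*>(base + header->NamespaceEntriesOffset);

    const std::uint32_t wanted = HashName(trimmed_name, header->HashMultiplier);

    // Lower-bound binary search over the ascending Hash values.
    std::uint32_t low = 0;
    std::uint32_t high = header->Count;
    while (low < high) {
        const std::uint32_t middle = low + (high - low) / 2;
        if (hashes[middle].Hash < wanted)
            low = middle + 1;
        else
            high = middle;
    }

    if (low == header->Count || hashes[low].Hash != wanted)
        return nullptr;
    return &namespaces[hashes[low].IndexInEntriesNamespace];
}
```

A linear scan over the entries would return the same result; the binary search only pays off because the map holds many entries and is consulted for every API set import.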
That’s enough delving into the details — let’s use this information to print all the records and see what’s actually there.
#include <print>
#include <string>
#include <string_view>
 
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
 
// ThreadEnvironmentBlock is assumed to be a hand-written TEB/PEB definition
// (not shown here) that exposes PEB->ApiSetMap; the ApiSet* structures are
// the ones defined earlier.
int main()
{
    const auto *teb = reinterpret_cast<ThreadEnvironmentBlock*>(NtCurrentTeb());
    const auto *peb = teb->ProcessEnvironmentBlock;
 
    // Every offset in the map is relative to the start of the header.
    size_t map_address = reinterpret_cast<size_t>(peb->ApiSetMap);
    const auto *header = static_cast<ApiSetHeader*>(peb->ApiSetMap);
 
    const auto *hash_entries = reinterpret_cast<ApiSetHashEntry*>(map_address + header->HashEntriesOffset);
    const auto *namespace_entries = reinterpret_cast<ApiSetNamespaceEntry*>(map_address + header->NamespaceEntriesOffset);
 
    for (size_t hash_entry_idx = 0; hash_entry_idx < header->Count; ++hash_entry_idx)
    {
        const auto *hash_entry = hash_entries + hash_entry_idx;
        const auto *namespace_entry = namespace_entries + hash_entry->IndexInEntriesNamespace;
 
        // SizeOfApiSetName is a size in bytes, hence the division by two for UTF-16.
        std::wstring_view input_wide_name { reinterpret_cast<wchar_t*>(map_address + namespace_entry->NameOffset), namespace_entry->SizeOfApiSetName / 2 };
        std::string input_name { input_wide_name.begin(), input_wide_name.end() };
 
        std::print("{}\n", input_name);
    }
 
    return 0;
}
Which results in:
api-ms-win-security-base-ansi-l1-1-0
ext-ms-win-kernel32-sidebyside-l1-1-0
api-ms-win-composition-redirection-l1-1-0
api-ms-win-ntuser-sysparams-l1-1-0
api-ms-win-core-commandlinetoargv-l1-1-0
ext-ms-win-core-resourcepolicyserver-l1-1-1
api-ms-win-net-isolation-l1-1-1
api-ms-win-core-psm-app-l1-1-0
api-ms-win-dx-d3dkmt-l1-1-7
ext-ms-win-rtcore-ntuser-keyboard-l1-1-0
ext-ms-win-core-win32k-common-inputrim-l1-1-0
ext-ms-win-ntos-werkernel-l1-1-1
api-ms-win-security-logon-l1-1-1
...
Great, it seems we're really close to actually resolving those libraries. Let's take a look at what information the ApiSetValueEntry provides. There are some Flags, whose purpose isn’t entirely clear to us, and two strings — once again represented as offset and length pairs.
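Because every string in the map is stored as an offset and length pair, a tiny reader function keeps the pointer arithmetic in one place. This is only a sketch: ReadMapString and ToNarrow are hypothetical helper names, char16_t stands in for the 16-bit wchar_t used on Windows, and the narrowing step assumes the names are plain ASCII.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: copies the UTF-16 string stored `offset` bytes past
// the start of the map. Lengths in the map are byte counts, so the number of
// characters is half of that.
std::u16string ReadMapString(const unsigned char* map_base,
                             std::uint32_t offset,
                             std::uint32_t length_in_bytes) {
    std::u16string text(length_in_bytes / sizeof(char16_t), u'\0');
    std::memcpy(text.data(), map_base + offset, text.size() * sizeof(char16_t));
    return text;
}

// Converts to a narrow string for printing; anything outside ASCII is
// replaced with '?' (assumption: module names in the map are ASCII).
std::string ToNarrow(const std::u16string& wide) {
    std::string narrow;
    narrow.reserve(wide.size());
    for (const char16_t c : wide)
        narrow.push_back(c < 0x80 ? static_cast<char>(c) : '?');
    return narrow;
}
```

For a value entry, that would be ReadMapString(base, entry.ValueOffset, entry.ValueLength) for the host and ReadMapString(base, entry.NameOffset, entry.NameLength) for the name. Copying the bytes rather than viewing them in place also sidesteps any alignment concerns.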
If we inspect the Name string in a debugger, we'll just see a bunch of empty strings — entirely useless.
But what about the Value string? Oh my days — that looks like exactly what we’re after!
Let's combine all of this:
#include <print>
#include <string>
#include <string_view>
 
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
 
// As before, ThreadEnvironmentBlock and the ApiSet* structures are assumed to
// be defined elsewhere.
int main()
{
    const auto *teb = reinterpret_cast<ThreadEnvironmentBlock*>(NtCurrentTeb());
    const auto *peb = teb->ProcessEnvironmentBlock;
 
    size_t map_address = reinterpret_cast<size_t>(peb->ApiSetMap);
    const auto *header = static_cast<ApiSetHeader*>(peb->ApiSetMap);
 
    const auto *hash_entries = reinterpret_cast<ApiSetHashEntry*>(map_address + header->HashEntriesOffset);
    const auto *namespace_entries = reinterpret_cast<ApiSetNamespaceEntry*>(map_address + header->NamespaceEntriesOffset);
 
    for (size_t hash_entry_idx = 0; hash_entry_idx < header->Count; ++hash_entry_idx)
    {
        const auto *hash_entry = hash_entries + hash_entry_idx;
        const auto *namespace_entry = namespace_entries + hash_entry->IndexInEntriesNamespace;
 
        // The hosts of this API set, HostsNumber entries in total.
        const auto *value_entries = reinterpret_cast<ApiSetValueEntry*>(map_address + namespace_entry->ValueEntriesArrayOffset);
 
        std::wstring_view input_wide_name { reinterpret_cast<wchar_t*>(map_address + namespace_entry->NameOffset), namespace_entry->SizeOfApiSetName / 2 };
        std::string input_name { input_wide_name.begin(), input_wide_name.end() };
 
        for (size_t value_entry_idx = 0; value_entry_idx < namespace_entry->HostsNumber; ++value_entry_idx)
        {
            const auto *value_entry = value_entries + value_entry_idx;
 
            std::wstring_view output_wide_name { reinterpret_cast<wchar_t*>(map_address + value_entry->ValueOffset), value_entry->ValueLength / 2 };
            std::string output_name { output_wide_name.begin(), output_wide_name.end() };
 
            // Print every mapping. Wrap this in `if (namespace_entry->HostsNumber > 1)`
            // to list only the API sets with several hosts, as in the earlier side note.
            std::print("{} -> {}\n", input_name, output_name);
        }
    }
 
    return 0;
}
And we get:
api-ms-win-core-xstate-l1-1-3 -> ntdll.dll
api-ms-win-core-xstate-l2-1-2 -> kernelbase.dll
ext-ms-win-ntuser-private-l1-1-1 -> user32.dll
ext-ms-win-ntuser-private-l1-2-0 -> user32.dll
ext-ms-win-ntuser-private-l1-3-3 -> user32.dll
ext-ms-win-ntuser-private-l1-4-0 -> user32.dll
ext-ms-win-ntuser-private-l1-5-0 -> user32.dll
ext-ms-win-ntuser-private-l1-6-1 -> user32.dll
api-ms-win-crt-runtime-l1-1-0 -> ucrtbase.dll
api-ms-win-core-shutdown-ansi-l1-1-0 -> advapi32.dll
...
This means we’re finally able to resolve API set-style libraries to their actual implementations — and that wraps up this blog post. Hope to see you again soon. For now, I’m taking off!