In this second and final part of the series, we will go through the exact flow CreateProcess carries out to launch a process on Windows, using the APIs and data structures we discussed in Part 1.
The creation of a Windows process (subsystem-specific) consists of several stages carried out in three parts of the operating system: the Windows client-side library (Kernel32.dll), the Windows executive, and the Windows subsystem process.
The operations performed in each part are described in the diagram shown below:
CreateProcessInternalW performs the following steps:
The priority class for the new process is specified as independent bits in the CreationFlags parameter to the CreateProcess* functions. The classes, with their base priority levels, are:
* Idle/Low (4)
* Below Normal (6)
* Normal (8)
* Above Normal (10)
* High (13)
* Real-Time (24)
If no priority class is specified, it defaults to Normal.
If the Real-Time priority class is requested but the caller does not hold the Increase Scheduling Priority privilege (SE_INC_BASE_PRIORITY_NAME), process creation does not fail; the High priority class is used instead.
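As a rough illustration of the priority-class rules above, the following Python sketch resolves a priority class from a CreationFlags value. The flag constants are the documented Win32 values. The function name `resolve_priority_class`, the `has_inc_base_priority` parameter, and the choice of the lowest class when several bits are set are assumptions made for this example; this is a model of the behaviour, not code taken from Kernel32.

```python
# Documented Win32 priority-class bits for the CreationFlags parameter.
IDLE_PRIORITY_CLASS = 0x00000040
BELOW_NORMAL_PRIORITY_CLASS = 0x00004000
NORMAL_PRIORITY_CLASS = 0x00000020
ABOVE_NORMAL_PRIORITY_CLASS = 0x00008000
HIGH_PRIORITY_CLASS = 0x00000080
REALTIME_PRIORITY_CLASS = 0x00000100

# (flag bit, class name, base priority level), ordered from lowest to highest.
PRIORITY_CLASSES = [
    (IDLE_PRIORITY_CLASS, "Idle", 4),
    (BELOW_NORMAL_PRIORITY_CLASS, "Below Normal", 6),
    (NORMAL_PRIORITY_CLASS, "Normal", 8),
    (ABOVE_NORMAL_PRIORITY_CLASS, "Above Normal", 10),
    (HIGH_PRIORITY_CLASS, "High", 13),
    (REALTIME_PRIORITY_CLASS, "Real-Time", 24),
]


def resolve_priority_class(creation_flags, has_inc_base_priority=False):
    """Return (class name, base priority) for a CreationFlags value.

    has_inc_base_priority is a hypothetical stand-in for "the caller holds
    the Increase Scheduling Priority privilege".
    """
    requested = [
        (name, level)
        for bit, name, level in PRIORITY_CLASSES
        if creation_flags & bit
    ]
    if not requested:
        # No priority-class bit set: default to Normal.
        return ("Normal", 8)
    # Assumption for this sketch: if several bits are set, the lowest wins.
    name, level = requested[0]
    if name == "Real-Time" and not has_inc_base_priority:
        # Creation does not fail; the request is downgraded to High.
        return ("High", 13)
    return (name, level)
```

For example, `resolve_priority_class(REALTIME_PRIORITY_CLASS)` yields the High class unless `has_inc_base_priority=True` is passed.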
If a debug flag is specified, Kernel32 initiates a connection to the native debugging code in Ntdll.dll by calling DbgUiConnectToDbg and gets a handle to the debug object from the current TEB.
Multiple user-specified attributes are supported. The attribute list passed on CreateProcess* calls also permits passing information beyond a simple status code back to the caller, such as the TEB address of the initial thread. (This is important for protected processes, which cannot be queried for such information after creation.)
If the process is part of a job object, but the creation flags requested a separate virtual DOS machine (VDM), the flag is ignored.
CreateProcessInternalW checks whether the process should be created as modern (with the attributes PROC_THREAD_ATTRIBUTE_PACKAGE_FULL_NAME and PROC_THREAD_ATTRIBUTE_PARENT_PROCESS). If so, a call is made to the internal BaseAppXExtension function to gather more contextual information on the modern app parameters, described by a structure called APPX_PROCESS_CONTEXT.
If the process is to be created as modern, the security capabilities (PROC_THREAD_ATTRIBUTE_SECURITY_CAPABILITIES) are recorded for the initial token creation by calling the internal BasepCreateLowBox function.
If a modern process is created, a flag is set to tell the kernel to skip embedded manifest detection. Detection is not needed for a modern process, which has an app manifest of its own.
If the debug flag has been specified, the Debugger value under the Image File Execution Options registry key for the executable is marked to be skipped.
If no desktop is specified in the STARTUPINFO structure, the process is associated with the caller's current desktop.
The application and command-line arguments passed to the API are analyzed and converted to the internal NT name if required.
Most of the processed information is converted to a single large structure of type RTL_USER_PROCESS_PARAMETERS.
After the previous steps complete, NtCreateUserProcess is called to attempt the creation of the process.
The NtCreateUserProcess system call then continues in kernel mode:
NtCreateUserProcess first validates arguments and builds an internal structure to hold all creation information, for validation and security purposes.
NtCreateUserProcess tries to open the file and create a section object for it. The object isn't mapped into memory yet, but it is opened.
NtCreateUserProcess looks in the registry under HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options to see whether a subkey with the file name and extension of the executable image exists there. If it does, PspAllocateProcess looks for a value named Debugger for that key.
CreateProcessInternalW tries to find a support image to run it.
The next step is to create a Windows executive process object to run the image by calling the internal system function PspAllocateProcess.
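The Image File Execution Options (IFEO) lookup described above can be modelled in a few lines of Python. This is only a sketch: the helper names `ifeo_subkey` and `find_debugger` are made up for illustration, and a plain dictionary stands in for the registry so the example runs on any platform.

```python
import ntpath

# Key path from the text, relative to HKLM.
IFEO_ROOT = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options"


def ifeo_subkey(image_path):
    """Build the IFEO subkey path for an image: file name plus extension."""
    return IFEO_ROOT + "\\" + ntpath.basename(image_path)


def find_debugger(image_path, registry):
    """Return the Debugger value for the image, or None if there is none.

    registry is a hypothetical stand-in for the real registry: a dict that
    maps a key path to a dict of value names and their data.
    """
    # Registry key names are case-insensitive, so compare in lower case.
    normalized = {path.lower(): values for path, values in registry.items()}
    values = normalized.get(ifeo_subkey(image_path).lower())
    if values is None:
        return None  # no subkey for this executable
    return values.get("Debugger")
```

On a real Windows system, the dictionary could be replaced with reads through the standard library `winreg` module; that variant is left out here because it only runs on Windows.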
InheritedFromUniqueProcessId field in the new process object.
IFEO key to check whether the process should be mapped with large pages (exception: WoW64 processes [WoW64 auxiliary structure: EWOW64PROCESS]). The performance options key in IFEO, PerfOptions, is also queried for IoPriority, PagePriority, CpuPriorityClass, and WorkingSetLimitInKB.
SeAssignPrimaryToken privilege)
CreateProcessAsUser, this step won't occur. Instead, the default quota is created, or a quota matching the user's profile is selected.
PspMinimumWorkingSet and PspMaximumWorkingSet respectively.
IFEO key are read and set.
The initial process address space consists of the page directory, the hyperspace page, the VAD bitmap page, and the working set list.
MmTotalCommittedPages and added to MmProcessCommit.
MmResidentAvailablePages.
The next stage of PspAllocateProcess is the initialization of the KPROCESS structure. This task is executed by KeInitializeProcess, which does the following:
HvlCreateSecureProcess.
Handled mostly through MmInitializeProcessAddressSpace (supports process cloning: yeah, you little forkers).
PEB is created and initialized. Ntdll.dll is mapped into the process (for WoW64 processes, the 32-bit Ntdll.dll is mapped as well). A new session, if requested, is now created for the process. The standard handles are duplicated and the new values are written in the process parameters structure. Any memory reservations listed in the attribute list are now processed. Additionally, two flags allow the bulk reservation of the first 1 or 16 MB of the address space.
MinWin API redirection set is mapped into the process and its pointer is stored in the PEB.
PspCidTable that is not associated with any process.
NtCreateUserProcess calls MmCreatePEB, which first maps the system-wide National Language Support (NLS) tables into the process's address space.
MiCreatePebOrTeb to allocate a page for the PEB and then initializes a number of fields such as MmHeap* values, MmCriticalSectionTimeout, and MmMinimumStackCommitInBytes.
IMAGE_FILE_UP_SYSTEM_ONLY flag is set, a single CPU is chosen for all the threads in this new process to run on. The selection is performed by simply cycling through the available processors.
Before the handle to the new process can be returned, a few final steps are performed by PspInsertProcess and its helper functions:
PsActiveProcessHead which makes it accessible via functions like EnumProcesses and OpenProcess.
NoDebugInherit flag is set.
PspInsertProcess creates a handle for the new process by calling ObOpenObjectByPointer, and then returns this handle to the caller.
The PspCreateThread routine is responsible for all aspects of thread creation and is called by NtCreateThread when a new thread is being created. It relies on two helper routines: PspAllocateThread handles the actual creation and initialization of the executive thread object, and PspInsertThread handles the creation of the thread handle and security attributes and the call to KeStartThread to turn the executive object into a schedulable thread on the system. The following two charts explain the flow of PspAllocateThread and PspInsertThread.
Once NtCreateUserProcess returns with a success code, CreateProcessInternalW performs Windows subsystem-specific operations to finish initializing the process. If software restriction policies dictate, a restricted token is created for the new process. Afterward, the application-compatibility database is queried to see whether an entry exists in either the registry or system application database for the process.
CreateProcessInternalW acquires SxS information such as manifest files and DLL redirection paths, and other flags. Based on the information collected, a message is constructed and sent to the Windows subsystem process, Csrss, containing the following information:
The Windows subsystem then performs the following steps:
CsrCreateProcess duplicates a handle for the process and thread. The CSR_PROCESS structure is allocated.
CSR_PROCESS.
CSR_THREAD is allocated and initialized. CsrCreateThread inserts the thread in the list of threads for the process.
0x280, the default process shutdown level. The new Csrss process structure is inserted into the list of Windows subsystem-wide processes.
At this point, the process environment has been determined, resources for its threads to use have been allocated, the process has a thread, and the Windows subsystem knows about the new process. Unless the caller specified the CREATE_SUSPENDED flag, the initial thread is now resumed so that it can start running and perform the remainder of the process-initialization work that occurs in the context of the new process.
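The resume decision at the end of this stage can be pictured with a small Python model. It assumes the initial thread starts out with a suspend count of 1 and becomes runnable once that count drops to 0. `CREATE_SUSPENDED` is the documented Win32 flag value, while `InitialThread` and `finish_process_creation` are hypothetical names used only for this sketch.

```python
CREATE_SUSPENDED = 0x00000004  # documented Win32 process creation flag


class InitialThread:
    """Toy model of the first thread of a new process."""

    def __init__(self):
        # Assumption: the initial thread is created in the suspended state.
        self.suspend_count = 1

    def resume(self):
        """Decrement the suspend count and return its previous value."""
        previous = self.suspend_count
        if self.suspend_count > 0:
            self.suspend_count -= 1
        return previous

    @property
    def runnable(self):
        return self.suspend_count == 0


def finish_process_creation(creation_flags):
    """Resume the initial thread unless the caller asked for CREATE_SUSPENDED."""
    thread = InitialThread()
    if not creation_flags & CREATE_SUSPENDED:
        thread.resume()
    return thread
```

With `CREATE_SUSPENDED` set, the returned thread stays suspended until the caller resumes it, which is what lets a parent adjust a child process before any of its code runs.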
The new thread begins life running the kernel-mode thread startup routine KiStartUserThread. KiStartUserThread lowers the thread's interrupt request level (IRQL) from deferred procedure call (DPC) level to asynchronous procedure call (APC) level and then calls the system initial thread routine, PspUserThreadStartup. The user-specified thread start address is passed as a parameter to this routine. PspUserThreadStartup performs the following actions:
PASSIVE_LEVEL (0). It disables the ability to swap the primary access token at runtime.
TEB. It calls DbgkCreateThread, which checks whether image notifications were sent for the new process. If they weren't, and notifications are enabled, an image notification is sent first for the process and then for the image load of Ntdll.dll. (This happens here rather than when the images were first mapped because no process ID, which is required for kernel callouts, had been allocated at that time.)
SharedUserData structure has been set up. If it hasn't, the cookie is generated from a hash of system information such as the number of interrupts processed, DPC deliveries, page faults, interrupt time, and a random number. This system-wide cookie is used in the internal decoding and encoding of pointers, such as in the heap manager, to protect against certain classes of exploitation.
HvlStartSecureThread that transfers control to the secure kernel to start thread execution. This function only returns when the thread exits.
LdrInitializeThunk in Ntdll.dll, as well as the system-wide thread startup stub RtlUserThreadStart in Ntdll.dll. The LdrInitializeThunk routine initializes the loader, the heap manager, NLS tables, thread-local storage (TLS) and fiber-local storage (FLS) arrays, and critical section structures. It then loads any required DLLs and calls the DLL entry points with the DLL_PROCESS_ATTACH function code.
Once the above function returns, NtContinue restores the new user context and returns to user mode. Thread execution now finally starts. RtlUserThreadStart uses the address of the actual image entry point and the start parameter and calls the application's entry point. These two parameters have also already been pushed onto the stack by the kernel.
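To make the role of a startup stub such as RtlUserThreadStart more concrete, here is a conceptual Python model, not the real Ntdll.dll code. The stub receives the entry point and the start parameter, calls the entry point, hands any unhandled exception to a filter, and ends the thread with the resulting exit code. All names in the sketch (`user_thread_start`, `unhandled_exception_filter`, `exit_thread`) are hypothetical.

```python
def user_thread_start(entry_point, parameter, unhandled_exception_filter, exit_thread):
    """Toy model of a thread startup stub that wraps the real entry point.

    unhandled_exception_filter and exit_thread are caller-supplied callables
    standing in for the unhandled exception filter and for thread exit.
    """
    try:
        # Call the application's entry point with its start parameter.
        exit_code = entry_point(parameter)
    except Exception as exc:
        # The entry point crashed without handling the exception, so the
        # stub passes it to the filter, which decides the exit code.
        exit_code = unhandled_exception_filter(exc)
    # Coordinate thread exit when the start routine returns. A real exit
    # routine would not return; this model simply records the exit code.
    exit_thread(exit_code)
    return exit_code
```

A quick way to try it is to pass `print` as `exit_thread` and a lambda that returns an error code as the filter.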
This complicated series of events has two purposes:
Ntdll.dll to set up the process internally and behind the scenes so that other user-mode code can run properly.
Ntdll.dll is aware of that and can call the unhandled exception filter inside Kernel32.dll. It is also able to coordinate thread exit on return from the thread's start routine and to perform various cleanup work.
This concludes the creation and startup of a Windows process.