In game development, performance is a critical concern. To reach the largest possible user base, you must consider how the game runs on mid- to low-end devices and ensure compatibility with a wide range of hardware configurations. Analyzing and optimizing performance bottlenecks is therefore key.
Loading more resources into memory at runtime is essentially a space-for-time trade. Frequent disk I/O is slow; preloading resources into memory makes reads fast. Memory is limited, however, and cannot be used without restriction. This matters most on mid- to low-end mobile devices, where devices with 4GB of memory or less still hold a considerable market share. Memory usage must therefore be optimized: excessive consumption can cause the system to terminate the application.
Memory optimization is fundamentally about seeking a balance between loading efficiency and memory usage. The goal is to satisfy the need for compatibility with more low-end devices while also maximizing the use of available memory to improve program efficiency without triggering out-of-memory (OOM) situations.
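As a minimal sketch of this balance (not engine code), the following standalone C++ class keeps recently used resources in memory but never exceeds a fixed byte budget. The class name `BudgetedCache` and its methods are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical byte-budgeted LRU cache: trades memory for load time,
// but evicts old entries so the total never exceeds the budget.
class BudgetedCache {
public:
    explicit BudgetedCache(std::size_t budgetBytes) : budget_(budgetBytes) {}

    // Inserts or replaces a resource, then evicts least recently used
    // entries until the total fits the budget again.
    // Returns false if the resource alone is larger than the budget.
    bool Put(const std::string& name, std::string bytes) {
        if (bytes.size() > budget_) return false;
        Erase(name);
        used_ += bytes.size();
        lru_.emplace_front(name, std::move(bytes));
        index_[name] = lru_.begin();
        while (used_ > budget_) {
            assert(!lru_.empty());
            used_ -= lru_.back().second.size();
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        return true;
    }

    // Returns the cached bytes, or nullptr on a miss (the caller would
    // then fall back to reading from disk).
    const std::string* Get(const std::string& name) {
        auto it = index_.find(name);
        if (it == index_.end()) return nullptr;
        lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recent
        return &it->second->second;
    }

    std::size_t UsedBytes() const { return used_; }

private:
    using Entry = std::pair<std::string, std::string>;  // name, bytes

    void Erase(const std::string& name) {
        auto it = index_.find(name);
        if (it == index_.end()) return;
        used_ -= it->second->second.size();
        lru_.erase(it->second);
        index_.erase(it);
    }

    std::size_t budget_;
    std::size_t used_ = 0;
    std::list<Entry> lru_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

A larger budget means fewer misses and less disk I/O; a smaller one keeps the process further away from the OOM limit on low-end devices.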
I plan to write several articles on performance optimization, beginning with memory analysis in UE. This article introduces commonly used memory analysis tools and methods and collects memory optimization techniques that apply to UE projects. This material was previously recorded as notes at notes/ue, and future memory-related content will be added to this article.
Memory optimization primarily focuses on the following four aspects:
- Identifying memory leaks
- Trimming unnecessary modules
- Optimizing memory usage of existing modules
- Lossy optimization: reducing content (asset quality, etc.)
Three steps: Find bugs, squeeze out excess, cut down requirements.
Memory Analysis Tools
Before performing memory optimization, it’s essential to gain a general understanding of memory distribution in the UE project. You can use the memory analysis tools provided by UE as well as some native platform analysis tools.
Memory analysis resources:
Some commonly used console commands:
    stat memory  # Displays memory usage of various subsystems in the engine
Enable LLM (the Low-Level Memory Tracker) by adding the -LLM parameter at startup:
    -LLM  # Enable LLM
Then you can use the following console commands at runtime:
    stat llm  # Displays LLM summary. All lower-level engine statistics are aggregated into a single engine statistic.
Memory analysis can also utilize the following tools:
- memreport
- MemoryProfiler
- Heapprofd (Android)
- Instruments (iOS)
Entering memreport (optionally with -full) in the game console creates a directory and a .memreport file under Saved/Profiling/MemReports. The file can be opened with a text editor to inspect memory usage in various parts of the game.
Specific usage of memory analysis tools and the analysis process of memory allocation in the UE engine will be detailed later when time allows. For information about LLM, please refer to the UE documentation.
Memory Optimization Strategies
The optimization methods listed below are optional; it is not necessary to implement all of them for the best results. Since memory optimization must balance efficiency, features can be selectively optimized based on project needs across different devices, ensuring compatibility while minimizing resource consumption on low-end devices.
Here are the main areas in UE that can be optimized, along with suggestions on how to do so. Specific optimization data will be analyzed and supplemented as time permits.
Disable Unnecessary Feature Support
Engine module support can be trimmed based on needs:
- APEX: If the Nvidia APEX destruction system is not in use, APEX support can be removed during engine compilation. Set bCompileAPEX=false in BuildSetting or TargetRules.
- Recast (NavMesh): If the client does not need Recast at runtime and performs no local NavMesh navigation, NavMesh support can be trimmed by setting bCompileRecast=false in BuildSetting or TargetRules.
- FreeType: If FreeType font support is not needed, set bCompileFreeType=false in BuildSetting or TargetRules.
- ICU (unicode/i18n): The engine's Core module supports unicode/i18n through ICU. If it is not needed, set bCompileICU=false in BuildSetting or TargetRules.
- CompileForSize: A UE optimization option that strictly controls size during compilation at the cost of performance. Set bCompileForSize=true in BuildSetting or TargetRules to enable it.
- CEF3: Optional support for the Chromium Embedded Framework, Google's embedded browser. If it is not needed, set bCompileCEF3=false in BuildSetting or TargetRules.
- bUsesSteam: Whether to use Steam. This can be disabled for mobile games through bUsesSteam in TargetRules.
- SpeedTree: If SpeedTree vegetation modeling is not required, SpeedTree compilation can be disabled through bOverrideCompileSpeedTree in TargetRules.
- Audio module: If the project uses Wwise or another library for audio playback, the built-in Audio module is redundant and can be trimmed.
- Internationalization module: If the game’s multilingual support does not rely on UE’s text collection and translation features, this module can be trimmed.
This can reduce the size of the statically compiled program and eliminate unnecessary execution logic.
Control AssetRegistry Serialization
AssetRegistry is mainly used in the Editor to facilitate resource lookup and filtering operations, primarily used by ContentBrowser, as described in the UE documentation: Asset Registry.
A project may not need it at runtime, but as soon as the AssetRegistry module starts, AssetRegistry.bin is loaded into memory. If the data is never used, that memory is wasted.
Fortunately, UE provides a way to skip or only partially serialize AssetRegistry data. The constructor of UAssetRegistryImpl calls InitializeSerializationOptionsFromIni to read configuration from DefaultEngine.ini and builds an FAssetRegistrySerializationOptions structure to store it. That structure is then used in the subsequent Serialize function to control which data is serialized into the AssetRegistry.
    // Code omitted for brevity
This mechanism determines whether AssetRegistry.bin is generated at packaging time and which AssetRegistry data is deserialized at runtime. It does not affect DevelopmentAssetRegistry.bin, which can be used for asset auditing.
Its deserialization process is as follows:
- Check bSerializeAssetRegistry; if true, load AssetRegistry.bin into memory in binary form.
- Use the Serialize function to deserialize the binary data.
- Release the memory occupied by the loaded AssetRegistry.bin.
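The flow above can be modeled in standalone C++ as a rough sketch. It is not the engine's implementation: `SerializationOptions`, `RegistryData`, `SaveRegistry`, `LoadRegistry`, and the text format are all hypothetical, chosen only to show how option flags decide what ends up in the blob and therefore in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for FAssetRegistrySerializationOptions; the field
// names are illustrative and are not the engine's.
struct SerializationOptions {
    bool serializeRegistry;      // master switch: produce a registry at all?
    bool serializeDependencies;  // include the dependency list?
};

struct RegistryData {
    std::vector<std::string> assets;
    std::vector<std::string> dependencies;
};

// Writes a section as its entry count followed by one entry per line.
inline void WriteSection(std::ostream& out, const std::vector<std::string>& items) {
    out << items.size() << '\n';
    for (const std::string& item : items) out << item << '\n';
}

inline std::vector<std::string> ReadSection(std::istream& in) {
    std::size_t count = 0;
    in >> count;
    in.ignore();  // skip the newline that follows the count
    std::vector<std::string> items(count);
    for (std::string& item : items) std::getline(in, item);
    return items;
}

// Packaging time: only the sections enabled in `options` reach the blob.
inline std::string SaveRegistry(const RegistryData& data,
                                const SerializationOptions& options) {
    if (!options.serializeRegistry) return std::string();
    static const std::vector<std::string> kNone;
    std::ostringstream out;
    WriteSection(out, data.assets);
    WriteSection(out, options.serializeDependencies ? data.dependencies : kNone);
    return out.str();
}

// Runtime: `blob` is the loaded binary (step 1), it is deserialized
// (step 2), and the by-value buffer is freed on return (step 3).
inline RegistryData LoadRegistry(std::string blob) {
    RegistryData data;
    if (blob.empty()) return data;
    std::istringstream in(blob);
    data.assets = ReadSection(in);
    data.dependencies = ReadSection(in);
    assert(!in.fail());
    return data;
}
```

With dependencies disabled, the blob is smaller and the loaded `RegistryData` holds no dependency entries, which is the memory saving the options are meant to provide.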
Thus, AssetRegistry's memory consumption depends on the data serialized, while FAssetRegistrySerializationOptions controls what data is serialized.
    // Code omitted for brevity
Creating an AssetRegistry section in Config/DefaultEngine.ini using the above names can control the serialization of AssetRegistry, reducing package size and memory consumption (since AssetRegistry is loaded into memory when the engine starts).
    [AssetRegistry]
You can also specify it for a specific platform by modifying the platform-related INI files:
    Config/Windows/WindowsEngine.ini
Load Only the Shader Quality Levels in Use
By default, the engine loads shaders for all quality levels into memory. When runtime quality switching is unnecessary, the unused quality levels can be discarded, reducing shader memory consumption.
In Project Settings - Engine - Rendering - Materials, enable Game Discards Unused Material Quality Levels.
Alternatively, add the following configuration in DefaultEngine.ini:
    [/Script/Engine.RendererSettings]
This option controls whether, when running in game mode, shaders for all quality levels are kept in memory or only those needed for the current quality level:
- Unchecked: Keep all quality levels in memory allowing a runtime quality level change. (default)
- Checked: Discard unused quality levels when loading content for the game, saving some memory.
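The effect of the checked state can be pictured with a small standalone model. This is only a sketch under assumed names (`QualityLevel`, `MaterialShaders`, `DiscardUnused`); the engine's actual data structures are different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

enum class QualityLevel { Low, Medium, High };

// Hypothetical per-material store: one compiled shader blob per quality level.
struct MaterialShaders {
    std::map<QualityLevel, std::vector<std::uint8_t>> blobs;

    std::size_t ResidentBytes() const {
        std::size_t total = 0;
        for (const auto& kv : blobs) total += kv.second.size();
        return total;
    }

    // Drops every level except `current` and returns the bytes freed.
    // After this call, switching quality at runtime would need a reload.
    std::size_t DiscardUnused(QualityLevel current) {
        std::size_t freed = 0;
        for (auto it = blobs.begin(); it != blobs.end();) {
            if (it->first == current) {
                ++it;
                continue;
            }
            freed += it->second.size();
            it = blobs.erase(it);
        }
        assert(blobs.size() <= 1);  // at most the current level remains
        return freed;
    }
};
```

The saving grows with the number of quality levels a material was compiled for, at the price of losing the runtime switch described for the unchecked state.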
Reduce Shader Variants
You can reduce the number of material instances and enable the following options in Project Settings - Engine - Rendering:
Share Material Shader Code
When packaging, you can reduce package size and memory usage by enabling Share Material Shader Code and Shared Material Native Libraries in Project Settings - Packaging (this may increase loading time).
    /**
Upon enabling this, the packaged output will include the following files:
    ShaderArchive-Blank425-PCD3D_SM5.ushaderbytecode
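Conceptually, a shared shader archive stores each distinct piece of shader bytecode once, and materials refer to it instead of carrying their own inline copies. The following standalone C++ sketch illustrates that idea only; `SharedShaderLibrary` and its methods are hypothetical, and it keys on the bytes themselves where a real implementation would more likely key on a content hash.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical shared shader library: each distinct bytecode blob is stored
// once, and materials hold indices into it instead of inline copies.
class SharedShaderLibrary {
public:
    // Returns the index of `bytecode`, adding it only if it is new.
    std::size_t Intern(const std::string& bytecode) {
        auto it = indexByCode_.find(bytecode);
        if (it != indexByCode_.end()) return it->second;
        const std::size_t index = blobs_.size();
        blobs_.push_back(bytecode);
        indexByCode_.emplace(bytecode, index);
        return index;
    }

    const std::string& Get(std::size_t index) const {
        assert(index < blobs_.size());
        return blobs_[index];
    }

    // Total size of the unique blobs held by the library.
    std::size_t StoredBytes() const {
        std::size_t total = 0;
        for (const std::string& blob : blobs_) total += blob.size();
        return total;
    }

private:
    std::vector<std::string> blobs_;
    // Sketch only: keyed by the bytes themselves for simplicity.
    std::map<std::string, std::size_t> indexByCode_;
};
```

Two materials that compile to the same vertex shader then share one stored blob, which is where the size reduction comes from; the extra lookup is one plausible reason loading can become slower.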
However, if the shader code changes in a later cook while the base package still holds the old shader bytecode, materials can go missing.
There are three approaches:
- During later packaging, include the Shader bytecode files in the pak and load on mount;
- Package shader bytecode with resources during hot updating;
- Create a Shader Patch and load it after updates.
For detailed processes on updating shader bytecode, refer to my previous article UE4 Hot Update: Create Shader Patch.
Disable UMG Template Creation
The engine has a feature that caches Blueprint widgets as templates to speed up their creation, but it can waste memory. It can be disabled through configuration:
Alternatively, modify the engine code to provide default values using class-level initialization:
    UPROPERTY(EditAnywhere, AdvancedDisplay, Category=WidgetBlueprintOptions, AssetRegistrySearchable)
This variable is checked and utilized in the following code:
    // Code omitted for brevity
Disable PakCache
By default the engine enables the PakCache mechanism, which uses additional memory to cache data read from Pak files. The memory usage can be significant (viewed via stat memory):
At startup, the game logs PakCache details:
    [2021.03.23-10.49.21:354][445]LogPakFile: Precache HighWater 16MB
You can disable it with the following configuration:
    [ConsoleVariables]
Disabling PakCache may lead to more frequent disk I/O; the specific performance impact requires further analysis when time permits.
Unload PakEntry Filenames
Starting from UE4.23, the engine offers memory optimization configurations for mounted PakFiles:
    [Pak]
When FPakPlatformFile executes Initialize, it binds a handler to FCoreDelegates::OnOptimizeMemoryUsageForMountedPaks. Invoking this delegate notifies FPakPlatformFile to optimize the memory usage of mounted Paks.
    // Code omitted for brevity
- UnloadPakEntryFilenamesIfPossible: Allows unloading memory occupied by PakEntry filenames.
- DirectoryRootsToKeepInMemoryWhenUnloadingPakEntryFilenames: Directories to keep in memory when unloading PakEntry filenames.
- bShrinkPakEntriesMemoryUsage: Shrinks memory usage of PakEntry.
After the delegate is invoked, if UnloadPakEntryFilenamesIfPossible is enabled, the filename list in the Pak is replaced by hashes. This saves memory, but wildcard matching on paths is no longer available.
    // Code omitted for brevity
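The trade-off can be shown with a simplified standalone model. It is a sketch, not the engine's code: `PakIndex`, `PakEntry`, `HashPath`, and the single `keepRoot` parameter are hypothetical, FNV-1a stands in for whatever hash the engine uses, and hash collisions are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

struct PakEntry {
    std::uint64_t offset;
    std::uint64_t size;
};

// FNV-1a, used here only as a simple stand-in for the engine's path hash.
inline std::uint64_t HashPath(const std::string& path) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : path) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

class PakIndex {
public:
    void Add(const std::string& path, PakEntry entry) { byName_[path] = entry; }

    // Drops filename strings and keeps only hash -> entry. Paths under
    // `keepRoot` stay in byName_ so directory scans there keep working.
    void UnloadFilenames(const std::string& keepRoot) {
        for (auto it = byName_.begin(); it != byName_.end();) {
            const bool keep = !keepRoot.empty() &&
                              it->first.compare(0, keepRoot.size(), keepRoot) == 0;
            if (keep) {
                ++it;
                continue;
            }
            byHash_[HashPath(it->first)] = it->second;
            it = byName_.erase(it);
        }
    }

    // Exact-path lookup works both before and after unloading.
    const PakEntry* Find(const std::string& path) const {
        auto named = byName_.find(path);
        if (named != byName_.end()) return &named->second;
        auto hashed = byHash_.find(HashPath(path));
        return hashed != byHash_.end() ? &hashed->second : nullptr;
    }

    // Prefix scan: only possible for paths whose names are still in memory.
    std::vector<std::string> ListUnder(const std::string& dir) const {
        std::vector<std::string> out;
        for (auto it = byName_.lower_bound(dir);
             it != byName_.end() && it->first.compare(0, dir.size(), dir) == 0; ++it) {
            out.push_back(it->first);
        }
        return out;
    }

private:
    std::map<std::string, PakEntry> byName_;
    std::unordered_map<std::uint64_t, PakEntry> byHash_;
};
```

After `UnloadFilenames`, an exact path still resolves through its hash, but listing a directory returns nothing unless that directory was kept, which mirrors the role of the directory roots option.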
Compress Textures
Texture compression is lossy and can reduce both package size and memory size upon loading. Even with lossy compression, the quality reduction on mobile is often not noticeable and can be set based on project needs.
As mentioned in previous notes, you can adjust the default resource quality and size level in Project Settings - Cooker - Texture - ASTC Compression vs Size:
    0=12x12
In the Texture resource editor, you can also set this for individual Textures:
Lowest->Highest corresponds to values 0-4; Default applies the project settings.
Also, the type set in Compression Settings affects the compression type of the resource. Default uses the project settings; if set to the NormalMap type, it will be ASTC_4x4.
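To see why the block size matters, recall that every ASTC block occupies 128 bits regardless of how many texels it covers, so larger blocks mean fewer bytes per texel. The helper below is a standalone arithmetic sketch; the function names are made up for illustration and only a single mip level is computed.

```cpp
#include <cassert>
#include <cstddef>

// Every ASTC block is 128 bits (16 bytes), whatever its texel footprint.
const std::size_t kAstcBlockBytes = 16;

// Bytes needed for one mip level of a width x height texture compressed
// with blockW x blockH ASTC blocks. Partial blocks at the edges are padded
// up to a whole block.
inline std::size_t AstcLevelBytes(std::size_t width, std::size_t height,
                                  std::size_t blockW, std::size_t blockH) {
    assert(blockW > 0 && blockH > 0);
    const std::size_t blocksX = (width + blockW - 1) / blockW;
    const std::size_t blocksY = (height + blockH - 1) / blockH;
    return blocksX * blocksY * kAstcBlockBytes;
}

// Uncompressed RGBA8 reference: 4 bytes per texel.
inline std::size_t Rgba8LevelBytes(std::size_t width, std::size_t height) {
    return width * height * 4;
}
```

For the top mip of a 1024x1024 texture this gives 4,194,304 bytes uncompressed, 1,048,576 bytes at 4x4, 262,144 bytes at 8x8, and 118,336 bytes at 12x12, which is why moving a texture toward larger blocks saves memory at the cost of quality.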