In game development, performance deserves close attention. To reach the largest possible user base, a game must run acceptably on mid- to low-end devices and stay compatible with a wide range of hardware, so analyzing and optimizing performance bottlenecks is key.
Loading more resources into memory at runtime essentially trades space for time. Frequent disk I/O is slow, and preloading resources into memory allows fast retrieval. However, memory is limited, especially on mid- to low-end mobile devices: devices with 4GB of memory or less still hold a significant market share. It is therefore important not to waste memory, since excessive usage may cause the system to terminate the process.
Memory optimization fundamentally seeks a balance between loading efficiency and memory consumption: use as much of the available memory as is useful, while ensuring that low-end devices run normally without triggering an out-of-memory (OOM) termination.
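The trade-off can be made concrete with a small standalone C++ sketch (not engine code): a cache that keeps preloaded resources resident only up to a fixed byte budget and evicts the least recently used entries beyond it. The class name `ResourceCache`, its methods, and the LRU policy are assumptions chosen for illustration; nothing here is a UE API.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical budgeted cache: resources stay resident until the byte budget
// is exceeded, then the least recently used entries are evicted.
class ResourceCache {
public:
    explicit ResourceCache(std::size_t budgetBytes) : budget_(budgetBytes) {
        assert(budgetBytes > 0);
    }

    // Inserts (or replaces) a resource and evicts old entries if needed.
    // Returns false when the resource alone is larger than the whole budget.
    bool Put(const std::string& name, std::size_t sizeBytes) {
        if (sizeBytes > budget_) return false;
        Remove(name);
        lru_.push_front(Entry(name, sizeBytes));
        index_[name] = lru_.begin();
        used_ += sizeBytes;
        while (used_ > budget_) {
            const Entry& victim = lru_.back();  // least recently used entry
            used_ -= victim.second;
            index_.erase(victim.first);
            lru_.pop_back();
        }
        return true;
    }

    // Returns true on a hit and marks the entry as recently used.
    // On a miss the caller would fall back to disk I/O.
    bool Touch(const std::string& name) {
        auto it = index_.find(name);
        if (it == index_.end()) return false;
        lru_.splice(lru_.begin(), lru_, it->second);
        return true;
    }

    std::size_t UsedBytes() const { return used_; }

private:
    using Entry = std::pair<std::string, std::size_t>;

    void Remove(const std::string& name) {
        auto it = index_.find(name);
        if (it == index_.end()) return;
        used_ -= it->second->second;
        lru_.erase(it->second);
        index_.erase(it);
    }

    std::size_t budget_;
    std::size_t used_ = 0;
    std::list<Entry> lru_;
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

With a larger budget more lookups hit memory and fewer go to disk; with a smaller budget the process stays further away from the OOM threshold. Picking that budget per device tier is the balance described above.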
I plan to write a few articles related to performance optimization. This article will start with UE memory analysis, introducing commonly used memory analysis tools and methods, as well as organizing the memory optimization techniques that can be applied in UE projects. This part was previously recorded in the form of notes at notes/ue, and subsequent memory-related content will be added to this article.
Memory optimization primarily focuses on the following four aspects:
- Identifying memory leaks
- Trimming unnecessary modules
- Optimizing the memory usage of existing modules
- Lossy optimization: cutting content (material quality, etc.)
A three-step approach: debug, squeeze out excess, cut down requirements.
Memory Analysis Tools
Before performing memory optimization, it’s essential to gain a rough understanding of the memory distribution of the UE project, which can be done using the memory analysis tools provided by UE and some native platform analysis tools.
Some common console commands:
    stat memory # Displays memory usage of various subsystems in the engine
You can also enable LLM by adding the `-LLM` parameter at startup.
    -LLM # Enable LLM
Then you can use the following console commands during runtime:
    stat llm # Displays LLM summary. All lower-level engine statistics are consolidated into a single engine statistic.
Memory analysis can also use the following tools:
- memreport
- MemoryProfiler
- heapprofd (Android)
- Instruments (iOS)
Entering `memreport` (optionally with `-full`) in the game console creates a directory and a `.memreport` file under `Saved/Profiling/Memreports`. The file can be opened with a text editor to view the memory usage of different parts of the game.
I will provide more details on the usage of specific memory analysis tools and the analysis process for memory allocation in the UE engine, as time allows. For more detailed information about LLM, please refer to the UE documentation.
Memory Optimization Solutions
The optimization methods listed here are optional. There is no need to apply all of them, because memory optimization has to be balanced against runtime efficiency. Choose among them per project and per device tier, while keeping features consistent on low-end machines.
Here, I will mainly list which parts of UE can be optimized and how to do so. Specific optimization data will be analyzed and supplemented over time.
Disable Unnecessary Feature Support
Depending on project requirements, support for the following engine modules can be trimmed:
- APEX: If the Nvidia APEX destruction system is not used, APEX support can be removed when compiling the engine by setting `bCompileAPEX=false` in BuildSetting or TargetRules.
- Recast (NavMesh): If the client does not need Recast at runtime and does not require local NavMesh navigation, NavMesh support can be trimmed by setting `bCompileRecast=false` in BuildSetting or TargetRules.
- FreeType: If FreeType font support is not needed, it can be disabled by setting `bCompileFreeType=false` in BuildSetting or TargetRules.
- ICU (unicode/i18n): The engine’s Core module supports unicode/i18n through ICU, which can be disabled via `bCompileICU=false` in BuildSetting or TargetRules.
- CompileForSize: UE provides an option that optimizes strictly for size at compile time but sacrifices performance. It is controlled by `bCompileForSize` in BuildSetting or TargetRules: when `true`, Android builds compile with `-Oz`; when `false`, they compile with `-O3`.
- CEF3: Optional support for the Chromium Embedded Framework, Google’s embedded browser. It can be disabled by setting `bCompileCEF3=false` in BuildSetting or TargetRules.
- bUsesSteam: Whether to use Steam. It can be disabled for mobile games through `bUsesSteam` in TargetRules.
- SpeedTree: If the game does not use SpeedTree for vegetation modeling, compiling SpeedTree can be disabled via `bOverrideCompileSpeedTree` in TargetRules.
- Audio Module: If the project uses Wwise or a similar audio playback interface and does not need the engine’s built-in Audio module, that functionality is redundant and can be trimmed.
- Internationalization Module: If the game’s multi-language support does not rely on UE’s text collection and translation functionality, this module can be removed.
Trimming these modules reduces the size of the compiled executable and removes unnecessary execution logic.
Control AssetRegistry Serialization
The AssetRegistry is mainly used in the Editor for resource searching and filtering operations, primarily utilized by the Content Browser, as described in the UE documentation: Asset Registry.
A project may not need it at runtime, but the `AssetRegistry` module loads `AssetRegistry.bin` into memory as soon as it starts, which wastes memory if the data is never used.
Fortunately, UE provides ways to skip or only partially serialize AssetRegistry data. During the construction of `UAssetRegistryImpl`, the `InitializeSerializationOptionsFromIni` function reads the configuration from `DefaultEngine.ini` and fills an `FAssetRegistrySerializationOptions` structure with the settings. The subsequent `Serialize` functions use it to control which parts of the data are serialized into the `AssetRegistry`.
    void UAssetRegistryImpl::InitializeSerializationOptionsFromIni(FAssetRegistrySerializationOptions& Options, const FString& PlatformIniName) const
This mechanism controls whether `AssetRegistry.bin` is generated at packaging time and which AssetRegistry data is deserialized at runtime. It does not affect `DevelopmentAssetRegistry.bin`, which can be used for asset auditing.
The deserialization process is as follows:
- Check `bSerializeAssetRegistry`; if true, load `AssetRegistry.bin` into memory in binary form.
- Use the `Serialize` function to deserialize the binary data.
- Release the memory occupied by the loaded `AssetRegistry.bin`.
Therefore, the memory consumption of the AssetRegistry depends on the serialized data, and `FAssetRegistrySerializationOptions` controls which data gets serialized.
    /** Load/Save options used to modify how the cache is serialized. These are read out of the AssetRegistry section of Engine.ini and can be changed per platform. */
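The three deserialization steps can be modeled with a small standalone C++ sketch. It is not the engine implementation: `SerializationOptions`, `AssetRecord`, `RuntimeRegistry`, and `LoadRegistry` are hypothetical stand-ins. Only `bSerializeAssetRegistry` is taken from the text above, and `bSerializeDependencies` is assumed here as an example of a per-data-type switch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the engine's options structure.
struct SerializationOptions {
    bool bSerializeAssetRegistry = true;  // load the registry data at all?
    bool bSerializeDependencies = true;   // assumed switch: keep dependency data?
};

struct AssetRecord {
    std::string ObjectPath;
    std::vector<std::string> Dependencies;
};

struct RuntimeRegistry {
    std::vector<AssetRecord> Assets;
};

// Gate on the flag, deserialize selectively, then let the temporary buffer
// go out of scope once the in-memory registry has been built.
RuntimeRegistry LoadRegistry(const std::vector<AssetRecord>& onDisk,
                             const SerializationOptions& options) {
    RuntimeRegistry registry;
    if (!options.bSerializeAssetRegistry) return registry;  // nothing loaded

    std::vector<AssetRecord> buffer = onDisk;  // step 1: "read the file"
    for (AssetRecord& record : buffer) {       // step 2: deserialize
        if (!options.bSerializeDependencies) record.Dependencies.clear();
        registry.Assets.push_back(std::move(record));
    }
    assert(registry.Assets.size() == onDisk.size());
    return registry;                           // step 3: buffer is released
}

// Helper for inspecting how much dependency data stayed resident.
std::size_t CountDependencies(const RuntimeRegistry& registry) {
    std::size_t total = 0;
    for (const AssetRecord& record : registry.Assets) {
        total += record.Dependencies.size();
    }
    return total;
}
```

The point of the sketch is that the resident footprint is decided by what the options let through, not by the size of the file on disk.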
The configuration reading occurs in the following code:
    void UAssetRegistryImpl::InitializeSerializationOptionsFromIni(FAssetRegistrySerializationOptions& Options, const FString& PlatformIniName) const
In `Config/DefaultEngine.ini`, adding an `AssetRegistry` section that uses the option names above controls AssetRegistry serialization. This reduces both package size and runtime memory consumption (the AssetRegistry is loaded into memory when the engine starts).
    [AssetRegistry]
You can also specify settings for individual platforms by modifying the platform-specific ini files:
    Config/Windows/WindowsEngine.ini
Load Only the Shader Quality Levels Needed
By default, the engine loads shaders for all quality levels into memory. If the game does not need to switch quality levels at runtime, the unused levels can be discarded to reduce shader memory usage.
In `Project Settings - Engine - Rendering - Materials`, enable `Game Discards Unused Material Quality Levels`.
Alternatively, you can add the following configuration to `DefaultEngine.ini`:
    [/Script/Engine.RendererSettings]
The option decides whether, when running in game mode, shaders for all quality levels are kept in memory or only those needed for the current quality level:
- Unchecked: Keep all quality levels in memory, allowing a runtime quality level change (default).
- Checked: Discard unused quality levels when loading content for the game, saving some memory.
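What the checked state amounts to can be illustrated with a standalone C++ sketch. `QualityLevel`, `MaterialShaders`, and `DiscardUnused` are hypothetical names used only for this model and do not correspond to engine types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

enum class QualityLevel { Low, Medium, High };

// Hypothetical model of a material holding one shader blob per quality level.
struct MaterialShaders {
    std::map<QualityLevel, std::vector<unsigned char>> BlobsByQuality;

    // Total bytes currently held in memory across all resident levels.
    std::size_t ResidentBytes() const {
        std::size_t total = 0;
        for (const auto& entry : BlobsByQuality) total += entry.second.size();
        return total;
    }

    // Keeps only the blob for the active level. Switching levels afterwards
    // would require loading the content again.
    void DiscardUnused(QualityLevel active) {
        assert(BlobsByQuality.count(active) == 1);
        for (auto it = BlobsByQuality.begin(); it != BlobsByQuality.end();) {
            if (it->first != active) {
                it = BlobsByQuality.erase(it);
            } else {
                ++it;
            }
        }
    }
};
```

The saving is the combined size of the levels that are dropped, and the price is that a runtime quality switch is no longer possible without reloading.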
Reduce Shader Variants
You can reduce the number of parent materials and enable the following options in `Project Settings - Engine - Rendering`:
Share Material Shader Code
In `Project Settings - Packaging`, you can enable `Share Material Shader Code` and `Shared Material Native Libraries` to reduce package size and memory usage, at the cost of longer loading time.
    /**
After enabling this, the packaged output will generate the following files:
    ShaderArchive-Blank425-PCD3D_SM5.ushaderbytecode
However, if shaders change in a later cook while the base package still contains the old ShaderBytecode, materials may go missing at runtime.
There are three solutions:
- Package the ShaderBytecode files into the pak during subsequent packaging and load them at mount time;
- When cooking for a hot update, package the ShaderBytecode within the resource;
- Create a ShaderPatch that is loaded after the hot update.
For a more specific implementation process regarding hot updating Shaderbytecode, you can refer to my previous article UE4 Hot Update: Create Shader Patch.
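As a rough illustration of why sharing shader code reduces size, the following standalone C++ sketch stores each distinct bytecode blob once and hands out a key that materials can reference. `SharedShaderLibrary` and `HashBytes` are hypothetical names, and FNV-1a is only a stand-in hash for the sketch; this is not how the engine's shader library is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// FNV-1a, used here only as a stand-in content hash for the sketch.
std::uint64_t HashBytes(const std::string& bytes) {
    std::uint64_t h = 14695981039346656037ull;
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

// Hypothetical shared library: each distinct blob is stored once, and
// materials keep only the key returned by Add().
class SharedShaderLibrary {
public:
    std::uint64_t Add(const std::string& bytecode) {
        const std::uint64_t key = HashBytes(bytecode);
        auto inserted = blobs_.emplace(key, bytecode);
        // A real library needs a collision-resistant hash; the sketch only
        // verifies that an existing entry holds identical bytes.
        assert(inserted.first->second == bytecode);
        return key;
    }

    const std::string& Get(std::uint64_t key) const { return blobs_.at(key); }

    // Bytes actually stored, counting duplicates only once.
    std::size_t StoredBytes() const {
        std::size_t total = 0;
        for (const auto& entry : blobs_) total += entry.second.size();
        return total;
    }

private:
    std::unordered_map<std::uint64_t, std::string> blobs_;
};
```

The extra lookup through the library is also a plausible reason for the longer loading time mentioned above, and it shows why a stale library in the base package is a problem: a material whose key is not present has no bytecode to load.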
Disable UMG Template Creation
The engine can cache Widget Blueprint templates to speed up widget creation, but this can waste memory. It can be disabled through configuration:
You can also directly modify the engine’s code to change the default value at class initialization:
    UPROPERTY(EditAnywhere, AdvancedDisplay, Category=WidgetBlueprintOptions, AssetRegistrySearchable)
This variable is checked in the following code:
    bool FWidgetBlueprintCompilerContext::CanAllowTemplate(FCompilerResultsLog& MessageLog, UWidgetBlueprintGeneratedClass* InClass)
Disable PakCache
The engine enables PakCache by default. When reading files from a Pak, it reads additional data into memory for caching, which can take up a considerable amount of memory (visible through `stat memory`):
At game startup, PakCache logs will show:
    [2021.03.23-10.49.21:354][445]LogPakFile: Precache HighWater 16MB
You can disable it through configuration:
    [ConsoleVariables]
Disabling PakCache may cause more frequent I/O. The concrete performance impact still needs to be measured and will be analyzed when there is more time.
Unload Pak Entry Filenames
Starting from UE4.23, the engine provides memory optimization configurations for mounting PakFiles:
    [Pak]
When `FPakPlatformFile` executes `Initialize`, it binds to `FCoreDelegates::OnOptimizeMemoryUsageForMountedPaks`. Invoking this delegate notifies `PakPlatformFile` to optimize memory for mounted Paks.
    void FPakPlatformFile::OptimizeMemoryUsageForMountedPaks()
- UnloadPakEntryFilenamesIfPossible: Allows unloading memory occupied by PakEntry filenames.
- DirectoryRootsToKeepInMemoryWhenUnloadingPakEntryFilenames: Directories to retain in memory when unloading PakEntry filenames.
- bShrinkPakEntriesMemoryUsage: Reduces the memory occupied by PakEntry.
After the call, if `UnloadPakEntryFilenamesIfPossible` is enabled, the engine saves memory by replacing the list of filenames in the Pak with hashes. However, once the Pak entry filenames are unloaded, wildcard path matching is no longer available.
    /** Iterator class used to iterate over all files in pak. */
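Why hashing saves memory but breaks wildcard matching can be shown with a standalone C++ sketch. `PakIndex`, `HashPath`, and the choice of FNV-1a are assumptions made for illustration; the engine's actual data structures and hash function differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// FNV-1a over the path, a stand-in for whatever hash the engine uses.
std::uint64_t HashPath(const std::string& path) {
    std::uint64_t h = 14695981039346656037ull;
    for (unsigned char c : path) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

// Hypothetical index: it keeps full filenames until UnloadFilenames() is
// called, and afterwards only 64-bit hashes that map to entry indices.
class PakIndex {
public:
    void AddFile(const std::string& path, std::size_t entryIndex) {
        assert(!unloaded_);
        names_.push_back(path);
        byHash_[HashPath(path)] = entryIndex;  // collisions ignored in this sketch
    }

    // Frees the filename strings; exact lookups keep working via the hash.
    void UnloadFilenames() {
        std::vector<std::string>().swap(names_);
        unloaded_ = true;
    }

    bool Find(const std::string& path, std::size_t& outEntry) const {
        auto it = byHash_.find(HashPath(path));
        if (it == byHash_.end()) return false;
        outEntry = it->second;
        return true;
    }

    // Prefix (wildcard-style) search needs the names themselves, so it
    // reports failure once they have been unloaded.
    bool FindByPrefix(const std::string& prefix,
                      std::vector<std::string>& out) const {
        if (unloaded_) return false;
        for (const std::string& name : names_) {
            if (name.compare(0, prefix.size(), prefix) == 0) out.push_back(name);
        }
        return true;
    }

private:
    bool unloaded_ = false;
    std::vector<std::string> names_;
    std::unordered_map<std::uint64_t, std::size_t> byHash_;
};
```

An exact path can still be hashed and looked up, but a pattern cannot be matched against hashes, which is the limitation described above.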
Compress Textures
Texture compression is a lossy process that can reduce both package size and memory size when loaded. Although it is lossy, the reduction in quality on mobile platforms is often negligible, and settings can be adjusted based on project circumstances.
In previous notes, it was mentioned that the default asset quality and size level can be set in `Project Settings - Cooker - Texture - ASTC Compression vs Size`:
    0=12x12
You can also set it individually for a specific Texture in the texture asset editor:
`Lowest->Highest` corresponds to the values `0-4`; `Default` applies the project’s setting.
Additionally, the `Compression Settings` type also influences the compression applied: `Default` uses the project’s setting, while the `NormalMap` type results in `ASTC_4x4`.
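The effect of the block size on memory can be estimated with simple arithmetic. In the ASTC format every block occupies 128 bits (16 bytes) regardless of its footprint, so a larger footprint such as 12x12 spreads the same 16 bytes over more texels than 4x4. The helper below is a standalone sketch with a hypothetical name (`AstcBytes`) that computes the size of a single mip level only.

```cpp
#include <cassert>
#include <cstddef>

// Size in bytes of one ASTC-compressed mip level. Each block is 16 bytes
// regardless of footprint; partial blocks at the edges are rounded up.
std::size_t AstcBytes(std::size_t width, std::size_t height,
                      std::size_t blockW, std::size_t blockH) {
    assert(blockW > 0 && blockH > 0);
    const std::size_t blocksX = (width + blockW - 1) / blockW;
    const std::size_t blocksY = (height + blockH - 1) / blockH;
    return blocksX * blocksY * 16;
}
```

For a 1024x1024 texture this gives 1,048,576 bytes at 4x4 and 118,336 bytes at 12x12, compared with 4,194,304 bytes for uncompressed 8-bit RGBA. That is the range the quality setting trades against visual fidelity.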