UE hot update: Fault analysis of a resource anomaly


Recently, I ran into an extremely bizarre bug involving two maps. Map A can be entered when its PAK is placed in the engine’s automatic mount directory, but not when the PAK is mounted from the hot update directory. Map B behaves in exactly the opposite way: it fails from the automatic mount directory but works normally from the hot update directory.

At first glance the issue looks inexplicable: two mutually exclusive behaviors occur under the same logic. Moreover, the hot update mount and the automatic mount differ only in timing and priority, so in theory this problem should not exist.

While the issue can ultimately be resolved on the business logic side, this behavior involves another very obscure path within the engine. Understanding why and how it works is crucial. Therefore, I analyzed the engine’s code based on this behavior, came to a reasonable conclusion, and devised a method to detect and mitigate this issue.

This article assumes that readers have some basic knowledge of UE hot updates; if in doubt, please refer to other articles in this blog’s hot update series for more information.

Note: This article is based on the debugging and analysis of the UE4.25 engine version. The code in other engine versions may differ slightly, but I focus on the analysis process and the causes of the failure, which can also serve as a reference for other versions of the engine.

Mount Timing Analysis

In general, regarding the engine’s PAK mounting, we only need to focus on “priority” and whether resource loading occurs after the highest priority PAK is mounted, to ensure that the latest resources can be accessed.

However, the engine’s automatic mount differs logically from manual in-game mounts, depending on the current state of the engine at the time of mounting.

Specifically, this is related to whether FCoreDelegates::NewFileAddedDelegate.Broadcast(Filename) is called after FPakPlatformFile::Mount is successfully executed:

Here, we are simply checking whether FCoreDelegates::NewFileAddedDelegate is bound. The condition affecting its execution is whether the PAK mounting occurred after this binding.

The binding code for this delegate occurs in Obj.cpp, where it binds FLinkerLoad::OnNewFileAdded:

So when the engine starts, it mounts the PAKs in the automatic mount directory very early, during PreInitPreStartupScreen:

This makes sense because all configurations, properties, and assets of the engine are stored in PAK files, and they must be mounted first so they can be found, allowing the engine to execute smoothly.

InitUObject in Obj.cpp is called during AppInit, which PreInitPreStartupScreen invokes:

This happens later than the engine’s automatic mounting, so during the automatic mount process NewFileAddedDelegate is not yet bound and will not be broadcast.

In contrast, when a PAK is mounted after a hot update, the game is already running and the delegate is bound, so calling FPakPlatformFile::Mount broadcasts FCoreDelegates::NewFileAddedDelegate.

The only difference is the timing of the mount. This is also a key point for our upcoming bug analysis; just remember that there will be a difference in the execution of FCoreDelegates::NewFileAddedDelegate between automatic mounting and manual in-game mounting.
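The ordering can be illustrated with a small standalone model. This is only a sketch under stated assumptions: ToyNewFileDelegate, ToyMount, and ToyStartupSequence are hypothetical stand-ins written for this article, not engine types, and they only mimic the "broadcast if bound" behavior described above.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a multicast delegate such as
// FCoreDelegates::NewFileAddedDelegate.
struct ToyNewFileDelegate {
    std::vector<std::function<void(const std::string&)>> Handlers;

    bool IsBound() const { return !Handlers.empty(); }

    void Broadcast(const std::string& Filename) const {
        for (const auto& Handler : Handlers) {
            Handler(Filename);
        }
    }
};

// Hypothetical stand-in for the tail of FPakPlatformFile::Mount: files are
// only broadcast when a handler is already bound at mount time.
inline void ToyMount(const ToyNewFileDelegate& Delegate,
                     const std::vector<std::string>& PakFiles) {
    if (!Delegate.IsBound()) {
        return; // automatic mount at startup: nobody is listening yet
    }
    for (const std::string& Filename : PakFiles) {
        Delegate.Broadcast(Filename);
    }
}

// Replays the startup order and returns the filenames the handler saw.
inline std::vector<std::string> ToyStartupSequence(
    const std::vector<std::string>& AutoPak,
    const std::vector<std::string>& HotUpdatePak) {
    ToyNewFileDelegate Delegate;
    std::vector<std::string> Seen;

    ToyMount(Delegate, AutoPak); // 1. automatic mount, delegate unbound

    // 2. UObject initialization binds the handler.
    Delegate.Handlers.push_back(
        [&Seen](const std::string& Filename) { Seen.push_back(Filename); });

    ToyMount(Delegate, HotUpdatePak); // 3. hot update mount, delegate bound
    return Seen;
}
```

Calling ToyStartupSequence with one file list for the automatic PAK and another for the hot update PAK returns only the hot update files, which is the asymmetry the rest of the analysis depends on.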

Failure Analysis

Returning to the bug introduced at the beginning of the article, I was initially puzzled as to why automatic and manual mounts behave differently. So I closely analyzed FLinkerLoad::OnNewFileAdded, the handler bound to FCoreDelegates::NewFileAddedDelegate, and found nothing particularly special in it:

In FPakPlatformFile::Mount, after a successful mount, if NewFileAddedDelegate is bound, it is broadcast once for every file in the current PAK:

if (FCoreDelegates::NewFileAddedDelegate.IsBound())
{
    TArray<FString> Filenames;
    Pak->GetFilenames(Filenames);
    for (const FString& Filename : Filenames)
    {
        FCoreDelegates::NewFileAddedDelegate.Broadcast(Filename);
    }
}

Therefore, when manually mounting the PAK, this logic will run, and all files within the manually mounted PAK will follow this process.

The logic defined in LinkerLoad.cpp for FLinkerLoad::OnNewFileAdded is as follows:

void FLinkerLoad::OnNewFileAdded(const FString& Filename)
{
    FString PackageName;
    if (FPackageName::TryConvertFilenameToLongPackageName(Filename, PackageName))
    {
        FName PackageFName(*PackageName);
        if (FLinkerLoad::IsKnownMissingPackage(PackageFName))
        {
            FLinkerLoad::RemoveKnownMissingPackage(PackageFName);
        }
    }
}

This function converts the incoming Filename from a UFS path to a PackageName (for example, ../../../PROJECTNAME/Content/XXXX becomes /Game/XXXX), constructs an FName from it, and checks whether that name is marked as a MissingPackage; if so, the mark is removed.

The entire function’s logic is designed to remove items marked as MissingPackage. So when does a resource get added to MissingPackage? When loading that resource fails!

The reason is straightforward: if loading a resource fails, it gets recorded. If there are subsequent attempts to load the resource, you can check if it was previously marked.

The logic within OnNewFileAdded operates inversely. If a new PAK is mounted, it implies new resources could exist. It converts the file list from the PAK into resource paths and checks if they’re marked as missing; if so, it removes them, affirming the resource is now accessible since it’s been mounted in the new PAK.
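The bookkeeping described here can be sketched as a small standalone class. ToyMissingPackages is a hypothetical model, not the engine’s implementation; in particular, lower-casing the keys is my assumption for imitating the case-insensitive comparison of FName.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical model of the known-missing-package bookkeeping.
class ToyMissingPackages {
public:
    // Called when loading a package fails.
    void AddKnownMissing(const std::string& PackageName) {
        Missing.insert(ToLower(PackageName));
    }

    // Lets later load attempts check whether the package already failed.
    bool IsKnownMissing(const std::string& PackageName) const {
        return Missing.count(ToLower(PackageName)) > 0;
    }

    // Called for each file of a newly mounted PAK: the package may exist
    // now, so forget the mark. Returns true if a mark was removed.
    bool OnNewFileAdded(const std::string& PackageName) {
        return Missing.erase(ToLower(PackageName)) > 0;
    }

private:
    // Assumption of this sketch: keys are lower-cased to ignore case.
    static std::string ToLower(std::string Text) {
        std::transform(Text.begin(), Text.end(), Text.begin(),
                       [](unsigned char C) {
                           return static_cast<char>(std::tolower(C));
                       });
        return Text;
    }

    std::set<std::string> Missing;
};
```

A failed load calls AddKnownMissing; mounting a PAK that contains the package calls OnNewFileAdded, after which IsKnownMissing returns false again and the next load attempt is no longer short-circuited.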

Debugging shows that execution never enters RemoveKnownMissingPackage: IsKnownMissingPackage never returns true for these files, so the missing-package handling itself is not what changes the behavior.

The function’s logic is thus simple, yet it does not clarify why automatic and manual mounts lead to maps not being accessible. However, if I disable this logic, the originally inaccessible map A works, while map B does not. Clearly, this logic is causing the problem; it just remains to determine the correlation.

Ultimately, by comparing the assets during actual loading with the PackageName within OnNewFileAdded, I discovered the real issue lies in constructing the FName!

void FLinkerLoad::OnNewFileAdded(const FString& Filename)
{
    FString PackageName;
    if (FPackageName::TryConvertFilenameToLongPackageName(Filename, PackageName))
    {
        FName PackageFName(*PackageName);
        // ...
    }
}

This is the problematic code, and the specific reason will be discussed later.

For now, you only need to know that FName is the engine’s mechanism for interned strings: identical strings are recorded only once in a name pool and accessed by index. Furthermore, its comparison is case-insensitive by default, so strings that differ only in case are treated as the same name, which is the crux of the issue.
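To make that property concrete, here is a toy name pool. It is a sketch, not FName itself: ToyNamePool is hypothetical, and it models my understanding of builds where names do not preserve per-use casing (WITH_CASE_PRESERVING_NAME disabled, which is typical outside the editor), where the spelling registered first is the one later lookups return.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <utility>

// Hypothetical name pool: lookups ignore case, and the stored spelling is
// whatever was passed the first time a name was registered.
class ToyNamePool {
public:
    // Registers Text if no case-insensitive match exists yet, then returns
    // the spelling held by the pool.
    const std::string& FindOrAdd(const std::string& Text) {
        // std::map::insert leaves an existing entry untouched.
        return Entries.insert(std::make_pair(ToLower(Text), Text))
            .first->second;
    }

private:
    static std::string ToLower(std::string Text) {
        std::transform(Text.begin(), Text.end(), Text.begin(),
                       [](unsigned char C) {
                           return static_cast<char>(std::tolower(C));
                       });
        return Text;
    }

    std::map<std::string, std::string> Entries;
};
```

If "/Game/xxxx/Level/yyyyy" is registered first, a later FindOrAdd("/Game/xxxx/level/yyyyy") returns the capitalized spelling. In this model, whoever touches a name first decides how it reads for everyone else.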

Anomalous Asset Operations

Why does the construction of FName become the culprit? Let’s set aside the debugger and the packaged game for now.

Back in the editor, consider the following situation:

  1. Create a new map located in the PrimaryTest path:

  2. Directly modify the folder name in the operating system to change its case to PrimaryTEST:

You can see that there’s a change in the project’s directory, but the PrimaryAssetName within the assets remains unchanged.

  3. When packaging a PAK, the UFS path of the uasset aligns with the disk’s relative path, maintaining the case consistency:

However, this results in a critical issue: the cooked uasset path and the actual asset path exhibit case mismatches.

When reading from the asset serialization, it remains in lowercase:


This constructs an FName like /Game/xxxx/level/yyyyy, yet its uasset in the UFS path is Content/xxxx/Level/yyyyy.

The truth is near!

Returning to FLinkerLoad::OnNewFileAdded, let’s take another look at its code:

void FLinkerLoad::OnNewFileAdded(const FString& Filename)
{
    FString PackageName;
    if (FPackageName::TryConvertFilenameToLongPackageName(Filename, PackageName))
    {
        FName PackageFName(*PackageName);
        // ...
    }
}

This conversion translates the UFS path to a PackageName! The conversion logic merely operates on the string; the path relative to Content gets preserved as is!

For example, if a resource is located at /Game/Test/PrimaryTEST/NewWorld, its cooked path in the UFS is:

../../../PROJECT_NAME/Content/Test/PrimaryTEST/NewWorld.umap

When converted through FPackageName::TryConvertFilenameToLongPackageName, the resulting PackageName will be:

/Game/Test/PrimaryTEST/NewWorld

And then it constructs it as an FName!

This will lead to a case difference between the PackageName derived from UFS and the actual one stored within the asset.
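The string-only nature of the conversion can be sketched as follows. ToyFilenameToPackageName is a hypothetical, heavily simplified stand-in for FPackageName::TryConvertFilenameToLongPackageName: it assumes the ../../../PROJECT_NAME/Content/... form used in the example above and handles only the project’s Content root.

```cpp
#include <cassert>
#include <string>

// Hypothetical simplification: maps ".../Content/<path>.<ext>" to
// "/Game/<path>" with plain string operations, so the case of <path> is
// carried over exactly as it appears in the UFS path.
inline bool ToyFilenameToPackageName(const std::string& Filename,
                                     std::string& OutPackageName) {
    const std::string ContentMarker = "/Content/";
    const std::string::size_type ContentPos = Filename.find(ContentMarker);
    if (ContentPos == std::string::npos) {
        return false; // not under the project's Content directory
    }

    std::string Relative =
        Filename.substr(ContentPos + ContentMarker.size());

    // Strip the extension only if the dot belongs to the file name itself.
    const std::string::size_type Slash = Relative.rfind('/');
    const std::string::size_type Dot = Relative.rfind('.');
    if (Dot != std::string::npos &&
        (Slash == std::string::npos || Dot > Slash)) {
        Relative.erase(Dot);
    }

    OutPackageName = "/Game/" + Relative;
    return true;
}
```

Nothing in this path consults the name serialized inside the asset, so a folder renamed to PrimaryTEST on disk yields a package name containing PrimaryTEST, whatever the asset itself recorded.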

Debug Verification

After analyzing the issue, it is crucial to perform real debugging to validate the problem, as practice is the only test of truth!

When FLinkerLoad::OnNewFileAdded executes, the conversion from UFS path to PackageName relies on the UFS file path, which preserves the on-disk case:

Here, an FName with a value of /Game/xxxx/Level/yyyyy is constructed, and this string is recorded in the NamePool.

When this map is loaded afterwards, the FName that is fetched carries the same casing:

It is evident that the Level directory component is now capitalized.

If any logic in the game matches asset paths in a case-sensitive way, the match fails, exactly as this incident has shown.
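On the game-logic side, the failure mode and one possible guard can be shown with plain strings. This is a sketch: EqualsIgnoreCase is a hypothetical helper, and how the affected project actually compared its paths is not something this article shows.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: compares two asset paths while ignoring ASCII case.
inline bool EqualsIgnoreCase(const std::string& A, const std::string& B) {
    if (A.size() != B.size()) {
        return false;
    }
    for (std::size_t Index = 0; Index < A.size(); ++Index) {
        const unsigned char Left = static_cast<unsigned char>(A[Index]);
        const unsigned char Right = static_cast<unsigned char>(B[Index]);
        if (std::tolower(Left) != std::tolower(Right)) {
            return false;
        }
    }
    return true;
}
```

With the two spellings from above, the exact comparison "/Game/xxxx/level/yyyyy" == "/Game/xxxx/Level/yyyyy" is false while EqualsIgnoreCase returns true. Making the check case-insensitive hides the symptom, but the mismatch between disk and asset remains and is still worth fixing at the source.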

Failure Summary

Let’s revisit the initial question: why do resources placed in the automatic mount directory yield a logical discrepancy compared to those manually mounted after hot updates?

The answer: PAKs in the automatic mount directory are mounted before FCoreDelegates::NewFileAddedDelegate is bound, so no FNames are constructed for their files at mount time, whereas a manual mount after a hot update triggers FLinkerLoad::OnNewFileAdded, which creates an FName for every file in the PAK. The underlying cause is the case difference between the asset path in UFS and the path recorded within the asset, which makes certain case-sensitive checks in the game logic fail.

How to Mitigate?

Although the problem has been traced, its complexity arises from several different aspects:

  1. Non-standard operations by artists on resources.
  2. Incomplete consideration in game logic checks.
  3. The design of the file system indeed contributes to this issue, and it’s quite misleading.

Ultimately, the core of the problem is that resources were modified directly on the file system, bypassing the Editor. This left the asset’s storage path inconsistent with the value recorded inside it, and because the modification was only a change of case, the fault was extremely covert.

Describing it as “a disaster caused by non-standard resource operations leading to case anomalies” would not be an exaggeration. To prevent a recurrence of such situations, I added a new resource check rule to automatically detect inconsistencies between disk paths and records within the PKG:

This enables timely bulk checks of assets in the project that share the same problem, facilitating remediation. As I’ve repeatedly stated in past articles, resource standardization is vital, greatly reducing the cost of troubleshooting and nipping problems in the bud.
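The core comparison behind such a rule can be sketched in a few lines. This is not the actual check I added to the project: AssetRecord, ComparePackagePaths, and FindCaseOnlyMismatches are hypothetical names, and the sketch assumes both the disk-derived package name and the name recorded inside the package have already been extracted as strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

enum class PathCheckResult { Match, CaseOnlyMismatch, Different };

inline std::string ToLowerCopy(std::string Text) {
    std::transform(Text.begin(), Text.end(), Text.begin(),
                   [](unsigned char C) {
                       return static_cast<char>(std::tolower(C));
                   });
    return Text;
}

// Classifies how the disk-derived name relates to the recorded name.
inline PathCheckResult ComparePackagePaths(const std::string& FromDisk,
                                           const std::string& Recorded) {
    if (FromDisk == Recorded) {
        return PathCheckResult::Match;
    }
    if (ToLowerCopy(FromDisk) == ToLowerCopy(Recorded)) {
        return PathCheckResult::CaseOnlyMismatch; // the covert case
    }
    return PathCheckResult::Different;
}

// Hypothetical input record: one entry per asset in the project.
struct AssetRecord {
    std::string FromDisk; // package name derived from the file path
    std::string Recorded; // package name stored inside the asset
};

// Returns every asset whose two names differ only by case.
inline std::vector<AssetRecord> FindCaseOnlyMismatches(
    const std::vector<AssetRecord>& Assets) {
    std::vector<AssetRecord> Result;
    for (const AssetRecord& Asset : Assets) {
        if (ComparePackagePaths(Asset.FromDisk, Asset.Recorded) ==
            PathCheckResult::CaseOnlyMismatch) {
            Result.push_back(Asset);
        }
    }
    return Result;
}
```

Separating CaseOnlyMismatch from Different keeps the report focused: a fully different path usually points to a moved or stale asset, while a case-only difference is the signature of a rename done outside the Editor.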

Beyond that, the most crucial aspect is process discipline. Any modification to assets must be done through the Editor, including but not limited to adding, deleting, renaming, and moving directories, and redirectors must be fixed up afterward. Direct modifications via the file system are strictly prohibited.

Conclusion

This article analyzed the logical differences arising from varying mounting timings in Unreal Engine, identified the abnormal operations leading to mismatches between resources and their serialized paths, and established clear conclusions regarding the failure along with mitigation strategies.

Technical issues are not mystical; if one feels mystical to you, it merely means you have not yet found the connection behind it.

The article is finished. If you have any questions, please comment and communicate.


Title:UE hot update: Fault analysis of a resource anomaly
Author:LIPENGZHA
Publish Date:2024/10/17 17:30
Word Count:9.1k Words
Link:https://en.imzlp.com/posts/22890/
License: CC BY-NC-SA 4.0
Reprinting of the full article is prohibited.