Oskar Mendel

Posted on • Originally published at oskarmendel.me

UI Subsystem Iteration

The title has already spoiled the subject: we’re going to dive into the UI system implemented in my engine. In this post I will outline some of the improvements made to the UI system I’ve built for Maraton.

Graphical User Interface

A User Interface, or UI for short, is our medium of interaction with the machine. One of the most important goals for interface designers is that users shouldn’t have to think too much about the interface itself. This doesn’t mean we want mindless, lazy users; rather, we want the interface to be so intuitive that users never have to stop and reason about what things do or how to do them.

The first step toward this is to build your user interface out of components whose behavior users can already correctly assume. Such components include:

  • Buttons
  • Checkboxes
  • Radio buttons
  • Text fields
  • Sliders

Why does this matter at all? Because we’re building our UI system from scratch, we have the opportunity to define the behavior of every component ourselves. Unless we’re explicitly trying to redefine how a component should work, it wouldn’t make much sense for its behavior to differ from how it works everywhere else. As mentioned above, users would most likely make incorrect assumptions about what happens when they, for example, click a button, which results in lost time and, most likely, frustration.

For a button, the rules of operation may be pretty simple, but a text field is more complex: it involves a ton of new keybinds, and you would probably need to implement a cursor so the user can see where they are while writing, which itself has its own set of behaviors. All of the standard components actually have analog counterparts that we’re taught from childhood: the keys on your keyboard act just like the buttons inside an elevator, and the knobs on your stove or microwave that control the heat act like sliders.

This brings us to our first constraint on our user interface system: “Do not defy the standard behavior of user interface components but be flexible enough to allow it.”

Example of a control panel showcasing an analog interface with buttons, togglers and sliders (Source: Control Desk).

UI Modes

Designing a UI system has many exciting aspects. There are generally two styles of implementation, and the developer community is split into two camps between them.

Retained mode

The first style is known as “retained mode UI”, and I would say it is the most common one, as it is adopted by many popular UI frameworks. The word “retained” comes from the idea that this type of system retains the state of the UI across frames: the library keeps the data of the scene and the objects used to build it. In this style you build an API that hides the internally used objects and rendering primitives from the user, and interaction with the different widgets is often done in an indirect fashion where you tell the library what you want to update and rely on it to take care of it. Retained mode is often, but not always, built in an object-oriented style.

Abstract diagram of retained mode ui. (Source: https://learn.microsoft.com/en-us/windows/win32/learnwin32/retained-mode-versus-immediate-mode)

The reason the retained idea is so often coupled with the object-oriented approach is that this model tends to require users to extend and inherit pre-defined behavior to create new elements within the boundaries of the library, elements the library doesn’t support by default. There is, however, no hard limitation here: extensibility in a procedural approach can be achieved with, for example, function pointers. If you are interested in diving further into this, I recommend a tutorial that a developer who goes by nakst has put together, available here. He also has a good library to learn from, available on GitHub here.
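
To make that last point concrete, here is a minimal sketch, with hypothetical names, of how a procedural retained-mode library can offer extensibility through a function pointer instead of inheritance (in the spirit of the message-procedure style nakst teaches, not any exact API):

typedef struct element element;
typedef int (*element_message_proc)(element *Element, int Message, void *Data);

struct element
{
    element_message_proc MessageProc; // Overridable behavior without inheritance
    element *Parent;                  // Retained scene-graph link
    float X, Y, Width, Height;        // State the library keeps across frames
};

// The library dispatches events through the callback instead of a vtable.
static int ElementSendMessage(element *Element, int Message, void *Data)
{
    return Element->MessageProc ? Element->MessageProc(Element, Message, Data) : 0;
}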

Immediate mode

The second style, which has grown in popularity in recent years, is the “immediate mode UI” style. It has lately been frequently adopted by game developers because of its simplicity and how easy it is to learn and get started with, especially in an environment where rapid iteration is of interest. In an immediate-mode system, the code that defines, for example, a button is the same code that renders it and checks whether it was interacted with. This results in an extremely simple API for developers to consume, where creating a button in your interface can become as simple as an if statement, for example:

if (Button("Click Me"))
{
    DoSomething();
}

If you do not want the button to be rendered, you simply do not call the Button function. This means a lot of the code you write will be code deciding whether parts of the UI should be drawn or not, but the simplicity of it all remains.
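
For example, hiding an entire submenu is just ordinary control flow. A sketch with hypothetical state and helper functions:

// SettingsOpen, ToggleVsync, and ToggleFullscreen are hypothetical.
if (SettingsOpen)
{
    if (Button("Vsync"))      { ToggleVsync(); }
    if (Button("Fullscreen")) { ToggleFullscreen(); }
    if (Button("Back"))       { SettingsOpen = false; }
}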

Abstract diagram of immediate mode ui. (Source: https://learn.microsoft.com/en-us/windows/win32/learnwin32/retained-mode-versus-immediate-mode)

In this style of library, all the required data lives in the client code, not inside the library. This puts more weight on the user of the library but also allows more freedom. Because of the “immediate” nature of the API and the fact that the user holds the data, libraries of this type tend not to store internal state about what to draw, which means immediate-mode interfaces are typically redrawn every single frame. Some argue this is a drawback because it makes the application slow, but I wouldn’t worry much: the UI would have to be very complex before refilling your vertex buffers becomes the bottleneck, and you will most likely find that optimizing other areas of your application pays off more than optimizing interface rendering. One real drawback compared to a retained-mode approach is that the library itself cannot make assumptions about what is and is not visible on the screen, which can be a limitation for some projects.

Due to the simplicity of the immediate approach, the libraries that implement it also tend to introduce less bloat into your application; this is even part of the slogan of the most popular immediate-mode library, “Dear ImGui”. At the end of the day, though, you can build very powerful systems with both approaches, and I for one would be interested in sinking my teeth into retained mode in the future.

I have chosen to stick with the immediate-mode style for the aforementioned reasons, and with this we also get two more constraints for our UI system: “Designed for the developer’s ease of use” & “Do not get in the way of development”.

The previous UI system

The old UI system for my game engine Maraton was an immediate-mode-style implementation with a few interesting concepts attached to it. It was designed first and foremost to work rather than to be the most flexible system out there, while still being flexible enough to construct real applications. One of its most limiting factors was that every widget was hard-wired into the system itself.

The code defining widgets in this system was built upon the idea of discriminated unions (also known as “tagged unions”); it all looked like the following:

enum ui_widget_type
{
    UI_WidgetType_Button,
    UI_WidgetType_Slider,
    ...
};

struct ui_widget
{
    ui_widget_type Type; // Discriminator: selects which of the structs below is active

    struct
    {
        ...
    } button; // Button-specific data

    struct
    {
        ...
    } slider; // Slider-specific data
};

While I love this structure of code, it has serious issues when you try to build a flexible system: we are declaring a button and a slider to be two separate things while in practice they are very similar and share many of the same properties. This design gave me a lot of trouble in the past. Every time I wanted to support something new, like a button using a texture instead of text, then a toggler with the same feature, then perhaps a scrollable list, it ended with me doing quite a bit of work inside the engine’s internals, and the library code gave me very little power when it came to extending it or making small tweaks. This already breaks the first and third constraints we set up for our UI system.

Another lacking part of the system was that it had no internal support for autolayouting, which means the entire system relied on you keeping track of where everything is. There was a concept of rows and columns, but how big they are and where they are located on the screen was part of the API, something the user had to keep track of. An example UI from my game Brainroll looked something like this:

UIPushTextColor(UI, Vector4Init(1.0f, 1.0f, 1.0f, FadeInOutTransparency));
vector2 Size = Vector2Init(400, 80);
vector2 Position = Vector2Init(ScreenWidth / 8.0f, ScreenHeight / 2.0f - (Size.Y * 5) / 2.0f);
UIPushColumn(UI, Position, Size);
{
    UITitle(UI, "Brainroll");

    if (UIButton(UI, "Play"))
    {
        PlayUIClickSound();
        State->LevelSelectMenuOpen = true;
    }
    if (UIButton(UI, "Editor"))
    {
        PlayUIClickSound();
    }
    if (UIButton(UI, "Settings"))
    {
        PlayUIClickSound();
        State->SettingsMenuOpen = true;
    }
    if (UIButton(UI, "Quit"))
    {
        PlayUIClickSound();
        Quit = 1;
    }
}
UIPopColumn(UI);
UIPopTextColor(UI);

You can see on lines 2 and 3 that I am calculating where this UI should be placed on the screen and what size it should be. For this specific case it is not fatal, but it is still something I do not like having to do while iterating quickly on, in this case, a game. Thankfully, in this example the buttons have their sizes calculated from the text size, so they are laid out one after another vertically in a column.

Last but not least, the previous system had no notion of keyboard navigation at all; it was designed as if keyboard navigation wasn’t a thing. This caused me a lot of pain, because I know how frustrating it is for a player of a 2D puzzle game who plays only with the keyboard to constantly have to grab the mouse because, for some reason, the menus don’t react to keyboard input. During Brainroll playtests this was a feature almost everyone asked about.

Before moving on I still want to touch on some of the good parts of the system. In the code example above you can see the Push/Pop pattern being used for the text color and for the column. This is a concept I use in many places in Maraton; I was introduced to it under the name “push buffer”. The idea is that we have a set of internal stacks for different things such as colors, fonts, and positions, and whatever is pushed onto a stack modifies all widgets until it is popped off. Since it is a stack, if we push several things onto it the system will always peek at the stack and use the most recently pushed value when creating its elements. This idea works very well and was carried over to the new UI implementation.
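
Conceptually, each of these stacks can be as simple as a fixed-size array with a top index that widget creation peeks at. A minimal sketch with made-up names and capacities, not the engine’s exact code:

#define UI_STYLE_STACK_MAX 64

typedef struct ui_color { float R, G, B, A; } ui_color;

struct ui_state
{
    ui_color TextColorStack[UI_STYLE_STACK_MAX]; // One stack per styling property
    int      TextColorTop;                       // Slot 0 holds the default color
};

static void UIPushTextColor(struct ui_state *UI, ui_color Color)
{
    UI->TextColorStack[++UI->TextColorTop] = Color;
}

static void UIPopTextColor(struct ui_state *UI)
{
    --UI->TextColorTop;
}

// Widget creation always uses whatever is currently on top of the stack.
static ui_color UIPeekTextColor(struct ui_state *UI)
{
    return UI->TextColorStack[UI->TextColorTop];
}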

Another very positive thing about the old system was that it had what I call a “backdoor” for things I knew I would never implement as standard widgets. It let you define your own widgets through function pointers: you provided two callback functions, one for what happens during update and one for what happens during rendering of the widget. This was powerful enough for the cases where I needed to implement something new and wanted it to somewhat interface with the rest of the UI system.
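
Conceptually, the backdoor boiled down to registering a pair of function pointers; the names below are illustrative rather than the engine’s exact API:

struct ui_state;
struct ui_widget;
struct ui_input;
struct render_group;

// One callback for the update pass, one for the render pass.
typedef void (*ui_custom_update)(struct ui_widget *Widget, struct ui_input *Input);
typedef void (*ui_custom_render)(struct ui_widget *Widget, struct render_group *Group);

// The system calls Update and Render at the right points in the frame,
// so the custom widget still participates in the rest of the UI pipeline.
void UICustomWidget(struct ui_state *UI, ui_custom_update Update, ui_custom_render Render);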

Harder, Better, Faster, Stronger

I was sitting in a dark room, hacking away in the night, in the middle of a short sprint developing a new renderer for Maraton in preparation for supporting multiple renderer backends. While taking a small break to read Discord messages, something caught my eye. Ryan Fleury, a very talented developer who has inspired me multiple times and whose ideas have helped solve many issues in Maraton, had released a new article on his Substack explaining his new UI system, and while reading it I got more and more captivated. In the article Ryan describes a UI system that solves all of my issues, so I decided to scope-creep my current sprint and implement it in the engine before continuing work on my game Brainroll. I will not go into too much detail about the internals of the system and will instead refer all traffic to Ryan’s site, but I will explain the parts that address my previously mentioned constraints.

Flexible components & developer freedom

Instead of tagged unions, Ryan disconnects every widget from the idea of having a type and instead has each widget adopt a set of properties. For example, a button can have the following properties:

  • Background
  • Text
  • Clickable
  • Hover Animation
  • Click Animation

Instead of building this as a button widget, we build an abstract widget type where each of these properties can be toggled on or off. I had never thought of this myself, but once laid out in front of me it is clearly a very flexible and strong approach to managing the building blocks of an interface: once a property is added, it can be used by any component. Representing this in code is very simple: instead of the enum type inside your widget, you use an integer type and define your enum as a set of bit flags, allowing you to toggle any flag on or off. For example:

typedef u32 ui_widgetflags;
enum
{
    UI_WidgetFlag_Background     = (1<<0),
    UI_WidgetFlag_Text           = (1<<1),
    UI_WidgetFlag_Clickable      = (1<<2),
    UI_WidgetFlag_HoverAnimation = (1<<3),
    UI_WidgetFlag_ClickAnimation = (1<<4),
};

struct ui_widget
{
    ui_widgetflags Flags;
};

The different flags don’t have to be considered in every step of the widget’s lifetime; each can be handled only where it matters. For the button example above, the flags UI_WidgetFlag_Background and UI_WidgetFlag_Text are only touched by the rendering code, while UI_WidgetFlag_HoverAnimation may be touched both by the code performing the animation and the rendering code. So every processing step the widget goes through can look at which flags are set and decide whether to perform its part of the work.
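
In practice each pass just tests the bits it cares about. A sketch, assuming the flags above plus hypothetical Rect, Text, and Clicked fields and draw helpers:

static void UIRenderWidget(ui_widget *Widget)
{
    if (Widget->Flags & UI_WidgetFlag_Background)
    {
        DrawRectangle(Widget->Rect, Widget->BackgroundColor); // Hypothetical helper
    }
    if (Widget->Flags & UI_WidgetFlag_Text)
    {
        DrawText(Widget->Rect, Widget->Text); // Hypothetical helper
    }
}

static void UIUpdateWidget(ui_widget *Widget, ui_input *Input)
{
    if (Widget->Flags & UI_WidgetFlag_Clickable)
    {
        // Hypothetical hit test: record the click so Button()-style calls can return it.
        Widget->Clicked = Input->MousePressed && PointInRect(Input->MousePosition, Widget->Rect);
    }
}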

This type of system also streamlines the code that manages user input, because every widget can be treated the same: whether it is a button or a slider, we still record whether it was clicked and where. This becomes very dynamic, because the core UI system will technically support almost any widget you want, and you can accompany the core with a library of basic widgets.

Looking back at our first constraint, it is no longer an issue: nothing in this system prevents you as a developer from defining any component with any behavior. You can also fairly easily combine several widgets, almost like Lego bricks, to build new widgets. As mentioned earlier, for a more in-depth look at how this is done I will refer to Ryan’s articles.

Ease of use

I mentioned earlier that I missed having a system that aids me with layouting, because it would save me tons of time when iterating on a codebase. The system Ryan came up with takes care of this for you. Remember when I mentioned that most immediate-mode systems rebuild the entire UI every frame? This system, while building the hierarchy of widgets, also caches their state for as long as they are in use. The reason is that when you are, for example, pushing a button into the current scene, the immediate nature of the system means you are only working with a partial scene; some assumptions made at this stage may change, because we do not yet have access to the complete hierarchy.

If we, however, separate the building of the hierarchy from the rendering step, we are free to inject an autolayouting step in between, which lets us work with the entire hierarchy and make advanced layouting decisions as a result.
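
In terms of a frame, the immediate-mode build calls then become just the first of three phases. A sketch of how a frame could be structured (function names are illustrative, not the engine’s exact API):

UIBeginFrame(UI);
{
    if (UIButton(UI, "Play"))     { StartGame(); }    // Build: record widgets into the tree
    if (UIButton(UI, "Settings")) { OpenSettings(); } // StartGame/OpenSettings are hypothetical
}
UIEndFrame(UI); // The hierarchy for this frame is now complete
UILayout(UI);   // Autolayout: sees the whole tree and resolves final rectangles
UIRender(UI);   // Render: draws using the rectangles computed above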

The way this is achieved is that the underlying data structure is an n-ary tree: each widget has a parent, siblings, and children, which together form the hierarchy internally. If you google n-ary trees, most examples do not keep references to siblings; an example I drew on paper looks like this:

(Hand-drawn diagram: an n-ary widget tree with parent, child, and sibling links.)

In code this can be written like:

struct ui_widget
{
    // tree links
    ui_widget *First;  // First Child
    ui_widget *Last;   // Last Child
    ui_widget *Next;   // Next Sibling
    ui_widget *Prev;   // Previous Sibling
    ui_widget *Parent; // Parent

    ...
};

So while our widgets are cached in-between frames, as they would be in a “retained-mode” design, we still utilize the features of an immediate-mode API. And since we have access to the entire hierarchy at the time we perform layouting, we can perform quite advanced layouting, just as sophisticated as any algorithm applied to a retained-mode hierarchy.

The layouting itself is boiled down into something that we refer to as “semantic sizes”: a way for us to tell the autolayouting system how each widget’s size should be determined. In my implementation I have the following semantic sizes:

  • Pixels - A “hard” size specified in pixels.
  • TextContent - The size of the widget’s text in the font used.
  • PercentOfParent - The widget is a specified percentage of its parent’s size.
  • ChildrenSum - The widget’s size is the sum of its children’s sizes.
  • BiggestChild - The size is set to that of the widget’s biggest child.

Each semantic size has one of the types above as well as a value and a strictness. The value is not always used, but for Pixels it is the pixel amount and for PercentOfParent the percentage. The strictness is the interesting part: it is a way for us to predetermine how much of the widget’s size, in percent, it is willing to give up in case it doesn’t fit within the layout.

Each widget adopts two semantic sizes, one for each axis. This is extremely powerful because it allows layouts to differ on X and Y; for example, we can have a container that is hard set to 250 pixels in height but is always 50% of its parent in width.
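
In code, a semantic size can be as small as a kind, a value, and a strictness, with the widget carrying one per axis. A sketch along those lines (field names are mine, not necessarily the engine’s):

typedef enum ui_size_kind
{
    UI_SizeKind_Pixels,
    UI_SizeKind_TextContent,
    UI_SizeKind_PercentOfParent,
    UI_SizeKind_ChildrenSum,
    UI_SizeKind_BiggestChild,
} ui_size_kind;

typedef struct ui_size
{
    ui_size_kind Kind;
    float Value;      // Pixel amount for Pixels, percentage for PercentOfParent, ...
    float Strictness; // Fraction of the size (0..1) the widget is willing to give up
} ui_size;

// Each widget carries one semantic size per axis, e.g. ui_size SemanticSize[2];
// Example: a hard 250-pixel height that gives up nothing, and a flexible
// 50%-of-parent width that can shrink if the layout demands it.
ui_size Width  = { UI_SizeKind_PercentOfParent, 0.5f, 1.0f };
ui_size Height = { UI_SizeKind_Pixels, 250.0f, 0.0f };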

The autolayouting algorithm itself traverses our tree of widgets once per axis, in 5 different steps (a sketch of one of these passes follows after the list):

  1. Calculate standalone sizes. These are the widgets with semantic sizes that do not depend on any other widgets in the hierarchy (Pixels & TextContent). This can be done in either post- or pre-order traversal of the tree.
  2. Calculate the “upwards-dependent” sizes. These are the widgets with semantic sizes that depend on their parent (PercentOfParent). This should be done in pre-order traversal.
  3. Calculate the “downwards-dependent” sizes. These are the widgets whose semantic sizes depend on their children (ChildrenSum & BiggestChild). Since a parent’s size can only be known once its children’s sizes are, this should be done in post-order traversal.
  4. Solve size violations. At this step we traverse the entire hierarchy and verify that no widget extends past its parent’s boundaries. Exceptions are made for parents that allow overflow (useful when you want to implement scrolling). In case of a violation, part of the child’s size is cut off depending on the strictness defined within its semantic size. This should be done as a pre-order traversal step.
  5. Compute the final screen coordinates and the relative positions between every widget. This step produces the rectangle that will be used for rendering and user input. This should be done in pre-order traversal.

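As a flavor of what these passes look like, here is a sketch of the downwards-dependent pass for one axis. It assumes the tree links shown earlier plus the hypothetical SemanticSize and ComputedSize per-axis fields from the previous sketch; it is not the engine’s exact code:

static void UILayoutDownwardsDependent(ui_widget *Widget, int Axis)
{
    // Post-order: solve all children before the parent that depends on them.
    for (ui_widget *Child = Widget->First; Child; Child = Child->Next)
    {
        UILayoutDownwardsDependent(Child, Axis);
    }

    if (Widget->SemanticSize[Axis].Kind == UI_SizeKind_ChildrenSum)
    {
        float Sum = 0.0f;
        for (ui_widget *Child = Widget->First; Child; Child = Child->Next)
        {
            Sum += Child->ComputedSize[Axis];
        }
        Widget->ComputedSize[Axis] = Sum;
    }
    else if (Widget->SemanticSize[Axis].Kind == UI_SizeKind_BiggestChild)
    {
        float Biggest = 0.0f;
        for (ui_widget *Child = Widget->First; Child; Child = Child->Next)
        {
            if (Child->ComputedSize[Axis] > Biggest) { Biggest = Child->ComputedSize[Axis]; }
        }
        Widget->ComputedSize[Axis] = Biggest;
    }
}
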
This covers most cases while also being very simple to extend. As a user of this API it saves me a lot of time, as I do not have to worry about the placement of individual elements: the autolayout, together with the specified semantic sizes, takes care of the thinking for me.

Another very important thing is that this type of implementation makes keyboard navigation quite trivial. I solved it by building a quadtree of directional links during the layouting step: based on each widget’s position, I look at the spatial distance between widgets on the screen in all 4 directions and select the closest widget that allows user interaction, building up a tree of links into the real hierarchy that defines which widget directional key navigation takes you to by default. This also allows the user to set their own shortcuts to different parts of the UI.
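
A sketch of the core of that idea for a single direction, assuming hypothetical Rect and Interactable fields (FLT_MAX comes from <float.h>); the real link-building is done per widget during layouting:

// Pick the closest interactable widget in a given direction.
// Direction is a unit step, e.g. right = (1, 0), up = (0, -1).
static ui_widget *UIFindNavTarget(ui_widget *From, ui_widget *Widgets, int Count,
                                  float DirX, float DirY)
{
    ui_widget *Best = 0;
    float BestDistance = FLT_MAX;
    for (int Index = 0; Index < Count; ++Index)
    {
        ui_widget *Candidate = &Widgets[Index];
        if (Candidate == From || !Candidate->Interactable) { continue; }

        float DeltaX = Candidate->Rect.X - From->Rect.X;
        float DeltaY = Candidate->Rect.Y - From->Rect.Y;

        // Skip widgets that do not lie in the requested direction.
        if (DeltaX * DirX + DeltaY * DirY <= 0.0f) { continue; }

        float Distance = DeltaX * DeltaX + DeltaY * DeltaY;
        if (Distance < BestDistance)
        {
            BestDistance = Distance;
            Best = Candidate;
        }
    }
    return Best; // Stored as one of four directional links on From
}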

This solves the remaining constraints we had on this UI system. The result is a system that:

  • Lets the developer work with more granular building blocks, defining a wide variety of UI components by thinking in terms of properties instead of hard-specified components.
  • Is designed with ease of use in mind. The API is small and consistent, and everything follows the same push/pop pattern. Since the system still uses stacks internally, it is trivial for new users to grasp the basics and to modify and style parts of the UI with limited knowledge.
  • Doesn’t get in my way as much as the old system did: I can focus on the content and postpone the exact layout, since the algorithm takes care of it for me. I do not have to worry about new widget properties, as what I have now is certainly enough to complete my next game, and as a bonus, keyboard navigation is solved and requires zero developer maintenance.

Result

In the end I have a more robust and production-ready UI system inside Maraton than before. I wouldn’t claim that this is the silver bullet that solves all my UI problems for the rest of my career, but it certainly is an iteration in the right direction. Here is a sample screenshot of one of the UI examples shipped with the engine.

(Screenshot: a UI example shipped with the engine.)

Here is a visual representation of the autolayouting algorithm at work:

(Animation: the autolayouting algorithm at work.)

While this took far too long to get in place, I am very happy that I went on this journey, because I got to learn an enormous amount of new things. This is the work I wanted to finish before letting others take part in the Maraton project; it is now done, and it is time for me to finish my game Brainroll.

If you are interested in supporting my projects, in me becoming an indie game developer, or just want access to my games as well as the Maraton game engine, it is all available to my patrons here.

You can also follow my work in these other places:

💻GITHUB 🐦TWITTER 🙏PATREON 🗨DISCORD 📹YOUTUBE
