Arda Kizilcay

Welcome to my continued journey on the Commercial Vehicle Simulator Inspections application, a project that showcases my UI design, 3D prototyping, technical design, and implementation skills.

In Part 1, I covered the design process up to the delivery of a design prototype in Figma. In this post, I will be walking through the phases of Unity 3D prototyping, technical design, and implementation of the minimum viable product (MVP).

Prototyping

As part of the design and prototyping phase, I collaborated with another designer to create a functional Unity prototype. We leveraged wireframes from the earlier design phase (as discussed in Part 1) and expanded upon them, focusing on low-level functionality.

For the prototype process, I handled UI and related functionalities, while the other designer prototyped a camera that could be moved along the truck.

Camera and UI

The video above is from when we had all the essential functionalities in the prototype including the UI sidebar, floating icons, interactive (by tapping) truck parts, UI overlays, and the camera with controls. At this stage, due to the goal of rapid prototyping, we opted to have the camera on a dolly path (via Cinemachine) and have a slider that lets the user move the camera along the path.

Contextual interactions

"Contextual interactions" was the working name for what was imagined as non-standard interactions with specific truck components. We wanted to create some examples of these interactions as Unity prototypes as proof of concept.

The one I focused on was how a driver might check for loose fasteners by running their hand along the lug nuts.

I've created this sketch to convey my concept to my fellow designer.

The main challenge came with translating touch-screen inputs into a virtual hand running along the lug nuts. I suggested we put the hand on a circle rail along the lug nuts and then translate the user's touch input location to the closest point on the rail circular rail.

This is the isolated prototype that came out of the concept. The green circle corresponds to the user's finger upon touching the screen.

Here is the overlay that lets you engage in the contextual action

Once the part states are marked in the sidebar, they could then be visualized on the truck

The resulting prototype created a strong impression on our team and our development leads, leading to me taking on additional responsibility on top of design - I was tasked to also independently building the production version of this project.

Production systems

The inspections application was being developed from the ground up as an independent Unity application. However, it needed to coexist with the existing VR truck simulator and the codebase that powered it.

The team already had a robust architecture that was intended to not only power the VR truck simulation, but also any form of VR simulation the company might decide to take on in the future. This meant I had the challenge of creating something completely new and unique compared to the usual category of product created at this company, while having it exist in an architecture not quite imagined for this type of application.

To elaborate a little bit: This inspection application was simply not a simulation and the priorities and conventions of simulation did not necessarily apply. In addition to this, as the development went on, I began advocating for this application as a generic inspection tool that could be used to inspect anything that could be expressed as a 3D asset. This created an interesting challenge to consider regarding dependencies in the VR truck simulation.

The system components I created and serialized as valid components of the existing Simulation Application architecture

Camera system

The camera, in this case, should be analogous to the real-world counterpart of a human moving around the truck. In this specific case, we did not want to allow the camera to be in a perspective that a human would not be able to reach in the real world. While it's tempting to leverage the power of our virtual environment to empower the user to see the truck from anywhere they please, we have to remember that ultimately the user needs to go out into the real world and conduct these inspections under the constraints of where their bodies can physically reach. That being said, we can leverage our virtual environment by simplifying how they navigate to the various perspectives required to reach each perspective required to see or interact with a component.

Prototype flaws

In the prototype, the camera was implemented using Unity's Cinemachine camera package. This package gave us a framework we could use to quickly prototype. We used the dolly cart system and created a track for the camera to follow while it slid along the track. Then we used a basic slider in the UI to move it along the path. While this was fine for getting a proof of concept out, it came with quite a few problems and drawbacks.

A rough idea of what the dolly tracks in Cinemachine look like

Firstly and foremost, while Cinemachine is a very capable and feature-packed camera framework, trying to force it to work for our specific needs meant that we would have a compromised experience both authoring the camera ranges and positions, but also in terms of how the user interacted with the camera.

Thinking of the content author

Up until now every time I thought of user problems and the respective solutions to consider, I had only been considering the end-user. However, there's an entirely different user involved in the application that I needed to start thinking about. Through a combination of seeing how unintuitive and complex the authoring of dolly paths in Cinemachine is, but also through seeing the other designers and artists struggle with authoring content for the VR environments, I started to realize I needed to also design solutions to address the content authors problems.

I opted to label this role "Content author" or even just "Author" as even though this role gets often delegated to designers, in reality, it could be anyone doing such work depending on the team and environment.

The solution

So an ideal solution is going to be one that can address these criteria:

The end-user should be able to see/access no more or no less than they would in a real inspection
The camera should have intuitive touch-based control that caters to their experience with mainstream touch devices
The author should be able to define camera rules and relationships in a systemic way
The author should only have to define camera rules and relationships once per asset (in this case truck)

Creating a custom solution meant that we could break the dependency on Cinemachine and create a more lightweight, purpose-built system.

The camera system would consist of "camera zones". Each zone has a type of camera defined and unique parameters that correspond to the area that the camera zone belongs to. To begin with, we had two types of cameras: a line segment camera and a point camera.

Point camera

The point camera is essentially a fairly standard first-person, but able to be dollied via pinch gestures. The author would define the position of the camera, its starting orientation, and its range of motion in degrees.

Not the most exciting type of camera zone, but an essential.

The point camera being used in the interior

Line segment camera

The line segment camera was conceived out of the challenges of maneuvering around the elongated shape of the full truck, without relying on a free-form camera that the user has to grasp how to maneuver.

The prototype revealed that when the camera on the dolly path was focused on the center of the truck, it would be impossible to get a clear view of the sides of the truck at the very front and back. This could be temporarily addressed by having a second track that went along the center of the truck for the dolly to track as it went along its path on the outside of the truck. However, to achieve this outcome one would have to manually author a perfect symmetrical dolly path around the, manually author a target track along the center line of the truck, and manually author a whole bunch of Cinemachine settings and variables.

While we could have created a layer of tooling on top that automated the authoring of all these pieces, it went in the opposite direction of the reduced dependency and complexity style of architecture the company had. This was around the time our programming lead directed me to a presentation called Simple Made Easy by Rich Hickey to give me insight into why and how to avoid unnecessary complexity.

‍

There is no position on this dolly track where the user can get a practical view of the wheel or door

Compare the angles at the corner of the truck by the door above in the prototype versus below in the line segment camera.

Much easier to get a practical view of the door and wheel

The line segment camera would require the author to define the following:

Start and end point for a line segment
Radius value that corresponds to the general size of the target object
Minimum and maximum distance for how far the camera can be dollied towards and away from the object
Starting distance and position

Now rather than having the author wrestle with tracks and settings in Cinemachine, they can rapidly define new cameras using the few required settings. Below you can see how quickly the camera zone intended for the exterior of the truck can be converted into a camera zone intended for the engine bay.

How an author could rapidly create a camera meant for the engine bay

In addition to the above benefits of simplicity and rapid authoring, this system is completely agnostic to the nature of the object it is being used to inspect. This generic nature meant that in the future it could have theoretically been deployed to inspect assets that are not necessarily trucks.

Imagining how an airplane could have camera zones set up for Inspections using this type of camera

The camera zone-defining data would then be stored in the prefab for each type of truck. Depending on the inspection scenario and what it entailed (tutorial/practice/exam) the author could further define which zones are available in what pairings with the UI on a per Unity scene basis (this will be elaborated on in the next section). The camera's zones defined by the author can then be paired with the UI system to trigger transitions and the availability of corresponding clickable objects in the UI zone.

Future plans

There were quite a few plans for expanding this system as we progressed through iterations of the Truck Simulator as a whole. There were plans for implementing constraint systems so the camera could have more robust constraint definitions such as defining a floor and ceiling that the camera could not travel through regardless of the camera zone style.

This system was always designed as a framework to allow the company to scale and introduce more camera zone styles as needed by future problems (for example an orbital camera zone or a plane-based camera zone).

UI System

The UI is implemented using Unity's new UI Toolkit. The development of this project was right around the time Unity was starting to roll out the new UI Toolkit release build versions. As someone who has spent a fair share of time dealing in box-model oriented UI implementations through mobile and web, I always found Unity's original GUI implementation to be less than ideal to work in for box-model style UI's. Unity was developing the new UI Toolkit to respond to this type of problem. After doing my share of research and experimenting, I decided to advocate for a from-scratch UI implementation using the new UI Toolkit over licensing one of the many existing UI packages that were implemented using Unity's prior GUI system.

uGUI vs UI Toolkit

uGUI (Existing “Unity UI”)

Driven by game objects
Crowded hierarchy
Layout driven by coordinates and anchors
Requires rasterized art assets
Hard to visually edit without an artist's intervention
Lots of optimization required
Heavy duty and more flexible in advanced rendering
Comparable to basic 2D game development in Unity:  Gameobject + Assortment of components + Materials + C#

UI Toolkit

Driven by documents (UXML, USS)
Clean hierarchy
Layout driven by flexbox
Rasterized art assets are optional depending on the art style
Easy to visually edit without artist intervention
Little to no optimization is required
Lightweight, but limited in advanced rendering
Comparable to basic web front-end stack:  HTML + CSS + JS  ~= UXML + USS + C#

Just like how the raw web stack (HTML/CSS/JS) requires some form of framework to build systemic UIs, Unity's UI Toolkit requires the same. Where applicable, certain parts of the UI will still be hard-coded. Deciding where to apply which solution was a matter of a balance between building dynamic and robust UI vs. not over-complicating things that didn't need to be.

Overlays

The overlays are one of the most dynamic components present in the UI framework. If we look at the intended use of the overlays, they are meant to be containers that hold an undetermined amount of UI components that are meant to display information and/or provide interaction points. There will be many instances of these overlays present as they correspond to the truck's physical interactable points. For reference, there were estimated to be over 200 different points between the truck tractor and trailer that would need to be interacted with.

While there are over 200 potential UI overlays, there will be many overlaps in the information present in each one.

Authoring UI data in scriptable object form means that the data can exist in a source of truth without reliance on Unity Scenes or prefabs and their variants.

If we think of the tires on a truck, just the tractor by itself can have 10 tires (keep in mind the rear wheels are often in pairs of two on each side). So in that scenario, we have 10 different overlays providing the same information text, all the same interaction methods, but possibly different names (front driver-side tire, front passenger-side tire, etc.). This also means the author needs to define these tires 10 different times.

How do we solve this in terms of providing a robust system that spares the author the repetition of reentering identical data, avoiding creating redundant duplicate data while respecting that each overlay is a unique instance with its unique title and state.

A loose representation of the relationship between the data and components that drive the relationship between the truck assets, the UI, and the Camera

Defining zone combinations in the Unity editor

A look at the catalog of the Themed Elements that can be plugged into overlays

Utilizing the Button type Themed Elements in the multiple choice subsystem

An overview of some of the overlays one might have defined for a scenario

An overview of the UI components that the overlays would map to via UI Collider Keys

Sidebar

Responsive layout

Unity's UI Toolkit creates an interesting shift in how UI assets are created and stored. Whereas previously UI would almost exclusively be art assets consisting of bitmaps, now significant portions of the UI assets are represented as text (UXML/USS). Since I wanted our new UI Toolkit assets to be independent and generic, I wanted to avoid having direct references to icons (bitmaps) in actual markup language (UXML) and style sheets (USS). Creating that type of static relationship would create a dependency that would work against the desired generic nature of the assets. For that reason, any time a UI icon is used in the UI, it is serialized as a texture variable in the components setting in the Unity editor. This means the icons are dynamically placed into the UI at runtime and can be authored on per-project or even per-scene basis without altering the source files.

An overview of the UXML/USS structure driving the sidebar

Unity's UI Toolkit is still quite new. At the time I created this UI (and perhaps even at the time of writing this article), there were no clear definitions, examples, or established paradigms for using the new tools. Used my previous experience from web and mobile to think of this as an opportunity to create a generic and reusable library of style classes that would drive the layout, typography, and visual styling.

Demonstrating how the UI would respond to a width change. The buttons generated at runtime (not pictured here) would also resize their width in response.

The sidebar width was driven by one point value. In addition to the UI DPI, we would be able to adjust the width of the sidebar in mere moments as well. At runtime initialization, we could also look at the width of the sidebar and use that to drive the animation that hides the sidebar.

Demonstrating how the UI would respond to a change in screen resolution.

Putting together everything above we have:

Authoring the UI in an 8 pt grid, effectively making it DPI adjustment ready
Fully responsive layout and components
Final width (and associated animations) driven by a single variable that can be freely adjusted

These implementation decisions mean that the UI can easily be adjusted to fit the display size and resolution once the specs of the hardware are decided.

Touch Input System

As I covered in Part 1, the idea for the touch-based interactions would be that it would stick exclusively to traditional phone-style inputs. While this might initially seem simple, there are quite a few details to consider when dealing with processing touch inputs in the presence of gestures like dragging, pinching, and long-pressing. It's easy to take this for granted in the context of web and mobile since all this input post-processing occurs at the OS or front-end framework level. However, Unity does not have this inherent groundwork and merely takes in the raw input for you to figure out how to process.

An additional challenge came from the fact that the UI and the interactable 3D elements were present in the same screen space. This meant that the system needed to also be conscious of when the user was interacting with 3D space or UI space.

I don't have any exciting visual material to share in this area. Essentially I used the Unity New Input System to define touch-based actions that would then feed into my custom input manager where the input manager would assess for:

Is the input on UI (sidebar and overlays)
Is the input a tap, a hold, or a gesture

Then it would proceed to pass the right type of data to the right system (UI or Camera).

Finishing Thoughts

I hope you enjoyed reading through my journey building this product.

This project is very near and dear to my heart. I met some amazing people in this experience who both trusted me and enabled me to apply myself and build something I could be proud of.

Sadly the entire dev team working on this Truck Simulator was laid off alongside myself and it might not ever see the light of day.

This entry in my portfolio is the closest I will come to sharing this project with the world. Thank you for taking the time to give it a read. 🤗

Commercial Vehicle Simulator - Part 2