DEV Community

Art Sh
Unity AR capabilities overview

To start developing for AR, Unity recommends using AR Foundation to build your application for Unity-supported handheld and wearable AR devices.

AR Foundation enables cross-platform work with augmented reality platforms in Unity. This package provides an interface for Unity developers, but does not implement AR functions itself.

To use AR Foundation on a device, you must also download and install the provider plug-in package for each target platform supported by Unity:
For Android: ARCore XR Plug-in
For iOS: ARKit XR Plug-in
For Magic Leap: Magic Leap XR Plug-in
For HoloLens: Windows XR Plug-in


With Apple expected to enter the market with a mixed reality headset, Unity will probably extend its supported platforms quickly to attract a developer audience and broaden the capabilities offered on Unity platforms.

For instructions on how to configure your project using the XR Plug-in Management system, see the Configuring your Unity Project for XR page.

AR platform support

AR Foundation doesn't implement AR functions from scratch, but instead defines a cross-platform API that allows developers to work with functions common to multiple platforms.


AR Foundation capabilities

Below is an overview of AR capabilities and how to get started with them in Unity AR.

AR Foundation supports the following capabilities within the platform:


Raycasting

Commonly used to determine where virtual content will appear: a ray (defined by an origin and direction) is tested for intersection with real-world features detected and/or tracked by the AR device. Unity has built-in functions that allow you to use raycasting in your AR app.
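A minimal sketch of raycasting against detected planes, assuming an `ARRaycastManager` exists in the scene and a content prefab is assigned in the Inspector (the class and field names here are illustrative):

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class PlaceOnPlane : MonoBehaviour
{
    [SerializeField] ARRaycastManager raycastManager;
    [SerializeField] GameObject contentPrefab;

    static readonly List<ARRaycastHit> hits = new List<ARRaycastHit>();

    void Update()
    {
        if (Input.touchCount == 0) return;

        Vector2 touchPosition = Input.GetTouch(0).position;

        // Cast from the screen touch point into the world, hitting
        // only the interior of detected planes.
        if (raycastManager.Raycast(touchPosition, hits, TrackableType.PlaneWithinPolygon))
        {
            // Hits are sorted by distance; use the closest one.
            Pose hitPose = hits[0].pose;
            Instantiate(contentPrefab, hitPose.position, hitPose.rotation);
        }
    }
}
```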

Plane detection

Detect the size and location of horizontal and vertical surfaces (e.g. coffee table, walls). These surfaces are called “planes”.
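The plane manager raises an event as planes appear, change, or are removed. A minimal sketch, assuming an `ARPlaneManager` in the scene:

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public class PlaneLogger : MonoBehaviour
{
    [SerializeField] ARPlaneManager planeManager;

    void OnEnable()  => planeManager.planesChanged += OnPlanesChanged;
    void OnDisable() => planeManager.planesChanged -= OnPlanesChanged;

    void OnPlanesChanged(ARPlanesChangedEventArgs args)
    {
        // Each added plane reports its orientation (horizontal/vertical)
        // and its current estimated size in meters.
        foreach (ARPlane plane in args.added)
            Debug.Log($"New {plane.alignment} plane, size {plane.size}");
    }
}
```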

Reference points

Track the positions of planes and feature points over time.
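In AR Foundation these tracked positions are exposed as anchors. A sketch of attaching an anchor to a detected plane via `ARAnchorManager` (method names are from the AR Foundation API; the wrapper class is illustrative):

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public class AnchorPlacer : MonoBehaviour
{
    [SerializeField] ARAnchorManager anchorManager;

    // Attach an anchor to a detected plane so that content placed at
    // this pose keeps its position relative to the surface as the
    // device's understanding of the world improves over time.
    public ARAnchor AnchorTo(ARPlane plane, Pose pose)
    {
        return anchorManager.AttachAnchor(plane, pose);
    }
}
```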

Participant tracking

Track the position and orientation of other devices in a shared AR session.


Gesture recognition

Recognize gestures based on human hands as input events.

Point cloud detection

Detect visually distinct features in the captured camera image and use these points to understand where the device is relative to the world around it.

Face tracking

Access face landmarks, a mesh representation of detected faces, and blend shape information, which can feed into a facial animation rig. The Face Manager configures devices for face tracking and creates GameObjects for each detected face.
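A sketch of reacting to detected faces through the `ARFaceManager`'s change event (the logging class is illustrative):

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public class FaceLogger : MonoBehaviour
{
    [SerializeField] ARFaceManager faceManager;

    void OnEnable()  => faceManager.facesChanged += OnFacesChanged;
    void OnDisable() => faceManager.facesChanged -= OnFacesChanged;

    void OnFacesChanged(ARFacesChangedEventArgs args)
    {
        // Each ARFace exposes the mesh data for the detected face;
        // the manager also creates a GameObject per face automatically.
        foreach (ARFace face in args.added)
            Debug.Log($"Face detected with {face.vertices.Length} mesh vertices");
    }
}
```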

2D and 3D body tracking

Provides 2D (screen-space) or 3D (world-space) representations of humans recognized in the camera frame. For 2D detection, humans are represented by a hierarchy of seventeen joints with screen-space coordinates. For 3D detection, humans are represented by a hierarchy of ninety-three joints with world-space transforms.

2D image tracking

Detect specific 2D images in the environment. The Tracked Image Manager automatically creates GameObjects that represent all recognized images. You can change an AR application based on the presence of specific images.
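A sketch of reacting to recognized images via the `ARTrackedImageManager` event, assuming a reference image library is already configured on the manager:

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class ImageReactor : MonoBehaviour
{
    [SerializeField] ARTrackedImageManager trackedImageManager;

    void OnEnable()  => trackedImageManager.trackedImagesChanged += OnChanged;
    void OnDisable() => trackedImageManager.trackedImagesChanged -= OnChanged;

    void OnChanged(ARTrackedImagesChangedEventArgs args)
    {
        foreach (ARTrackedImage image in args.added)
            Debug.Log($"Recognized image '{image.referenceImage.name}'");

        // Hide content for images that are no longer actively tracked.
        foreach (ARTrackedImage image in args.updated)
            image.gameObject.SetActive(image.trackingState == TrackingState.Tracking);
    }
}
```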

3D object tracking

Import digital representations of real-world objects into your Unity application and detect them in the environment. The Tracked Object Manager creates GameObjects for each detected physical object to enable applications to change based on the presence of specific real-world objects.

Environment probes

Detect lighting and color information in specific areas of the environment, which helps enable 3D content to blend seamlessly with the surroundings. The Environment Probe Manager uses this information to automatically create cubemaps in Unity.


Meshing

Generate triangle meshes that correspond to the physical space, expanding the ability to interact with representations of the physical environment and/or visually overlay details on it.

Human segmentation

The Human Body Subsystem provides apps with human stencil and depth segmentation images. The stencil segmentation image identifies, for each pixel, whether the pixel contains a person. The depth segmentation image consists of an estimated distance from the device for each pixel that correlates to a recognized human. Using these segmentation images together allows for rendered 3D content to be realistically occluded by real-world humans.
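A sketch of pulling both segmentation images from the `AROcclusionManager` each frame and feeding them to a custom effect material (the shader property names `_HumanStencil` and `_HumanDepth` are illustrative placeholders, not part of the API):

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public class SegmentationTextures : MonoBehaviour
{
    [SerializeField] AROcclusionManager occlusionManager;
    [SerializeField] Material effectMaterial; // a material that samples both textures

    void Update()
    {
        // Per-pixel "is this a person?" mask and per-pixel estimated
        // distance for pixels recognized as human.
        Texture2D stencil = occlusionManager.humanStencilTexture;
        Texture2D depth   = occlusionManager.humanDepthTexture;

        if (stencil != null && depth != null)
        {
            effectMaterial.SetTexture("_HumanStencil", stencil);
            effectMaterial.SetTexture("_HumanDepth", depth);
        }
    }
}
```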


Occlusion

Apply distance (depth) information from the physical world to rendered 3D content, which achieves a realistic blending of physical and virtual objects.
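Occlusion is mostly configuration rather than code: with an `AROcclusionManager` on the AR camera, you request the depth modes you want and the platform supplies what it can. A minimal sketch:

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class OcclusionSetup : MonoBehaviour
{
    [SerializeField] AROcclusionManager occlusionManager;

    void Start()
    {
        // Request the highest-quality depth the device supports; the
        // actual mode falls back when a request is unavailable.
        occlusionManager.requestedEnvironmentDepthMode = EnvironmentDepthMode.Best;
        occlusionManager.requestedHumanDepthMode = HumanSegmentationDepthMode.Best;
    }
}
```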

Code and how to get started

If you are looking to get started with AR on the Unity platform and want to dive into code, here is a great GitHub repo with a detailed overview and some well-structured code samples:

Top comments (3)

Kayla Anderson

I’ve only studied basic development, but I’d love to play around with some AR tech one day!

Kayla Anderson

I have an iPhone, so I’m pretty excited to see what features they come out with for Apple.

Kayla Anderson

With how revolutionary the Pokémon game was I can only imagine how much more revolutionary AR will become in the next few years especially with games and daily tasks