Augmented, virtual, and mixed reality (AR/VR/MR) experiences are growing in popularity, as they let users navigate multisensory 3D media. The interest in capturing the real world in multiple dimensions and presenting it to the user has never been higher. However, such technology requires enormous amounts of data, so efficient compression and signal processing are essential.
How does AR/VR/MR work?
To create a 3D scene, volumetric visual data is used to describe the geometry of the scene and the objects it contains, along with attributes such as color, opacity, etc. Temporal information is described by individual capture instances (think frames in 2D video) or by other means (e.g., the position of an object as a function of time).
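As a concrete illustration, here is a minimal Python sketch of how one such capture instance might be held in memory. The class name, fields, and data types are assumptions for illustration only, not part of any standard:

```python
import numpy as np

# Hypothetical sketch: one capture instance ("frame") of a volumetric scene.
# Geometry is a set of 3D positions; attributes (color, opacity) are stored
# per point; the timestamp orders instances in time, like frames in 2D video.
class VolumetricFrame:
    def __init__(self, positions, colors, opacity, timestamp):
        self.positions = np.asarray(positions, dtype=np.float32)  # (N, 3): x, y, z
        self.colors = np.asarray(colors, dtype=np.uint8)          # (N, 3): RGB
        self.opacity = np.asarray(opacity, dtype=np.float32)      # (N,): in [0, 1]
        self.timestamp = timestamp                                # seconds

# A temporal sequence is then simply an ordered list of such frames.
frame = VolumetricFrame(
    positions=[[0.0, 0.0, 0.0], [1.0, 0.5, 2.0]],
    colors=[[255, 0, 0], [0, 255, 0]],
    opacity=[1.0, 0.8],
    timestamp=0.0,
)
```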
What is Point Cloud Compression?
Volumetric visual data is either computer-generated or captured from the real world. The point cloud is a common representation format for this data; the other is the polygonal mesh. Point Cloud Compression (PCC) is thus the compression of volumetric visual data represented as point clouds.
A point cloud is a set of individual 3D points. Each point has a 3D position and may also carry other attributes such as color, surface normal, etc. Point clouds are more flexible than polygonal meshes for representing non-manifold geometry and can be processed in real time.
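To give a feel for that real-time processing, below is a small sketch of voxel-grid downsampling, a routine point cloud operation. The function name and parameters are hypothetical; only NumPy is assumed:

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Toy sketch: quantize points to a voxel grid and keep one point per
    occupied voxel -- the kind of lightweight operation that makes point
    clouds amenable to real-time processing."""
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    # np.unique over rows returns the first point index in each occupied voxel.
    _, keep = np.unique(voxel_idx, axis=0, return_index=True)
    return points[np.sort(keep)]

points = np.random.rand(100_000, 3).astype(np.float32)  # synthetic cloud
sparse = voxel_downsample(points, voxel_size=0.05)
print(points.shape, "->", sparse.shape)
```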
3D point cloud data applies to many fields, such as cultural heritage, immersive video, navigation, etc. Due to this wide range of applications, the MPEG PCC standardization activity defined three categories of point cloud test data: static (many details, millions to billions of points, colors), dynamic (fewer point locations, with temporal information), and dynamically acquired (millions to billions of points, with color, surface normal, and reflectance attributes).
The data targeted by the standard we are currently developing is a point cloud comprising a list of 3D point coordinates (x, y, z), along with reflectance and RGB attributes associated with each point.
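A per-point record of this kind can be sketched as a NumPy structured array. The field names and bit widths below are assumptions for illustration, not the standard's actual layout:

```python
import numpy as np

# Hypothetical per-point record matching the description above: a 3D
# coordinate (x, y, z) plus reflectance and RGB attributes.
point_dtype = np.dtype([
    ("x", np.float32), ("y", np.float32), ("z", np.float32),
    ("reflectance", np.uint16),   # sensor return intensity (assumed 16-bit)
    ("r", np.uint8), ("g", np.uint8), ("b", np.uint8),
])

cloud = np.zeros(3, dtype=point_dtype)
cloud["x"], cloud["y"], cloud["z"] = [0.0, 1.5, -2.0], [0.5, 0.5, 3.0], [1.0, 0.0, 0.2]
cloud["reflectance"] = [210, 96, 143]
cloud["r"], cloud["g"], cloud["b"] = [255, 0, 0], [0, 255, 0], [0, 0, 255]

print(cloud[0])  # (0.0, 0.5, 1.0, 210, 255, 0, 0)
```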
What is MPEG?
MPEG, short for Moving Picture Experts Group, is one of the main standardization groups dealing with multimedia. For point clouds, its goal is to build an open standard that represents 3D point clouds compactly.
Short story
In 2014, the MPEG 3D Graphics Coding group (3DG) started to study how to adapt its tools to advanced immersive applications. However, the standards developed by 3DG were designed for computer-animated content, which usually involves sparse geometry and a limited amount of noise, whereas point clouds captured in real time are dense and their noise cannot be ignored. These standards were therefore not suited to the task, hence the need for new ones.
A call for proposals (CfP), developed in close cooperation with stakeholders (major mobile device manufacturers, leading startups, etc.), was published in January 2017.
Based on the responses, three different technologies were chosen as test models for the three targeted categories:
- LiDAR point cloud compression (L-PCC) for dynamically acquired data
- Surface point cloud compression (S-PCC) for static point cloud data
- Video-based point cloud compression (V-PCC) for dynamic content
The final standard is to be published in early 2020 and will consist of two classes of solutions:
- Video-based, equivalent to V-PCC, appropriate for point sets with a relatively uniform distribution of points (a toy sketch of the underlying projection idea follows this list).
- Geometry-based (G-PCC), equivalent to the combination of L-PCC and S-PCC, appropriate for more sparse distributions.
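To make the video-based idea more concrete, here is a toy Python sketch of projecting 3D points onto a 2D depth image, the kind of image an ordinary 2D video codec could then compress. Real V-PCC segments the cloud into patches and packs them into an atlas; the function below is a hypothetical illustration of only the core intuition:

```python
import numpy as np

def project_to_depth_map(points, resolution=64):
    """Toy illustration of the idea behind video-based PCC: flatten 3D
    points onto a 2D grid, keeping the nearest depth per pixel, so that
    existing 2D video compression tools can be reused."""
    pts = np.asarray(points, dtype=np.float32)
    mins, maxs = pts.min(axis=0), pts.max(axis=0)
    scale = (resolution - 1) / np.maximum(maxs - mins, 1e-9)
    grid = np.floor((pts - mins) * scale).astype(int)
    depth = np.full((resolution, resolution), -1, dtype=np.int32)
    for u, v, d in grid:  # keep the nearest depth in each pixel
        if depth[v, u] == -1 or d < depth[v, u]:
            depth[v, u] = d
    return depth

cloud = np.random.rand(10_000, 3)  # synthetic, relatively uniform cloud
depth_map = project_to_depth_map(cloud)
print(depth_map.shape)  # (64, 64), ready for 2D compression tools
```

This also hints at why the video-based class suits uniformly distributed point sets: a dense, even cloud fills the 2D image with few gaps, whereas a sparse cloud would leave it mostly empty, which is where geometry-based coding fits better.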