The realm of live-action 4D (3D + time) encompasses a vast swath of digitization potential, much of which is already being realized today: from simultaneous localization and mapping systems for autonomous robots and self-driving cars to stadium-sized systems for digitizing live sporting events.
At Digital Air we're interested in standards for a very small subset of that realm: digitizing live-action human bodies and faces in motion. This is because the human body and face are the two most important assets in storytelling. Initially, we're focussed on an even smaller subset of that: open format standards for the organization and description of live-action 4D human body and face data for multi-vendor workflows. We're interested in workflow standards because we operate at the start point of those workflows: image acquisition.
Multi-vendor workflows are important because specialization drives progress. 4D workflows are a series of specialized steps including photography, photogrammetry, pose estimation, autorigging, retopology, reanimation, compression, lighting, and rendering (to name a few), each of which can be broken down further into the evolving technologies upon which they are based: camera synchronization, GPU acceleration, machine learning, temporal compression, real-time rendering, etc.
Current solutions for digitizing photographically derived live-action 4D human content fall into two principal categories: high-resolution discrete sequential 3D models (useful for visual effects production where there is no limit on bandwidth) and low-resolution temporally compressed streamable assets (useful for interactive devices where bandwidth is an issue). At each step of the workflows for both, best-practice processes evolve out of close collaboration both within and across disciplines. To facilitate multi-vendor workflows, open standards are needed at each step so that vendors can understand the data they receive and describe the data they generate, including format, resolution, compression, decompression, scale, origin, and camera parameters. Such basic workflow standards are needed for collaboration and competition to flourish and for multi-vendor workflows and solutions to answer the needs of clients today and in the future.
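To make the idea of describing handed-off data more concrete, here is a minimal sketch of what a sidecar description might look like. It is purely illustrative: the class names, field names, and example values are my own assumptions for discussion and are not part of any existing or proposed standard.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class CameraParameters:
    """Per-camera calibration. All field names are hypothetical."""
    camera_id: str
    focal_length_mm: float
    position_m: tuple    # (x, y, z) relative to the stated origin
    rotation_deg: tuple  # (rx, ry, rz) Euler angles


@dataclass
class HandoffManifest:
    """Hypothetical sidecar describing one 4D delivery between vendors."""
    data_format: str      # e.g. "OBJ sequence"
    resolution: str       # e.g. "4096x4096 textures"
    compression: str      # e.g. "none"
    frame_rate_fps: float
    scale_units: str      # e.g. "meters"
    origin: str           # e.g. "stage floor center"
    cameras: list = field(default_factory=list)

    def missing_fields(self):
        """Return the names of fields left empty, in declaration order."""
        return [
            name
            for name, value in asdict(self).items()
            if value in ("", None, [])
        ]

    def to_json(self):
        """Serialize the manifest so it can travel alongside the data."""
        return json.dumps(asdict(self), indent=2)


# A delivery that forgot to state its origin and camera parameters.
incomplete = HandoffManifest(
    data_format="OBJ sequence",
    resolution="4096x4096 textures",
    compression="none",
    frame_rate_fps=30.0,
    scale_units="meters",
    origin="",
)
print(incomplete.missing_fields())

# The same delivery with the gaps filled in.
complete = HandoffManifest(
    data_format="OBJ sequence",
    resolution="4096x4096 textures",
    compression="none",
    frame_rate_fps=30.0,
    scale_units="meters",
    origin="stage floor center",
    cameras=[CameraParameters("cam01", 35.0, (0.0, 1.5, 3.0), (0.0, 180.0, 0.0))],
)
print(complete.to_json())
```

The point of the sketch is only that a receiving vendor could check a delivery for missing descriptions before starting work. Which fields belong in such a record, and how they are encoded, is exactly what a workflow standard would need to settle.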
"Stage One" (workflows)
Although 4D data is denser and more complex than traditional 2D motion pictures, simple open standards that facilitate multi-vendor workflows should be relatively easy to define. Such standards, whether official or unofficial, can leverage existing formats and extensions of existing formats. Defining them will in turn fuel opportunities for growth and collaboration across the specialized processes of 4D content creation, editing, and playback.
"Stage Two" (formats)
After "Stage One" open standards for workflows are established, "Stage Two" open standards will also be needed for new formats for temporally compressed streamable, and eventually playable (rigged and reanimatable), 4D assets. Animated textures derived from photogrammetry are expensive in terms of bandwidth. Multiple competitive solutions for temporal compression of both geometries and textures are needed so that natural market selection can promote the best solutions to their targeted platforms and use cases. Vendors developing such solutions need content producers to deliver professionally and predictably shot and documented data for ingest into downstream compression and autorigging processes. Additionally, because so much research and software has already been developed for retopology, compression, and autorigging, "Stage Two" format standards will need to let industry participants bring their innovations to multi-vendor workflows while preserving their opportunity to be paid for their software. This will require an open community whose members respect, communicate with, and engage with one another. Getting to that point sooner rather than later is why I see "Stage One" open workflow standards, and the relationships and collaboration across the industry that can result from them, as the first step towards the more complex "Stage Two" open format standards needed for cross-platform, resolution-independent 4D temporal compression codecs and rigged playable assets.
"Stage Two" open standard format content needs to be as portable across platforms as JPEG images and MPEG-4 videos are today.
Where We Are Now
Significant work on compression and transmission standards for non-playable volumetric video is already well underway at MPEG with MPEG-I Part 5, V3C (Visual Volumetric Video-based Coding). Most of the machine vision and machine learning work required for open standards for rigged, playable 4D assets is also already at an advanced stage of development, particularly in academia. The question is not whether machine vision can bring real-world 4D human performance-based assets into computer graphics syntax, but how quickly and efficiently the industry as a whole can do so in a way that is portable across easy-to-use workflows, platforms, and devices. The answer depends directly on how long it will take to establish open standards and processes for collaboration that enable the entire community to engage in producing solutions.
Open Standards vs. Open Source
Open standards are not the same as open source. It's important that IP that solves a specific part of a process can remain closed source and can be monetized to incentivize and reward its creation. If your software performs a useful function, from compression to retopology to autorigging to editing, you should not have to own a film studio or a tech platform to monetize it. Open standards will expand the number of participants in the industry and the number of customers for your product by defining how proprietary functions connect to the overall workflow. The resulting new and better solutions will make the overall technology more useful and expand the market for your product and everyone else's as well.
4D standards will naturally be an outgrowth of 3D standards, and 3D standards are already well established.
The Value of Collaboration
Computer graphics, like all technology development, is a collection of collaborative processes. Open standards play a critical role in how the disparate elements of those processes fit together. Standards for 4D will be needed before widespread adoption of 4D technology can take place. Until open 4D standards are established we will likely continue to see only the shoots and sprouts of all of the constructive things that will eventually be done routinely with 4D.
Digital Air has been awarded the first phase of an Epic MegaGrant to produce Rights Free 4D Human Datasets for Open Source 4D Research and Standards Development. The resulting datasets are intended to be useful for training machine learning systems for pose estimation and for comparing the results of different downstream workflows and codecs. Datasets allow researchers to design and test new methods and experiment with new technology on real-world data before making the capital investment necessary to record real-world data themselves. This is particularly important for software researchers and developers who have little interest in camera arrays other than the data that they produce. Our aim with the datasets is to facilitate research, collaboration, and conversation between industry participants and to help develop a framework for open standards for data recording and compression that includes provisions for existing and future IP rights in the space.
A discussion of the current status of the MegaGrant and the process of defining its contents and scope follows in the next blog post.
Update August 1, 2021
Based on the positive responses that I've received, I've decided to launch Open4D™ as an independent standards group to meet and explore "Stage One" Workflow standards. I'll send an email out to participants and invitees when the site is live.
The Open4D™ "Stage One" Workflow effort will be limited in scope to open standards for data and metadata transfer between capture studios and ISVs, from talent contracts to transmission encoding. Other standards groups including ISO (MPEG) will define the open graphics formats and transmission encoding standards, including MPEG-I Part 5, V3C (Visual Volumetric Video-based Coding).
My previous updates will be rolled into the Open4D™ background discussion for anyone who missed them.