How do you move your video?
Posted on Jul 15, 2024 by FEED Staff
Almost all media viewed in 2024 is encoded using a small handful of technologies. But the landscape is shifting, with new innovations re-shaping the way video gets from point A to point B
Words by Phil Rhodes
Video codecs have now existed for long enough that they’re starting to establish traditions. It’s an odd thought for a technology that barely existed during the early careers of now-senior people, but it’s true. The majority of videos watched in 2024 are encoded using one of a few key technologies. Recently, though, new ideas have provoked changes in the way we get video from place to place, changing how we think about bandwidth, portability and the very fundamentals of electronic devices.
Lessons in LCEVC
“Let’s not forget that traditional codecs were designed for SD video,” points out Guido Meardi. As one of the founders of V-Nova, Meardi’s experience goes back to the nineties – a time when some of those traditions were yet to be established. “Most codecs are good at compacting low-resolution signals,” he adds, “but not so much the fine details. We did something original. If you want, it’s a typical story where you have an idea which has already existed for decades. But, as often happens in life, it was too early. At the time, it couldn’t work.”
The intervening decades have seen CPUs become a little sturdier – but Meardi’s work currently centres on a codec which carefully balances performance and complexity. V-Nova calls it low-complexity enhancement video coding (LCEVC), and it is designed to work as a layer on top of conventional codecs such as H.264 or H.265.
It seems likely that early deployments will be in television broadcasting. It’s a field which likes to create the option for more capable receivers to decode improved pictures, but in a broader sense, LCEVC is talismanic of a world that perpetually wants to fit bigger pictures down smaller pipes.
Perhaps most importantly, it wants to do that without exhausting either the computational or battery resources of anyone’s cellphone, so V-Nova’s engineers took care to build their enhancement layer around modern, multi-core computer hardware. “Legacy compression had this idea of dividing the image into blocks,” Meardi notes. “If you zigzag across an image from top left to bottom right, you’re forcing sequential operations into the process.” LCEVC, however, is built to do several things at once. “I have something lean, mean and very efficient at compressing dots and lines because that’s what it is.”
Viewed alone, the dots and lines of the LCEVC enhancement layer resemble an edge-detection filter. Traditional techniques – which work on image blocks around eight pixels square and often larger – don’t compress that type of data well. Meardi emphasises the importance of understanding the nature of the data. “Once you understand what you are compressing, it’s easy to compress with very small transforms. We used some sophisticated neural networks to understand the limits, discovering you can indeed be 10% more efficient than LCEVC. But at that point, do you care if it costs you five times the processing power?”
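The layered idea behind an enhancement codec can be sketched in a few lines: a low-resolution base picture that any existing codec can carry, plus a sparse residual holding the dots and lines of fine detail. This toy Python sketch is purely illustrative – it uses naive nearest-neighbour upsampling and omits the transforms and entropy coding of the real LCEVC standard (ISO/IEC 23094-2) entirely:

```python
import numpy as np

def encode_layers(frame, base_scale=2):
    """Split a frame into a low-res base layer and a residual detail
    layer. Illustrative only - no transforms or entropy coding."""
    # Base layer: downsampled picture, to be fed to any existing codec
    base = frame[::base_scale, ::base_scale]
    # Predicted full-res picture: naive nearest-neighbour upsample
    predicted = np.repeat(np.repeat(base, base_scale, axis=0),
                          base_scale, axis=1)
    # Enhancement layer: the sparse residual (the "dots and lines")
    residual = frame.astype(np.int16) - predicted.astype(np.int16)
    return base, residual

def decode_layers(base, residual, base_scale=2):
    """Rebuild the full-resolution frame: upsample base, add residual."""
    predicted = np.repeat(np.repeat(base, base_scale, axis=0),
                          base_scale, axis=1)
    return (predicted.astype(np.int16) + residual).clip(0, 255).astype(np.uint8)

frame = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
base, residual = encode_layers(frame)
restored = decode_layers(base, residual)
assert np.array_equal(restored, frame)  # lossless round trip in this toy
```

Note that each frame's residual can be computed and applied independently of block scan order, which is what makes the real scheme friendly to multi-core hardware.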
That question seems to be the core concern of codec engineers – and in some ways, always has been. Concern over computing power has provoked the adoption and retirement of whole techniques – for example, the vector quantisation used in designs like Cinepak in the early nineties. Despite the light weight of the codec’s mathematics (and the power of modern hardware), rolling out LCEVC as a format for broadcasting to consumers meant carefully analysing what hardware the average set-top box has to offer.
Meardi describes most of them as ‘designed to do H.264, H.265, VVC, but it’s possible to leverage hardware blocks of very low-power systems-on-a-chip (SoC) which haven’t yet implemented LCEVC in silicon’. “All video SoCs come with scalers, overlay hardware and hardware blocks which can be repurposed to implement LCEVC,” he continues. “We always start from mapping every hardware block they have, including the ones they forgot about. Then, we worked with Nvidia and presented a showcase together.”
Attention to that level of detail, Meardi concludes, is sometimes new to people doing advanced theoretical work on video compression. “Many people, including companies such as Huawei and others, agree. Several scientists working on codecs weren’t considering that compression needs to happen in real time. You can design something which reaches the Shannon optimum [a mathematical performance limit], but it doesn’t matter. The principle we had from the very beginning in our company is: if something can’t be done in real time, it’s out.”
Maintaining latency
Away from the world of distribution, different fields clearly have varying expectations. Ciro Noronha is chief technical officer at Cobalt Digital, a company deeply embedded in the infrastructure on which broadcast production relies.
The world of TV contribution is essentially the last bastion of entirely uncompressed video. “You have a continuum of solutions which start at high compression, middle compression and then uncompressed,” Noronha begins. “If you have plenty of bandwidth and also want the pure, perfect and pristine image, you pay for that with a very high bit rate. It’s 1.5Gbps HD – although you have almost no latency.”
Latency is a key concern in live production, where the timing of cuts between shots relies on real-time decisions rather than a pre-edited timeline. Historically, codecs have exploited similarities between frames, implying a delay of at least a few frames – making that sort of snappiness tricky. Recent developments, Noronha says, have found ways to have your cake and eat it: “There’s mezzanine compression like JPEG XS, which offers almost the same latency as baseband uncompressed, but will compress it by a factor of between four and ten. While there is a little degradation in quality, the latency remains outstanding.”
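As a back-of-envelope check, those compression factors translate into concrete link budgets against the roughly 1.5Gbps of uncompressed HD that Noronha cites. The figures below are illustrative arithmetic only:

```python
# Rough bandwidth arithmetic for mezzanine compression (illustrative).
# Uncompressed HD-SDI carries roughly 1.5 Gbps, per Noronha's figure.
uncompressed_gbps = 1.5

for ratio in (4, 10):
    mezzanine_mbps = uncompressed_gbps * 1000 / ratio
    print(f"{ratio}:1 mezzanine compression -> ~{mezzanine_mbps:.0f} Mbps")
# 4:1 needs ~375 Mbps and 10:1 ~150 Mbps: still far above distribution
# codecs, but comfortably inside an ordinary 1GbE network link.
```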
This style of approach, Noronha estimates, is likely to become more common both in the high end of broadcast production and even as overall progress in information technology makes uncompressed workflows easier to handle. “I have a gigabit at home, symmetrical, which was unthinkable a few years ago. However, I doubt everything is eventually going to be uncompressed SMPTE 2110.”
A reason for this counterintuitive reality is that compression makes for some significant conveniences, as with remote production. That has sometimes meant replacing an expensive satellite link with the public internet, although it increasingly means moving the director, vision mixer and the rest of the gallery to a neighbouring time zone.
Delay, again, is the bugbear, as Noronha happily acknowledges. “People have been avoiding codecs for remote production because of latency, but you can get the same class of latency with JPEG XS or HEVC, at a much lower bit rate. So, it became a viable solution.”
Finding these solutions to latency requires painstaking engineering. Contrary to popular preconception, Noronha remarks: “You can get sub-frame latency, glass-to-glass in less than a frame. Cobalt has demonstrated the solution and will be shipping it in a few months, so it’s possible. You take a hit on bit rate; for an HD signal, you need around 30Mbps, but it’s not 1.5Gb or 600Mb, it’s 30.”
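Put another way, the 30Mbps sub-frame figure still represents a substantial squeeze on the uncompressed signal. A quick illustrative calculation:

```python
# Noronha's sub-frame latency figure against uncompressed HD (illustrative).
uncompressed_mbps = 1500   # ~1.5 Gbps for uncompressed HD-SDI
subframe_mbps = 30         # the sub-frame-latency HD figure quoted
ratio = uncompressed_mbps / subframe_mbps
print(f"Sub-frame HD at 30Mbps is roughly a {ratio:.0f}:1 compression ratio")
```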
Making that happen requires some under-the-hood knowledge of how codecs do their magic. Many use the similarities between nearby frames to improve performance. Bidirectional frames – or B-frames – can refer to frames from the past or the future, but that usually means a return to that multi-frame delay. P-frames are predictive only. “For HEVC, you just don’t do B-frames,” Noronha explains. “You don’t look in the future; it’s perfectly fine to look in the past. There’s a technique from MPEG-2 called gradual decoder refresh. You can send [data] without waiting for the whole picture.”
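The two tricks Noronha describes can be sketched as a simple coding plan: drop B-frames so nothing ever waits on a future frame, and replace the periodic full I-frame with a rolling intra-coded stripe. This is illustrative pseudologic, not a real encoder API, and the refresh period is an assumed value:

```python
# Sketch of two low-latency encoding tricks (illustrative only):
# 1) no B-frames, so no frame ever references the future;
# 2) gradual decoder refresh: intra-code a rolling stripe of each
#    frame, so the whole picture refreshes over REFRESH_PERIOD frames
#    without a single large I-frame causing a bit-rate spike.

REFRESH_PERIOD = 4  # assumed: full refresh every 4 frames

def frame_plan(frame_index, num_stripes=REFRESH_PERIOD):
    """Coding decision for one frame: which stripe is intra-coded,
    with everything else predicted from past frames only."""
    return {
        "frame_type": "P",                          # predictive only
        "intra_stripe": frame_index % num_stripes,  # stripe refreshed now
        "references": "past frames only",           # never the future
    }

plan = [frame_plan(i) for i in range(8)]
# Over any REFRESH_PERIOD consecutive frames, every stripe is refreshed
# once, so a decoder can join mid-stream without waiting for an I-frame.
assert {p["intra_stripe"] for p in plan[:REFRESH_PERIOD]} == {0, 1, 2, 3}
```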
Despite all this cleverness – on top of the massive gains made since the nineties deployment of MPEG-2 – Noronha sees the writing on the wall for big gains in codec performance. “We are reaching a place of diminishing returns. The next-generation codecs really shine if the resolution is high. If you are using smaller resolutions, the gains are not that great, maybe 20-50% – the same quality at half the bit rate. The improvement from MPEG-2 to H.264 to H.265 is not going to happen any more. There is a fundamental limit.”
Thinking small
Either way, a codec might be designed for production work, distribution to consumers or all manner of things. One unusual approach is to implement image compression on a sensor itself. Jean-Baptiste Lorent is director of marketing and sales at Intopix, the company behind Tico, a product that’s quite literally named after the idea of being a tiny codec using minimal resources.
“The company was created in 2006 as a spinoff from UCLouvain,” Lorent explains. “We came from the labs, which have strong expertise in image processing. We have 15 years of microelectronics optimisation for field-programmable gate arrays, as well as a team doing optimisation for GPU and CPU platforms. You need the right people coming together.”
Despite the company’s academic beginnings, the team took care to keep the practicalities of electronics in mind. Lorent continues: “We learnt a lot from JPEG 2000, which was more a codec created by the research community and less by people connected to the industry, the markets and similar areas.”
Intopix began developing Tico just a few years later, in the early 2010s. It quickly became clear there was room in the world for a very lightweight codec, and the design was rapidly adopted as a SMPTE standard. “Intopix has always been an active contributor on the JPEG committee,” Lorent points out. “You have two main groups – JPEG and MPEG – and at JPEG, we discussed it with the chairman at that time. He questioned, ‘Why is Intopix not launching a standard for lightweight technologies?’” The result was JPEG XS, which Lorent describes as ‘an evolution of Tico’.
The benefits of Tico’s tininess are not only in power consumption but also latency, as Lorent describes it. “As an example, with Tico Raw, if I have a quad-core CPU then I can encode or decode 8K at 60fps, at 10:1, 12:1, the sort of compression ratio you find with ProRes or other codecs,” he states.
Meanwhile, the advantage of the codec’s computational simplicity is that it can be implemented at the very start of an imaging chain, with the potential for improved efficiency in every downstream component – as well as other enhancements in unexpected ways.
“Intopix innovates in sensor compression,” Lorent remarks. “We license Tico Raw to Nikon. By integrating Tico Raw compression into a sensor with four parallel interfaces connecting to the processor, we can reduce the number of interfaces from four to just one.” With the lion’s share of power consumption in many sensors created by the need to drive some very fast external data interfaces, power consumption falls and the hardware platform can be simpler and cheaper.
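The interface arithmetic behind that claim is straightforward. Assuming a raw stream that would otherwise need four output lanes, a compression ratio of at least 4:1 lets a single lane carry the load – the numbers here are illustrative, not Nikon's actual figures:

```python
# Back-of-envelope for on-sensor compression (illustrative numbers).
lanes_uncompressed = 4    # assumed: raw stream needs four lanes
compression_ratio = 4     # assumed 4:1, in line with ratios Lorent cites
lanes_needed = -(-lanes_uncompressed // compression_ratio)  # ceiling division
assert lanes_needed == 1
# Fewer high-speed interfaces to drive means lower power consumption
# and a simpler, cheaper hardware platform - the gain Lorent describes.
```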
Beyond photography, film and TV, new compression ideas have relevance to AI and machine-learning applications. Systems may even be able to work on the compressed data, removing the need to unpack it at all. “Intopix is active and also involved in the automotive market,” Lorent adds. “Clearly, the amount of data is huge if you consider the number of video sensors. We believe more and more processes will occur in the compressed (or partially compressed) domain for machine learning – but the road is still long.”
With many fields still clinging to uncompressed video because of latency and quality, Lorent accepts that ‘people need to change their minds, and the only reason they would change is if they could say nothing against the latency or quality of a codec’. “If you want to replace uncompressed, then complexity and power consumption are key. We start from the people who don’t want to hear about compression. If we can create a codec which is low power, low latency, not even millisecond but microsecond latency and without loss in quality, we’ll be fine.”
This feature was first published in the Summer 2024 issue of FEED.