
Four AI tools that will accelerate your Broadcast TV Post Production.

Jun 23, 2020

Link to the original Medium Post

If you work in the TV business, you have surely faced the huge amount of work behind post production. TV shows usually don't have particularly complex pipelines to manage compared to a VFX post production; the amount of VFX in a daily TV show tends toward zero. Unlike the CGI world, however, television has to deal with an enormous quantity of video that needs to be watched, logged, cut, mixed, color corrected and quality checked, among the many other things a production house must do before delivering to the broadcast network, all on a very short deadline. Many of these operations are repetitive, and in this article I'll give you an overview of four tools that absolutely deserve attention in this field and can improve your post-production workflow, above all for unscripted TV shows.
We can all see the huge power of AI*: at the moment we are surrounded by a multitude of apps that we use every day. One of the biggest steps AI made in the last decade is in analyzing images, and since video is a sequence of images… let's see where TV stands.

1. Facial Recognition

Facial recognition is one of the most intriguing tools: you can automatically manage clips based on the faces of the people in the shot. This can really speed up the workflow, especially in a post production built on very large amounts of footage, and/or when there are lots of contestants and you are looking for a specific one, for instance in a close-up. Close-up shots are particularly difficult to log, because the director very rarely has the possibility to dedicate a camera to each contestant. The consequence is that human loggers can't report this kind of shot properly: timecodes are often vague, or the annotation doesn't match what was actually filmed, especially on an unscripted TV show production. The DaVinci Resolve 16 Neural Engine uses state-of-the-art deep neural networks and machine learning to power features such as speed warp motion estimation for retiming, super scale for up-scaling footage, auto color and color matching, and facial recognition. DaVinci Resolve 16 is apparently the first commercial software to introduce facial recognition; it is enabled in the Studio version.
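As a rough sketch of what such a tool does under the hood: a face detector turns each face into a numeric embedding, and clips are grouped by comparing embeddings against a reference. The clip names, the tiny 3-dimensional embeddings and the distance threshold below are illustrative assumptions for readability, not Resolve's actual internals (real systems typically use 128-dimensional vectors):

```python
import math

# Hypothetical per-clip face embeddings, as any face-detection model
# might produce them. 3 dimensions keep the sketch readable.
CLIP_FACES = {
    "cam1_take03.mov": [(0.10, 0.90, 0.30), (0.80, 0.10, 0.20)],
    "cam2_take03.mov": [(0.12, 0.88, 0.31)],
    "cam3_take04.mov": [(0.70, 0.20, 0.90)],
}

def distance(a, b):
    """Euclidean distance between two face embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def clips_with_person(target, threshold=0.1):
    """Return clips containing a face close enough to the target embedding."""
    return sorted(
        name for name, faces in CLIP_FACES.items()
        if any(distance(target, face) < threshold for face in faces)
    )

# Looking for the contestant whose reference embedding we already logged:
print(clips_with_person((0.11, 0.89, 0.30)))
# → ['cam1_take03.mov', 'cam2_take03.mov']
```

The point is that the close-up problem becomes a nearest-neighbor search: no human logger has to remember which camera caught which contestant.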

2. Synthetic media

Synthetic media are contents generated by AI. There are lots of apps that can generate this kind of media, and one in particular is outstanding: Synthesia. They specialize in two core areas, translation and personalization. I'm going to talk about the first one.
With Synthesia you record once, and then you can translate the video into all the languages you want. Of course, the perfect match for this technology is video marketing, but it could also be perfect for a production where the host leads the show apart from the contestants, or very useful in a docufiction. It could also be indicated for news or similar content, where translation is necessary. In the dubbing department, synthetic media is going to have a really strong impact.

3. Speech-to-Text Transcription powered by AI

This is a very handy tool. I find it absolutely useful, and right now the market is offering very good software at a very interesting price per minute. Here you can find an exhaustive article about this topic. Below is a short list of tools that could be a good match for a television production, but really, the market nowadays covers pretty well every business need, from video marketing to television.
Rev's Automated Transcription Software is apparently one of the most popular transcription tools. You can expect around 80% accuracy from its AI-powered transcription.
Google Speech-to-Text uses an API powered by Google's AI technologies. A very cool feature, still in beta, is the ability to analyze up to five speakers recorded at the same time, from multichannel audio or just a downmix, and you can do it in tons of different languages.
Watson Speech-to-Text uses IBM's well-known artificial intelligence. It uses deep-learning AI algorithms to apply knowledge of grammar, language structure, and audio/voice signal composition to create customizable speech recognition for optimal text transcription. For pricing and policy, check the links above.
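These services typically return word-level timestamps in seconds, while an edit log wants SMPTE timecode. A minimal sketch of that glue step, assuming 25 fps (a PAL broadcast rate) and a hypothetical word list shaped like what a speech-to-text API might return:

```python
def to_timecode(seconds, fps=25):
    """Convert a timestamp in seconds to HH:MM:SS:FF timecode."""
    total_frames = round(seconds * fps)
    frames = total_frames % fps
    s = total_frames // fps
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}:{frames:02d}"

# Hypothetical (word, start-time-in-seconds) pairs from a transcription API:
words = [("welcome", 0.0), ("back", 0.48), ("everyone", 3723.2)]
for text, start in words:
    print(f"{to_timecode(start)}  {text}")
# → 00:00:00:00  welcome
#   00:00:00:12  back
#   01:02:03:05  everyone
```

With that, a transcript becomes a searchable log: find the word, and you already have the timecode to punch into the NLE.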

4. Video Classification

Video classification is a machine-learning-based video process: you analyze the video contents, and with the data you get back from the process you're giving a kind of superpower to your content. For example, you can moderate inappropriate content, find brands in videos, track the movement of athletes during a game to identify plays for post-game analysis, label footage, and much more. Video classification is a huge field to play with. Setting up a TV company server with the help of this technology would without any doubt be a winning strategy today.
Amazon offers this service, connected to an AWS account. This kind of service is also offered by Google and IBM, plus lots of small and medium startups.

*Nowadays, forums and blogs use 'AI' to refer to a very vast field. If you have no idea what AI is, please take a look at this article from NVIDIA about the difference between artificial intelligence, machine learning and deep learning.
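In practice these services hand back a list of detections, each with a label, a timestamp and a confidence score, and your job is to filter out the noise. A minimal sketch of that post-processing step, with made-up detections (the tuple shape and the labels are illustrative assumptions, not any one vendor's response format):

```python
# Hypothetical detections as a classification service might return them:
# (timestamp in milliseconds, label, confidence percentage).
detections = [
    (12000, "Soccer Ball", 97.1),
    (12000, "Crowd", 88.4),
    (15500, "Logo", 62.0),
    (21000, "Logo", 91.3),
]

def reliable_hits(detections, label, min_confidence=80.0):
    """Timestamps (ms) where a label was detected above the confidence floor."""
    return [ts for ts, name, conf in detections
            if name == label and conf >= min_confidence]

# Only the 91.3% "Logo" detection survives the 80% floor:
print(reliable_hits(detections, "Logo"))
# → [21000]
```

The confidence floor is the knob that matters: set it too low and your brand-moderation pass drowns in false positives, too high and it misses real hits.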

About Me


My history and professional background is built on continuous crossover work in a range of positions in the TV and film industry, which allowed me to build on my strengths and develop unique professional skills. For the last two decades I have been working for major cinema production houses and broadcast channels: I directed two international docu-fictions, produced one documentary, edited three feature-length movies, and set up and supervised the post production of some of the major international TV formats in Italy.

Ivo Vacca

The slaves of the 21st century: digital illiterates.

Jul 9, 2018

Link to the original Medium Post


There is great confusion about, and a wrong interpretation of, the term 'digital literacy'.

American Heritage dictionary’s definition:
One who can read and write.
Repeat ‘One who can read and write.’
The fact that a person knows how to use Word or post a photograph on Instagram does not prove, let alone certify, any digital literacy. Knowing how to write and read the language of our times (computer language) is a different game. And this is the big problem: almost all of humanity is unable to write and read any kind of computer language on which today's world is based. The work of the robots that run around the clock on economic transactions, information management, online reservations and endless other activities concerning our daily choices remains an inaccessible and incomprehensible world for almost everyone. Almost all of the world's population needs an interpreter for any basic variation of a program, someone who translates our needs into digital form.
Ioannes 1:1
In principio erat Verbum,
et Verbum erat apud Deum,
et Deus erat Verbum.

The automation

What is happening today is exactly what happened in the Middle Ages, when power was in the hands of the literate. The faithful recited prayers during mass without having any idea of what they were repeating. The bigger part of the narrative structure (or, if you prefer, old-school storytelling) was relegated to iconographic sequences illustrating the Passion, Ascension Day, and so on. Let's make a comparison with our digital age: if you think about it, it is not very far from what is happening nowadays. We mostly perform pre-set operations (templates), made (written) by people who know the computer language, and our possibility of interaction is confined to a world defined by a small group of programmers. In other words, they create, invent and interpret your needs, and build user interfaces (UI) so that you can understand and carry out your actions.
We use icons, buttons and images, just like children, and we behave like children: when a piece of software does not have a simple UI, we discard it. On the other hand, we could not do otherwise, for the simple fact that, being digital illiterates, we cannot implement anything at all in the code.
Nothing has changed; the centuries pass, but power remains in the same hands: those who are literate, in this case those who know the new Verb. This also explains the great difficulty of the intellectual world in formalizing the current historical period. For the first time in history (since the existence of the modern academic world), the ownership of the language that defines power has shifted: decentralized in the ether, scattered across the network.

The power of the code

In less than twenty years, world economic power has concentrated in a few companies, all of them with coding at the core of their business. For instance, Amazon in 2018 reached 800 billion dollars, while Apple is preparing to become the first 1,000-billion-dollar company. Italy's GDP is estimated at around 2,100 billion euros. You can draw your own conclusions.
Billions of users post social content every day, generating a massive volume of ads. So massive that it leaves only crumbs to the old fourth and fifth powers, press and broadcast. In all this human effort to participate and appear in the virtual world, very few of us earn money while at it.
Programming languages, more and more advanced and high-level (high-level meaning closer to human language), prove to be much more than just models of automation. Their semantic structure is ever closer to human language, and algorithms are taking the form of a common script.

Aesthetics and code

Recently I witnessed a discussion between two programmers: they talked about classic figures of rhetoric and about more elegant ways to solve a given problem. In a nutshell, they were speaking of 'beauty' and, more generally, of the different aesthetic forms within an algorithm, as if they were talking about a work of art: how a particular set of coding instructions, even if less efficient, could be closer to one's philosophy as an independent programmer than the one dictated by the market. Is there a new kind of romanticism among programmers?
Joking aside, every language follows a precise semantics and syntax, and the more rigid they are, the fewer ambiguous interpretations they allow.
With the implementation of machine learning, artificial intelligence, deep learning (and of course their relative training), “interpretation” enters the computer world, and suddenly we find ourselves interfacing with a stochastic model that increasingly looks like the human one. A little bit like what Google introduced with Google Duplex. The impossibility of distinguishing a machine from a human being (circumscribed in a specific topic) is today a reality and all this reveals exciting but equally unpredictable scenarios.
Of course, we are still far from an AGI (Artificial General Intelligence), but it is now increasingly clear and likely that sooner or later we will come to create an artificial form of consciousness.
For these reasons, it is urgent to make information technology a humanistic matter, so as not to confuse a language and its power with God, as was the case in the Middle Ages. If we talk about basic literacy, nowadays we need to include and study the basics of computer language from primary school. A fairer and more just society can only be born on the basis of computer literacy for all, which implicitly includes the ability to write and read. Whoever writes code tells the machine what to do; whoever cannot write code takes orders from the machine. It is clear that digital illiteracy is the new form of slavery.
A slavery that one not only cannot resist, but for which one does not even know the words to identify the problem.
There is only one priority to which the academic system has to respond urgently: make digital literacy an obligation, an obligation to learn how to write and read code. It does not matter which coding language is chosen; dozens of languages currently follow more or less the same rules. The important thing is to guarantee the tools to read and understand what happens in a program, and to develop as soon as possible 'popular' computer languages, a sort of 'vulgar' programming code that will help the democratization of programming.
Because today our freedom is written in algorithms, and whoever does not understand them is made a slave.

The jump cut. Méliès lives on YouTube.

Re-edited March 2, 2019

As many people know, the jump cut originated at the end of the nineteenth century. In "Le Magicien", Georges Méliès used it to edit his movie. Today a film like that would probably raise more than one smile compared with our modern special effects, but it's important to remember why and how Méliès changed forever the way a story is told, and why after Méliès the art of "cutting scenes" would never be the same. The legend says Méliès discovered the effect of the jump cut after making a mistake.
A passing carriage accidentally interrupted the shooting, and he restarted the camera. Once he had the developed film in his hands, to his big surprise he noticed that something strange had happened where the mistake occurred. Something magic.
During the temporal ellipsis, the carriage had had time to leave the frame, and by coincidence in the next frame there was a hearse: a cut that definitely doesn't go unnoticed.
The truth of the matter is that we don't know exactly the real story of how the jump cut was born. What we know for sure is the story of its success.

The truth of the matter is that the jump cut does not go unnoticed, and it is precisely this that made it famous in the world of auteur cinema: a punch in the eye, representing a temporal interruption, or rather a rift with the past. But its use in the history of language, first visual, then audiovisual, is much more fascinating than a mere trick to create apparitions and disappearances. In the evolution of filmic grammar, its meaning acquired a well-defined style, embodying a current of thought such as the Nouvelle Vague. Mostly it was used to remind viewers that they were in front of a representation, trying to prevent their identification with the story as much as possible, so they would stay focused on the message the author wanted to give. As Brechtian theatre passes through the emotional shear of the estrangement effect (alienation, Verfremdungseffekt), so the Nouvelle Vague passes through that cut that jumps continuously within the same shot, same plane, same subject, leaving the meaning as the undisputed protagonist.
But what happened to the jump cut? The rebellious history of the visually most "uncomfortable" cut in cinema continues to find space among the new generations. In my opinion, it is no coincidence that the use and abuse of this editing style has become the stylistic code of videos on YouTube. If you think about it, the vlog (video blog) is an extremism of the concept of estrangement. There is only the story; all the "superfluous" is cut. What remains is a dry, essential concept, without downtime. In summary, it is an A-roll without B-roll, stripped of any aesthetic that distracts from the content a youtuber wants to communicate.
This type of editing is the one most used by teenagers all over the world to have their say on the internet, without filters, conditioning, parallel montages, fades, etc. Only sharp cuts of the parts that are, according to the author, boring or useless. So it was until a few years ago. Currently even the less experienced are beginning to create videos that are more and more complex from a stylistic point of view. In some respects they become visually bourgeois, manifesting a more complex aesthetic that sometimes enriches and sometimes distracts.
This need to be so incisive and immediate in telling something comes from the need to capture an increasingly frenetic audience: a spectator who must be won over in the first ten seconds. Two punches of time to earn the trust of those watching us and deserve, who knows, their loyalty. Of course, at first glance, the use of the jump cut today is no longer as noble as that of earlier intellectual movements. But I'm not so sure. I believe that the motivations that pushed Godard or Truffaut, to cite only the best known, are not so far from the motivations that push a kid from some place lost in the world to take a camera in his hands and record what he has to say. An affirmation of himself in this world. A statistic, of course, but one that talks, that has something to say. A critical claim against the system, denied for too long by the broadcast system. While the Nouvelle Vague said enough to an artificial cinema relegated to mere entertainment, the vlog takes a position and tries to have its say on cinema, television, art, video games and sometimes politics. And most of the time it says it by choosing the rawest method that exists: the jump cut. No artistic manifesto. No dogma. A sequence of brutal cuts to tell who you are. Only this "uncomfortable" cut could take root so deeply, as if it were the standard-bearer of a break with the past.
I've always considered the jump cut more a temporal trauma than a bad cut. The French were not wrong in considering it an opening to poetry. A room with a child, jump cut, a boy, jump cut, an adult, jump cut, an old man, jump cut, the same room with no one in it. Empty. Five cuts, the arc of a lifetime. Of course, you could use fades, which would sweeten the whole thing. But can that compare with the innate rock'n'roll of the jump cut?