NAB AI Reality Check - MAM Edition

(~4-5 min read)

As we roll into NAB, the hype of the year is clear: AI. And whether it turns out like HDTV (hyped five years early) or 3DTV (hyped, never to be), don’t let that stop a good trade show trend, will ya?

“We might like to check out some AI tagging.”

I’ve heard customers mention this on almost every new project for the last six years, and almost none have implemented it in a meaningful, scaled-up way. But to be fair, I blame video MAM vendors for not implementing AI in a usable way more than I blame the tech itself or customer desire. Note I said video MAM, not DAM. Photo AI/ML has been useful for years now, since you’re tagging keywords and captions on a single frame. That’s very different from the time-series data you collect on a video.

“What do you mean?” Well, if I run facial recognition on an image, I get the person or persons in the image in an easy-to-search field. If I run the same service on a video, I might get a hundred markers naming the same person every time they are on screen. This is largely useless and annoying in a search, as most anyone would admit.
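To make that concrete, here’s a rough sketch (hypothetical data and field names, not any vendor’s pipeline) of the kind of cleanup that makes time-series recognition data usable: collapse adjacent per-frame detections of the same person into a handful of appearance spans.

```python
# Collapse per-frame face hits into appearance ranges so "Jane Doe" shows up
# once per on-screen span instead of once per detection.
def collapse_hits(hits, gap=2.0):
    """hits: list of (timecode_seconds, person_name), sorted by time."""
    spans = {}  # person -> list of [start, end] in seconds
    for t, person in hits:
        ranges = spans.setdefault(person, [])
        if ranges and t - ranges[-1][1] <= gap:
            ranges[-1][1] = t          # still on screen: extend the current span
        else:
            ranges.append([t, t])      # gap too large: start a new appearance
    return spans

# A hundred detections become a few searchable, timecoded spans.
print(collapse_hits([(0.0, "Jane Doe"), (0.5, "Jane Doe"), (1.0, "Jane Doe"),
                     (45.0, "Jane Doe"), (45.5, "Jane Doe")]))
# {'Jane Doe': [[0.0, 1.0], [45.0, 45.5]]}
```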

Also, if you search for a marker or word, you get a list of clips in most MAMs. This is painful to use for content discovery, as you have to search, get results, and then look inside each clip to see if it contains what you actually searched for. Don’t believe me? Set a unique marker in your MAM and search for it. You’ll get a clip, not markers, on your results page. Now amplify this by a thousand and most users simply give up.
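What “better” could look like, in a hedged sketch (toy in-memory index and made-up clip names, not any shipping MAM’s API): return the markers themselves, with timecodes, so a hit drops the user at the moment instead of at the head of a clip.

```python
# Marker-level search: results carry clip + timecode + matched text,
# instead of a bare list of clips the user then has to scrub through.
markers = [
    {"clip": "interview_042.mov", "tc": "00:12:31", "text": "solar array deployment"},
    {"clip": "interview_042.mov", "tc": "00:47:02", "text": "launch window delays"},
    {"clip": "broll_007.mov",     "tc": "00:03:15", "text": "solar array close-up"},
]

def search_markers(query):
    return [m for m in markers if query.lower() in m["text"].lower()]

for hit in search_markers("solar array"):
    print(f'{hit["clip"]} @ {hit["tc"]}: {hit["text"]}')
# interview_042.mov @ 00:12:31: solar array deployment
# broll_007.mov @ 00:03:15: solar array close-up
```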

Until MAM/DAM vendors start to handle this better and serve the real-world needs of the user instead of just checking the box of “we do AI,” no one will widely adopt this. (Check out the Akomi Unscripted video to see how we solve this.)

“Are there current AI/ML services I can implement effectively?” Yes, we believe transcription is the fastest, simplest, and least expensive way to add metadata to video that contains speech, even with the limitations of current MAMs. You can start small, running masters and interviews, or maybe finally run that bucket of archive media that’s sat for years with no one knowing what it is. And if you’re running tens of thousands of hours, you can set it up in your own cloud or on-premise now. (NSA NAB ’24 leak there.)
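If you want to kick the tires before committing to a service, here’s a minimal on-premise sketch using the open-source Whisper model (assumes `pip install openai-whisper` and ffmpeg on the PATH; the file name is a placeholder, and how you write the result back into your MAM will vary):

```python
import whisper

model = whisper.load_model("base")             # larger models trade speed for accuracy
result = model.transcribe("interview_042.mov") # ffmpeg extracts the audio from the video

# Each segment carries start/end times, so the transcript can be attached as
# timecoded metadata rather than one undifferentiated blob of text.
for seg in result["segments"]:
    print(f'{seg["start"]:8.2f}-{seg["end"]:8.2f}  {seg["text"].strip()}')
```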

Now you may say, “Isn’t this self-serving? NSA’s Akomi heavily features a unique transcription search UI, so of course you’d say that.” But here’s the thing: we don’t believe this because we engineered it into Akomi. We engineered it into Akomi because we saw real-world results. Two years ago we had a customer do this at scale (4,000 hours of interviews for a research project), and we have a ton more doing it in production now. Costs are low (starting at around $10 per hour of media and dropping drastically as you scale up), and the aforementioned on-premise solutions and new cloud offerings will drop the bottom even further this year.

“Ok, so what’s coming next?”

AI is already evolving. At NAB you will start to see demos of vector search, conversational search, and other generative AI features. We’re moving past “tagging” and into “understanding” data. Stay tuned (and book a demo with us). But don’t wait for the hype demo to ship; make a plan and move now on what you can.

“Ok, what do I do today?”

1. Set up AI/ML transcription services on video that contains speech (interviews, master outputs of film/TV, product videos, etc.). This is the fastest way to add detailed searchability.

2. Implement image tagging in your MAM/DAM now. It’s cheap and yields good results (see the sketch after this list).

3. For the rapidly approaching future, start getting your media into a MAM. Then make sure your MAM/DAM is ready to deliver a user experience that makes the data you’re extracting easy and effective to use.
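For item 2, one way to start, sketched against AWS Rekognition’s label detection (the bucket, file, and confidence threshold are placeholders, and any comparable tagging service follows the same request-labels, write-back-keywords pattern):

```python
import boto3

rekognition = boto3.client("rekognition")
response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-dam-proxies", "Name": "stills/hero_shot.jpg"}},
    MaxLabels=10,
    MinConfidence=80,
)

# Pull the label names out and push them into your DAM's keyword field.
keywords = [label["Name"] for label in response["Labels"]]
print(keywords)   # e.g. ['Person', 'Stage', 'Lighting']
```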

Even if you don’t use these solutions at scale, you’ll be ready when you need them. But start doing something now, because the main thing we’ve seen in the last 12 months is that AI is not waiting on the Film & TV industry.
