overlay text annotation on an independent video track

I have 25 short videos of family (3 min each) that I would like to label with names of the various people.

There can of course be many people on the screen at once, so I need to be able to add multiple labels (maybe with a box around the person and the name below; I'll have to see what it looks like).

The labels also need to move with the person they're labeling, so AI object tracking would be nice, but that sounds complex/expensive so I can do that manually if need be.

This will make it easy to see who's who, but it would ruin the video for casual viewing.

So I also want to be able to turn this layer/track off, just like subtitles.

This is inspired by Flickr's note feature, but that's just for static images.

Can I do this with a standard format like MP4? I'd like to avoid a proprietary format or website if possible, since this is for genealogy and will be archived.

HOW do I do this? I'm even having trouble figuring out how to search for what I want, since I'm not a video person and I don't really know the right terms to use.

I'd appreciate a description of what this process is called, and then what tools I'd need to do it.

Hopefully free tools, although I can get free access to iMovie, Pixelmator Pro and Adobe Creative Cloud.

It would be awesome if this could just be a text file I set next to the video, like an srt, which would hold the various screen locations, strings and time stamps.

I'd be willing to sacrifice a lot in the looks department to get something simple like this.

But I don't know if it exists.

Question from user Nick K9 at stackexchange.

Answer:

I have learned that subtitles are WAY more flexible than I understood.

Using the free Aegisub tool, you can create multiple, overlapping subtitle lines synced up to video or audio.

And you also have control of the font, size, style and position of the text.

So while it would still be a pain to use this to have labels literally follow people moving around the screen, you can easily place temporary text next to the person the first time they show up, and let the viewer do the “tracking” themselves.

And the best part is that the subtitle files are all just UTF-8 text, so they're small and easy to edit as needed.
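For anyone who would rather generate such a file than click through an editor, here is a rough Python sketch that writes an Advanced SubStation Alpha (.ass) script, one of the formats Aegisub works with. The `\pos(x,y)` tag pins a label to a screen coordinate and `\move(x1,y1,x2,y2)` slides it between two points over the life of the line. The people, times, coordinates, the "Label" style and the 1920x1080 script resolution are all made-up placeholders, and the helper names (`ass_time`, `label_line`, `build_script`) are my own, not part of any tool.

```python
from pathlib import Path

# Minimal ASS header. The resolution and the "Label" style are assumptions;
# Alignment 8 anchors the top-center of the text at the \pos coordinate,
# so the name hangs just below the chosen point.
HEADER = """\
[Script Info]
ScriptType: v4.00+
PlayResX: 1920
PlayResY: 1080

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Label,Arial,48,&H00FFFFFF,&H000000FF,&H00000000,&H80000000,0,0,0,0,100,100,0,0,1,2,0,8,10,10,10,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
"""


def ass_time(seconds: float) -> str:
    """Format seconds as the H:MM:SS.cc timestamp used by ASS events."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def label_line(name, start, end, x, y, move_to=None):
    """Build one Dialogue line: fixed with \\pos, or sliding with \\move."""
    if move_to is None:
        tag = f"\\pos({x},{y})"
    else:
        tag = f"\\move({x},{y},{move_to[0]},{move_to[1]})"
    return (f"Dialogue: 0,{ass_time(start)},{ass_time(end)},"
            f"Label,,0,0,0,,{{{tag}}}{name}")


def build_script(labels):
    """Join the header and one Dialogue line per label into a full script."""
    return HEADER + "\n".join(label_line(*lab) for lab in labels) + "\n"


# Hypothetical labels: (name, start s, end s, x, y, optional end position).
labels = [
    ("Grandma Rose", 5.0, 10.0, 400, 600, None),
    ("Uncle Bill", 7.5, 12.0, 900, 500, (1200, 520)),  # overlaps the first
]
script = build_script(labels)


def write_labels(path):
    """Save the script as UTF-8 text next to the video."""
    Path(path).write_text(script, encoding="utf-8")
```

Calling `write_labels("family01.ass")` (the filename is just an example) would leave a small text file next to the video. As far as I can tell, a player that understands ASS subtitles can then show or hide the labels like any other subtitle track.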

Answer from user Nick K9 at stackexchange.
