In short, here is how they did it.
1. Take one camera shot and save both the image and the video (or, for testing, use one working Live Photo).
2. Open a local MOV file (it must be HEVC-encoded; older formats of MOV, such as MP4, will not work at all). You can use your old videos, but re-encode them with an HEVC encoder to generate a new file first.
3. Mix the metadata tracks from the video saved in step 1 (I think they use the same metadata in all videos, but it is nice to have your own) with the video content from step 2, and add a unique identifier to this new video. Keep in mind that the mixed-in track must have the same duration as the video track saved in step 1. As a bonus, this tells you how many frames per second your video from step 2 must have to animate at the same speed as the one from step 1.
4. Write a HEIC file with multiple frames (a single frame does not seem to work at all) and add the same unique identifier.
5. Save both the video from step 3 and the image from step 4 to Photos, pairing them.
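The frame-rate arithmetic hinted at in step 3 can be sketched roughly as below. This is only one reading of that step: it assumes your clip is stretched or squeezed to fit the duration of the metadata track from step 1. The function names and the numbers in the usage note are made up for illustration and are not measured from any real Live Photo.

```python
from fractions import Fraction


def required_fps(frame_count: int, target_duration_s: Fraction) -> Fraction:
    """Frame rate at which frame_count frames last exactly target_duration_s.

    Assumption: the clip from step 2 is retimed to match the duration of the
    metadata track taken from the step 1 video.
    """
    if frame_count <= 0 or target_duration_s <= 0:
        raise ValueError("frame_count and target_duration_s must be positive")
    return Fraction(frame_count) / target_duration_s


def frames_for_duration(fps: Fraction, target_duration_s: Fraction) -> int:
    """Number of frames to keep if you trim the clip instead of retiming it."""
    if fps <= 0 or target_duration_s <= 0:
        raise ValueError("fps and target_duration_s must be positive")
    return round(fps * target_duration_s)
```

For example, with a hypothetical 3-second metadata track, a 90-frame clip would need to run at required_fps(90, Fraction(3)), which is 30 frames per second. Going the other way, frames_for_duration(Fraction(30), Fraction(3)) says to keep 90 frames of a 30 fps clip.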
I will not go into detail on how each step must be done, but programmers will figure it out.
For all of this to work on the same device even if you download the files multiple times, you may need to write a new unique identifier to both files and generate new files (a different name, creation date, and metadata are simple to write).
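The "new identifier, new file names" part of this can be sketched as follows. It assumes the identifier is a UUID string, which the steps above do not actually confirm, and the helper name new_pair_names is made up. Writing the identifier into the HEIC and MOV metadata is not shown here; the sketch only generates a fresh identifier and matching output paths.

```python
import uuid
from pathlib import Path


def new_pair_names(directory: str, stem: str = "live"):
    """Return a fresh identifier plus matching image and video paths.

    Assumption: a random UUID string is an acceptable unique identifier.
    Each call yields a new identifier, so repeated downloads never reuse one.
    """
    identifier = str(uuid.uuid4()).upper()
    base_name = f"{stem}-{identifier}"
    image_path = Path(directory) / f"{base_name}.heic"
    video_path = Path(directory) / f"{base_name}.mov"
    return identifier, image_path, video_path
```

Calling new_pair_names once per download gives you an identifier to write into both files and two file names that share it, so every saved copy looks like a distinct pair.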
Enjoy