Only First Track Playing of AVMutableComposition()

NEW EDIT BELOW

I have already referenced a similar question, but it does not provide the answer I am looking for.

I have an AVMutableComposition(). I am trying to apply multiple AVCompositionTracks of a single type, AVMediaTypeVideo, within this single composition. This is because I am using 2 different AVMediaTypeVideo sources, coming from AVAssets with different CGSizes and preferredTransforms.

So, the only way to apply these specified preferredTransforms is to supply them in 2 different tracks. But, for whatever reason, only the first track ever actually provides any video, almost as if the second track were never there.
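
For context, this is roughly how the two sources differ (just a sketch; the URLs are placeholders for wherever my two videos come from):

import AVFoundation

// Placeholder URLs standing in for the two video sources.
let firstURL = URL(fileURLWithPath: "first.mov")
let secondURL = URL(fileURLWithPath: "second.mov")

for asset in [AVAsset(url: firstURL), AVAsset(url: secondURL)] {
    // Each source reports its own naturalSize and preferredTransform,
    // which is why one composition track cannot represent both correctly.
    if let track = asset.tracks(withMediaType: AVMediaTypeVideo).first {
        print("size: \(track.naturalSize), transform: \(track.preferredTransform)")
    }
}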

So, I have tried:

1) Using AVMutableVideoCompositionLayerInstructions and applying an AVVideoComposition along with an AVAssetExportSession. This works okay, and I am still working out the transforms, so it is doable. But the videos take well over 1 minute to process, which is unusable in my situation.

2) Using multiple tracks without an AVAssetExportSession, in which case the second track of the same type never appears. Now, I could put everything on one track, but then all of the videos would end up with the same size and preferredTransform as the first video, which I absolutely do not want, because it stretches them in every direction.

So my question is, is it possible to:

1) Apply instructions to just a track WITHOUT using an AVAssetExportSession? //Preferred way by far.

2) Decrease the export time? (I have tried using PresetPassthrough, but you cannot use that if you have an exporter.videoComposition, which is where my instructions live. That is the only place I know of to put instructions; I am not sure whether I can place them somewhere else.)

Here is some of my code (without the exporter, since I do not need to export anything anywhere, just do things after the AVMutableComposition combines the items):

func merge() {
    if let firstAsset = controller.firstAsset, secondAsset = self.asset {

        let mixComposition = AVMutableComposition()

        let firstTrack = mixComposition.addMutableTrackWithMediaType(AVMediaTypeVideo,
                                                                     preferredTrackID: Int32(kCMPersistentTrackID_Invalid))
        do {
            //Not needed now, since the first 14 seconds can't be edited.

            if(CMTimeGetSeconds(startTime) == 0) {
                self.startTime = CMTime(seconds: 1/600, preferredTimescale: Int32(600))
            }
            try firstTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, CMTime(seconds: CMTimeGetSeconds(startTime), preferredTimescale: 600)),
                                           ofTrack: firstAsset.tracksWithMediaType(AVMediaTypeVideo)[0],
                                           atTime: kCMTimeZero)
        } catch _ {
            print("Failed to load first track")
        }


        //This secondTrack never appears, doesn't matter what is inside of here, like it is blank space in the video from startTime to endTime (rangeTime of secondTrack)
        let secondTrack = mixComposition.addMutableTrackWithMediaType(AVMediaTypeVideo,
                                                                     preferredTrackID: Int32(kCMPersistentTrackID_Invalid))
//            secondTrack.preferredTransform = self.asset.preferredTransform
        do {
            try secondTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, secondAsset.duration),
                                           ofTrack: secondAsset.tracksWithMediaType(AVMediaTypeVideo)[0],
                                           atTime: CMTime(seconds: CMTimeGetSeconds(startTime), preferredTimescale: 600))
        } catch _ {
            print("Failed to load second track")
        }

        //This part appears again, at endTime, which is right after the 2nd track is supposed to end.
        do {
            try firstTrack.insertTimeRange(CMTimeRangeMake(CMTime(seconds: CMTimeGetSeconds(endTime), preferredTimescale: 600), firstAsset.duration-endTime),
                                           ofTrack: firstAsset.tracksWithMediaType(AVMediaTypeVideo)[0] ,
                                           atTime: CMTime(seconds: CMTimeGetSeconds(endTime), preferredTimescale: 600))
        } catch _ {
            print("failed")
        }
        if let loadedAudioAsset = controller.audioAsset {
            let audioTrack = mixComposition.addMutableTrackWithMediaType(AVMediaTypeAudio, preferredTrackID: 0)
            do {
                try audioTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, firstAsset.duration),
                                               ofTrack: loadedAudioAsset.tracksWithMediaType(AVMediaTypeAudio)[0] ,
                                               atTime: kCMTimeZero)
            } catch _ {
                print("Failed to load Audio track")
            }
        }
    }
}

EDIT

Apple states: "Indicates video composition instructions, as an NSArray of instances of classes implementing the AVVideoCompositionInstruction protocol. For the first instruction in the array, timeRange.start must be less than or equal to the earliest time for which playback or other processing will be attempted (note that this will typically be kCMTimeZero). For subsequent instructions, timeRange.start must be equal to the prior instruction's end time. The end time of the last instruction must be greater than or equal to the latest time for which playback or other processing will be attempted (note that this will typically be the duration of the asset with which the instance of AVVideoComposition is associated)."

To me this just says that, if you decide to use ANY instructions, the entire composition must be layered in instructions (that is how I understand it). Why is that? In this example, how would I apply instructions only to, say, track 2 without changing track 1 or track 3 at all:

Track 1 from 0 - 10 seconds, Track 2 from 10 - 20 seconds, Track 3 from 20 - 30 seconds.
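
Sketched in code, this is what I understand the documentation to require (the track variables and the 10-second durations below are placeholders, not my real code); what I do not get is why the first and third instructions are needed at all when only the middle one does anything:

import UIKit
import AVFoundation

//Sketch: three instructions tiling 0-30s, where only the middle one
//(covering track 2) applies a transform. track1/track2/track3 are assumed
//to be composition tracks already inserted at 0-10s, 10-20s and 20-30s.
func tiledInstructions(track1: AVMutableCompositionTrack,
                       track2: AVMutableCompositionTrack,
                       track3: AVMutableCompositionTrack,
                       transformForTrack2: CGAffineTransform) -> [AVMutableVideoCompositionInstruction] {
    let ten = CMTime(seconds: 10, preferredTimescale: 600)
    let twenty = CMTime(seconds: 20, preferredTimescale: 600)

    //0 - 10s: pass track 1 through untouched.
    let first = AVMutableVideoCompositionInstruction()
    first.timeRange = CMTimeRangeMake(kCMTimeZero, ten)
    first.layerInstructions = [AVMutableVideoCompositionLayerInstruction(assetTrack: track1)]

    //10 - 20s: the only instruction that actually changes anything.
    let second = AVMutableVideoCompositionInstruction()
    second.timeRange = CMTimeRangeMake(ten, ten)
    let secondLayer = AVMutableVideoCompositionLayerInstruction(assetTrack: track2)
    secondLayer.setTransform(transformForTrack2, at: ten)
    second.layerInstructions = [secondLayer]

    //20 - 30s: pass track 3 through untouched.
    let third = AVMutableVideoCompositionInstruction()
    third.timeRange = CMTimeRangeMake(twenty, ten)
    third.layerInstructions = [AVMutableVideoCompositionLayerInstruction(assetTrack: track3)]

    return [first, second, third]
}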

Any explanation of that would probably answer my question (if it is doable).

Yes, you can absolutely apply separate transforms to each layer of an AVMutableComposition.

Here is an overview of the process - I have personally done this in Objective-C, so I cannot give you the exact Swift code, but I know these same functions work just the same in Swift.

  1. Create an AVMutableComposition.
  2. Create an AVMutableVideoComposition.
  3. Set the render size and frame duration of the video composition.
  4. Now, for each AVAsset:
    • Create an AVAssetTrack and an AVAudioTrack.
    • Create an AVMutableCompositionTrack for each (one for video, one for audio) by adding each of them to the mutableComposition. (A rough Swift sketch of these steps follows this list.)
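
Roughly, in Swift, those first steps look something like this (an untested sketch on my part; the assets array and the render size are placeholders):

import UIKit
import AVFoundation

//Steps 1-3: the composition, the video composition, and its geometry.
let mixComposition = AVMutableComposition()
let videoComposition = AVMutableVideoComposition()
videoComposition.renderSize = CGSize(width: 1280, height: 720) //placeholder render size
videoComposition.frameDuration = CMTimeMake(1, 30)             //30 fps

//Step 4: one video and one audio composition track per source asset.
let assets: [AVAsset] = [] //placeholder: your source AVAssets go here
var cursor = kCMTimeZero
for asset in assets {
    let videoTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeVideo,
                                                    preferredTrackID: kCMPersistentTrackID_Invalid)
    let audioTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeAudio,
                                                    preferredTrackID: kCMPersistentTrackID_Invalid)
    do {
        try videoTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, asset.duration),
                                       of: asset.tracks(withMediaType: AVMediaTypeVideo)[0],
                                       at: cursor)
        if let sourceAudio = asset.tracks(withMediaType: AVMediaTypeAudio).first {
            try audioTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, asset.duration),
                                           of: sourceAudio,
                                           at: cursor)
        }
    } catch {
        print("Failed to insert \(asset)")
    }
    cursor = CMTimeAdd(cursor, asset.duration)
}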

This is where it gets a little more complicated.. (sorry, AVFoundation is not easy!)

  1. Create an AVMutableVideoCompositionLayerInstruction from the AVAssetTrack that references each video. For each AVMutableVideoCompositionLayerInstruction, you can set a transform on it. You can also do things like set a crop rectangle.
  2. Add each AVMutableVideoCompositionLayerInstruction to an array of layer instructions. Once all of the AVMutableVideoCompositionLayerInstructions have been created, that array is set on the AVMutableVideoComposition.

And finally..

  1. Finally, you will have an AVPlayerItem that you use to play this back (on an AVPlayer). You create the AVPlayerItem from the AVMutableComposition, and then you set the AVMutableVideoComposition on the AVPlayerItem itself (setVideoComposition..) - see the sketch just below.
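
In Swift that last step is only a few lines (again just a rough sketch; the composition and video composition are assumed to come from the steps above):

import AVFoundation

//Builds a player item that applies the layer instructions during playback,
//no AVAssetExportSession involved. Both parameters are assumed to have been
//built as described in the steps above.
func playerItem(for composition: AVMutableComposition,
                applying videoComposition: AVMutableVideoComposition) -> AVPlayerItem {
    let item = AVPlayerItem(asset: composition)
    item.videoComposition = videoComposition //this is the setVideoComposition.. step
    return item
}

//Usage: let player = AVPlayer(playerItem: playerItem(for: mixComposition, applying: videoComposition))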

Easy, huh?

It took me a few weeks to get this stuff working well. It is totally unforgiving, and as you mentioned, if you do something wrong it does not tell you what you did wrong - it just does not show up.

But once you crack it, it works quickly and well.

Finally, everything I have outlined is covered in the AVFoundation docs. It is a lengthy tome, but you need to know it to achieve what you are trying to do.

Best of luck!

OK, so for my exact problem, I had to apply specific CGAffineTransforms in Swift to get the particular result we wanted. The current code I am posting works with any picture taken/obtained as well as with video.

//This method gets the orientation of the current transform. This method is used below to determine the orientation
func orientationFromTransform(_ transform: CGAffineTransform) -> (orientation: UIImageOrientation, isPortrait: Bool) {
    var assetOrientation = UIImageOrientation.up
    var isPortrait = false
    if transform.a == 0 && transform.b == 1.0 && transform.c == -1.0 && transform.d == 0 {
        assetOrientation = .right
        isPortrait = true
    } else if transform.a == 0 && transform.b == -1.0 && transform.c == 1.0 && transform.d == 0 {
        assetOrientation = .left
        isPortrait = true
    } else if transform.a == 1.0 && transform.b == 0 && transform.c == 0 && transform.d == 1.0 {
        assetOrientation = .up
    } else if transform.a == -1.0 && transform.b == 0 && transform.c == 0 && transform.d == -1.0 {
        assetOrientation = .down
    }

    //Returns the orientation as a variable
    return (assetOrientation, isPortrait)
}
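
For example, this is how it would be called on a placeholder asset:

//Usage sketch (the path here is a placeholder, use your own asset)
let someAsset = AVAsset(url: URL(fileURLWithPath: "clip.mov"))
let someTrack = someAsset.tracks(withMediaType: AVMediaTypeVideo)[0]
let info = orientationFromTransform(someTrack.preferredTransform)
if info.isPortrait {
    //Handle the portrait case, like videoCompositionInstructionForTrack does below
}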

//Method that lays out the instructions for each track I am editing and does the transformation on each individual track to get it lined up properly
func videoCompositionInstructionForTrack(_ track: AVCompositionTrack, _ asset: AVAsset) -> AVMutableVideoCompositionLayerInstruction {

    //This method Returns set of instructions from the initial track

    //Create inital instruction
    let instruction = AVMutableVideoCompositionLayerInstruction(assetTrack: track)

    //This is whatever asset you are about to apply instructions to.
    let assetTrack = asset.tracks(withMediaType: AVMediaTypeVideo)[0]

    //Get the original transform of the asset
    var transform = assetTrack.preferredTransform

    //Get the orientation of the asset and determine whether it is portrait or landscape - I forget which, but whether you take a picture or grab one from the camera roll, it is ALWAYS treated as landscape at first. This method accounts for that.
    let assetInfo = orientationFromTransform(transform)

    //You need a little background to understand this part. 
    /* MyAsset is my original video. I need to combine a lot of other segments, according to the user, into this original video. So I have to make all the other videos fit this size. 
      This is the width and height ratios from the original video divided by the new asset 
    */
    let width = MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.width/assetTrack.naturalSize.width
    var height = MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.height

    //If it is in portrait
    if assetInfo.isPortrait {

        //We actually change the height variable to divide by the width of the old asset instead of the height. This is because of the flip since we determined it is portrait and not landscape. 
        height = MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.width

        //We apply the transform and scale the image appropriately.
        transform = transform.scaledBy(x: height, y: height)

        //We also have to move the image or video appropriately. Since we scaled it, it could be wayy off on the side, outside the bounds of the viewing.
        let movement = ((1/height)*assetTrack.naturalSize.height)-assetTrack.naturalSize.height

        //This lines it up dead center on the left side of the screen perfectly. Now we want to center it.
        transform = transform.translatedBy(x: 0, y: movement)

        //This calculates how much black there is. Cut it in half and there you go!
        let totalBlackDistance = MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.width-transform.tx
        transform = transform.translatedBy(x: 0, y: -(totalBlackDistance/2)*(1/height))

    } else {

        //Landscape! We don't need to change the variables, it is all defaulted that way (iOS prefers landscape items), so we scale it appropriately.
        transform = transform.scaledBy(x: width, y: height)

        //This is a little complicated haha. So because it is in landscape, the asset fits the height correctly, for me anyway; It was just extra long. Think of this as a ratio. I forgot exactly how I thought this through, but the end product looked like: Answer = ((Original height/current asset height)*(current asset width))/(Original width)
        let scale:CGFloat = ((MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.height)*(assetTrack.naturalSize.width))/MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.width
        transform = transform.scaledBy(x: scale, y: 1)

        //The asset can be way off the screen again, so we have to move it back. This time we can have it dead center in the middle, because it wasn't backwards because it wasn't flipped because it was landscape. Again, another long complicated algorithm I derived.
        let movement = ((MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.width-((MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.height)*(assetTrack.naturalSize.width)))/2)*(1/MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.height)
        transform = transform.translatedBy(x: movement, y: 0)
    }

    //This creates the instruction and returns it so we can apply it to each individual track.
    instruction.setTransform(transform, at: kCMTimeZero)
    return instruction
}

Now that we have these methods, we can apply the correct and appropriate transforms to our assets and get everything fitting nicely and cleanly.

func merge() {
    if let firstAsset = MyAsset, let newAsset = newAsset {

        //This creates our overall composition, our new video framework
        let mixComposition = AVMutableComposition()

        //One by one you create tracks (could use loop, but I just had 3 cases)
        let firstTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeVideo,
                                                                     preferredTrackID: Int32(kCMPersistentTrackID_Invalid))

        //You have to use a try, so need a do
        do {

            //Inserting a timerange into a track. I already calculated my time, I call it startTime. This is where you would put your time. The preferredTimescale doesn't have to be 600000 haha, I was playing with those numbers; it just allows precision. The at: parameter is not where it begins within this individual track, but where it starts in the composition as a whole. As you notice below, my at: times are different. You also need to give it which track to pull from.
            try firstTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, CMTime(seconds: CMTimeGetSeconds(startTime), preferredTimescale: 600000)),
                                           of: firstAsset.tracks(withMediaType: AVMediaTypeVideo)[0],
                                           at: kCMTimeZero)
        } catch _ {
            print("Failed to load first track")
        }

        //Create the 2nd track
        let secondTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeVideo,
                                                                      preferredTrackID: Int32(kCMPersistentTrackID_Invalid))

        do {

            //Apply the 2nd timeRange you have. Also apply the correct track you want
            try secondTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, self.endTime-self.startTime),
                                           of: newAsset.tracks(withMediaType: AVMediaTypeVideo)[0],
                                           at: CMTime(seconds: CMTimeGetSeconds(startTime), preferredTimescale: 600000))
            secondTrack.preferredTransform = newAsset.preferredTransform
        } catch _ {
            print("Failed to load second track")
        }

        //We are not sure we are going to use the third track in my case, because they can edit to the end of the original video, causing us not to use a third track. But if we do, it is the same as the others!
        var thirdTrack:AVMutableCompositionTrack!
        if(self.endTime != controller.realDuration) {
            thirdTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeVideo,
                                                                      preferredTrackID: Int32(kCMPersistentTrackID_Invalid))

        //This part appears again, at endTime, which is right after the 2nd track is supposed to end.
            do {
                try thirdTrack.insertTimeRange(CMTimeRangeMake(CMTime(seconds: CMTimeGetSeconds(endTime), preferredTimescale: 600000), self.controller.realDuration-endTime),
                                           of: firstAsset.tracks(withMediaType: AVMediaTypeVideo)[0] ,
                                           at: CMTime(seconds: CMTimeGetSeconds(endTime), preferredTimescale: 600000))
            } catch _ {
                print("failed")
            }
        }

        //Same thing with audio!
        if let loadedAudioAsset = controller.audioAsset {
            let audioTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeAudio, preferredTrackID: 0)
            do {
                try audioTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, self.controller.realDuration),
                                               of: loadedAudioAsset.tracks(withMediaType: AVMediaTypeAudio)[0] ,
                                               at: kCMTimeZero)
            } catch _ {
                print("Failed to load Audio track")
            }
        }

        //So, now that we have all of these tracks we need to apply those instructions! If we don't, then they could be different sizes. Say my newAsset is 720x1080 and MyAsset is 1440x900 (These are just examples haha), then it would look a tad funky and possibly not show our new asset at all.
        let mainInstruction = AVMutableVideoCompositionInstruction()

        //Make sure the overall time range matches that of the individual tracks, if not, it could cause errors. 
        mainInstruction.timeRange = CMTimeRangeMake(kCMTimeZero, self.controller.realDuration)

        //For each track we made, we need an instruction. Could set loop or do individually as such.
        let firstInstruction = videoCompositionInstructionForTrack(firstTrack, firstAsset)
        //You know, not 100% why this is here. This is 1 thing I did not look into well enough or understand enough to describe to you. 
        firstInstruction.setOpacity(0.0, at: startTime)

        //Next Instruction
        let secondInstruction = videoCompositionInstructionForTrack(secondTrack, self.asset)

        //Again, not sure we need 3rd one, but if we do.
        var thirdInstruction:AVMutableVideoCompositionLayerInstruction!
        if(self.endTime != self.controller.realDuration) {
            secondInstruction.setOpacity(0.0, at: endTime)
            thirdInstruction = videoCompositionInstructionForTrack(thirdTrack, firstAsset)
        }

        //Okay, now that we have all these instructions, we tie them into the main instruction we created above.
        mainInstruction.layerInstructions = [firstInstruction, secondInstruction]
        if(self.endTime != self.controller.realDuration) {
            mainInstruction.layerInstructions += [thirdInstruction]
        }

        //We create a video framework now, slightly different than the one above.
        let mainComposition = AVMutableVideoComposition()

        //We apply these instructions to the framework
        mainComposition.instructions = [mainInstruction]

        //How long are our frames, you can change this as necessary
        mainComposition.frameDuration = CMTimeMake(1, 30)

        //This is your render size of the video. 720p, 1080p etc. You set it!
        mainComposition.renderSize = firstAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize

        //We create an export session (you can't use PresetPassthrough because we are manipulating the transforms of the videos and the quality, so I just set it to highest)
        guard let exporter = AVAssetExportSession(asset: mixComposition, presetName: AVAssetExportPresetHighestQuality) else { return }

        //Provide type of file, provide the url location you want exported to (I don't have mine posted in this example).
        exporter.outputFileType = AVFileTypeMPEG4
        exporter.outputURL = url

        //Then we tell the exporter to export the video according to our video framework, and it does the work!
        exporter.videoComposition = mainComposition

        //Asynchronous methods FTW!
        exporter.exportAsynchronously(completionHandler: {
            //Do whatever when it finishes!
        })
    }
}

There is a lot going on here, but it has to be done, for my example anyway! Sorry this took so long to post, and let me know if you have any questions.