
Merge video, audio, images into one video
🤖/video/merge composes a new video by adding an audio track to existing still image(s) or video.
This Robot is able to generate a video from:
- An image and an audio file
- A video and an audio file
- Several images
Merging an audio file and an image
To create a video from an audio file and an image, pass both files to an Assembly Step via the use parameter and label each one with the as syntax:
"merged": {
  "robot": "/video/merge",
  "preset": "ipad-high",
  "use": {
    "steps": [
      { "name": ":original", "as": "audio" },
      { "name": ":original", "as": "image" }
    ],
    "bundle_steps": true
  }
}
Imagine you have uploaded both an image and an audio file in the same upload form. In the example above, the system will automatically recognize the files properly if you use the same Step name twice (":original" in this case). Instead of :original, you could also use any other valid Assembly Step name.
If you are using multiple file input fields, you can tell Transloadit which field supplies the audio file and which supplies the image. Suppose you have two file input fields named the_image and the_audio. These Assembly Instructions will make it work:
"merged": {
  "robot": "/video/merge",
  "preset": "ipad-high",
  "use": {
    "steps": [
      { "name": ":original", "fields": "the_audio", "as": "audio" },
      { "name": ":original", "fields": "the_image", "as": "image" }
    ],
    "bundle_steps": true
  }
}
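If you generate Assembly Instructions programmatically, the pattern above can be sketched as a small helper. This is only an illustration: build_merge_step is a hypothetical function name, not part of any Transloadit SDK, and it simply reproduces the JSON structure shown above.

```python
import json


def build_merge_step(inputs, preset="ipad-high"):
    """Build a /video/merge Step from (field_name, role) pairs.

    build_merge_step is a hypothetical helper for illustration only.
    field_name may be None when the files are not tied to a form field.
    """
    steps = []
    for field_name, role in inputs:
        entry = {"name": ":original", "as": role}
        if field_name is not None:
            entry["fields"] = field_name
        steps.append(entry)
    return {
        "robot": "/video/merge",
        "preset": preset,
        "use": {"steps": steps, "bundle_steps": True},
    }


merged = build_merge_step([("the_audio", "audio"), ("the_image", "image")])
print(json.dumps({"merged": merged}, indent=2))
```

Passing None as the field name yields the simpler form from the first example, where both entries only carry name and as.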
Merging an audio file and a video
If you have a video file (without sound, for example) and an audio track that you want the video to use, you can merge them together with Transloadit. Just label the video as video and the audio file as audio using the as key in the JSON.
Imagine you have two file input fields in the same upload form - one to upload a video file and one for an audio file. You can tell Transloadit which field supplies the video and which the audio file using the file input field's name attribute. Just use the value of the name attribute as the value for the fields key in the JSON:
"merged": {
  "robot": "/video/merge",
  "preset": "ipad-high",
  "use": {
    "steps": [
      { "name": ":original", "fields": "the_video", "as": "video" },
      { "name": ":original", "fields": "the_audio", "as": "audio" }
    ],
    "bundle_steps": true
  }
}
You can also supply the video and audio files from other Assembly Steps, in which case you can leave out the fields attribute.
Warning: When merging audio and video files, it's recommended to set a target format and codecs via a preset or via ffmpeg.codec:v, ffmpeg.codec:a and ffmpeg.f. If you don't, merging will default to backwards-compatible but undesirable legacy codecs.
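As a rough sketch of the second option in the warning, the Step below sets the codecs and container through the ffmpeg parameter instead of a preset. The specific values (libx264, aac, mp4) are assumptions chosen for illustration; they are common FFmpeg encoder and format names, but the source does not prescribe them.

```python
import json

# Assumed codec and container choices for illustration only; substitute
# whatever your target players require.
merged = {
    "robot": "/video/merge",
    "use": {
        "steps": [
            {"name": ":original", "fields": "the_video", "as": "video"},
            {"name": ":original", "fields": "the_audio", "as": "audio"},
        ],
        "bundle_steps": True,
    },
    "ffmpeg": {
        "codec:v": "libx264",  # video codec (ffmpeg.codec:v)
        "codec:a": "aac",      # audio codec (ffmpeg.codec:a)
        "f": "mp4",            # container format (ffmpeg.f)
    },
}

print(json.dumps({"merged": merged}, indent=2))
```

No preset is given here on purpose: as the preset parameter description notes, the default "flash" preset is not applied when you supply your own ffmpeg options without a preset.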
Merging several images to generate a video
It is possible to create a video from images with Transloadit. Just label all images as image using the as key in the JSON:
"merged": {
  "robot": "/video/merge",
  "preset": "ipad-high",
  "use": {
    "steps": [
      { "name": ":original", "as": "image" }
    ],
    "bundle_steps": true
  },
  "framerate": "1/10",
  "duration": 8.5
}
This will work fine in a multi-file upload context. Files are sorted by their basename. So if you name them 01.jpeg and 02.jpeg, they will be merged in the correct order.
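To see why zero-padded names matter, here is a small sketch of basename ordering. It assumes a plain lexicographic string comparison, which the text above does not spell out, so treat the exact ordering rule as an assumption; merge_order is a hypothetical helper name.

```python
import os


def merge_order(paths):
    # Assumption: basenames are compared as plain strings (lexicographically).
    return sorted(paths, key=os.path.basename)


unpadded = ["uploads/10.jpeg", "uploads/2.jpeg", "uploads/1.jpeg"]
padded = ["uploads/10.jpeg", "uploads/02.jpeg", "uploads/01.jpeg"]

print(merge_order(unpadded))  # 10.jpeg sorts before 2.jpeg
print(merge_order(padded))    # 01, 02, 10 as intended
```

Under that assumption, unpadded names put image 10 ahead of image 2, while padded names keep the intended sequence.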
You can also supply your images from other Assembly Steps, for example the results of 🤖/image/resize Steps.
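A minimal sketch of such a chain might look like the following. The Step name "resized" and the resize values (1280x720, "pad") are illustrative assumptions, not recommendations from this page.

```python
import json

# Two Steps: "resized" normalizes the uploaded images, "merged" turns the
# resized results into a video. Names and dimensions are assumptions.
steps = {
    "resized": {
        "robot": "/image/resize",
        "use": ":original",
        "width": 1280,
        "height": 720,
        "resize_strategy": "pad",
    },
    "merged": {
        "robot": "/video/merge",
        "preset": "ipad-high",
        "use": {
            # No "fields" key: the images come from another Step,
            # not directly from a form field.
            "steps": [{"name": "resized", "as": "image"}],
            "bundle_steps": True,
        },
        "framerate": "1/10",
    },
}

print(json.dumps(steps, indent=2))
```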
Parameters
-
use
String / Array of Strings / Object ⋅ required
Specifies which Step(s) to use as input.
-
You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit)
-
You can provide several Steps as input with arrays:
"use": [ ":original", "encoded", "resized" ]
💡 That's likely all you need to know about use, but you can view advanced use cases:
-
Step bundling. Some Robots can gather several Step results for a single invocation. For example, 🤖/file/compress would normally create one archive for each file passed to it. If you set bundle_steps to true, however, it will create one archive containing all the result files from all Steps you give it. To enable bundling, provide an object like the one below to the use parameter:
"use": { "steps": [ ":original", "encoded", "resized" ], "bundle_steps": true }
This is also a crucial parameter for 🤖/video/adaptive, otherwise you'll generate 1 playlist for each viewing quality.
Keep in mind that all input Steps must be present in your Template. If one of them is missing (for instance it is rejected by a filter), no result is generated because the Robot waits indefinitely for all input Steps to be finished.
Here's a demo that showcases Step bundling.
-
Group by original. Sticking with the 🤖/file/compress example, you can set group_by_original to true in order to create a separate archive for each of your uploaded or imported files, instead of creating one archive containing all originals (or one per resulting file). This is important for 🤖/media/playlist, where you'd typically set:
"use": { "steps": [ "segmented" ], "bundle_steps": true, "group_by_original": true }
-
Fields. You can be more discriminatory by only using files that match a field name by setting the fields property. When this array is specified, the corresponding Step will only be executed for files submitted through one of the given field names, which correspond with the strings in the name attribute of the HTML file input field tag, for instance. When using a back-end SDK, it corresponds with myFieldName1 in e.g.: $transloadit->addFile('myFieldName1', './chameleon.jpg').
This parameter is set to true by default, meaning all fields are accepted.
Example:
"use": { "steps": [ ":original" ], "fields": [ "myFieldName1" ] }
-
Use as. Sometimes Robots take several inputs. For instance, 🤖/video/merge can create a slideshow from audio and images. You can map different Steps to the appropriate inputs.
Example:
"use": { "steps": [ { "name": "audio_encoded", "as": "audio" }, { "name": "images_resized", "as": "image" } ] }
Sometimes the ordering is important, for instance, with our concat Robots. In these cases, you can add an index that starts at 1. You can also optionally filter by the multipart field name, as in this example, where all files come from the same source (end-user uploads), but with different <input> names:
Example:
"use": {
  "steps": [
    { "name": ":original", "fields": "myFirstVideo", "as": "video_1" },
    { "name": ":original", "fields": "mySecondVideo", "as": "video_2" },
    { "name": ":original", "fields": "myThirdVideo", "as": "video_3" }
  ]
}
For times when it is not apparent where we should put the file, you can use Assembly Variables to be specific. For instance, you may want to pass a text file to 🤖/image/resize to burn the text into an image, but you are burning multiple texts, so where do we put the text file? We specify it via ${use.text_1}, to indicate the first text file that was passed.
Example:
"watermarked": {
  "robot": "/image/resize",
  "use": {
    "steps": [
      { "name": "resized", "as": "base" },
      { "name": "transcribed", "as": "text" }
    ]
  },
  "text": [
    { "text": "Hi there", "valign": "top", "align": "left" },
    {
      "text": "From the 'transcribed' Step: ${use.text_1}",
      "valign": "bottom",
      "align": "right",
      "x_offset": 16,
      "y_offset": -10
    }
  ]
}
-
output_meta
Object / Boolean ⋅ default: {}
Allows you to specify a set of metadata that is more expensive on CPU power to calculate, and thus is disabled by default to keep your Assemblies processing fast.
For images, you can add "has_transparency": true in this object to extract whether the image contains transparent parts, and "dominant_colors": true to extract an array of hexadecimal color codes from the image.
For videos, you can add "colorspace": true to extract the colorspace of the output video.
For audio, you can add "mean_volume": true to get a single value representing the mean volume of the audio file.
You can also set this to false to skip metadata extraction and speed up transcoding.
-
preset
String ⋅ default: "flash"
Generates the video according to pre-configured video presets.
If you specify your own FFmpeg parameters using the Robot's ffmpeg parameter and you have not specified a preset, then the default "flash" preset is not applied. This is to prevent you from having to override each of the flash preset's values manually.
-
width
Integer (1-1920) ⋅ default: Width of the input video
Width of the new video, in pixels.
If the value is not specified and the preset parameter is available, the preset's supplied width will be used.
-
height
Integer (1-1080) ⋅ default: Height of the input video
Height of the new video, in pixels.
If the value is not specified and the preset parameter is available, the preset's supplied height will be used.
-
resize_strategy
String ⋅ default: "pad"
If the given width/height parameters are bigger than the input image's dimensions, then the resize_strategy determines how the image will be resized to match the provided width/height. See the available resize strategies.
-
background
String ⋅ default: "00000000"
The background color of the resulting video in the "rrggbbaa" format (red, green, blue, alpha) when used with the "pad" resize strategy. The default color is black.
-
framerate
String ⋅ default: "1/5"
When merging images to generate a video, this is the input framerate. A value of "1/5" means each image is given 5 seconds before the next frame appears (the inverse of a framerate of "5"). Likewise for "1/10", "1/20", etc. A value of "5" means there are 5 frames per second.
-
image_durations
Array of Floats ⋅ default: []
When merging images to generate a video, this allows you to define how long (in seconds) each image will be shown inside of the video. So if you pass 3 images and define [2.4, 5.6, 9], the first image will be shown for 2.4s, the second image for 5.6s and the last one for 9s. The duration parameter will automatically be set to the sum of the image_durations, so 17 in our example. It can still be overwritten, though, in which case the last image will be shown until the defined duration is reached.
-
duration
Float ⋅ default: 5.0
When merging images to generate a video or when merging audio and video, this is the desired target duration in seconds. The float value can take one decimal digit. If you want all images to be displayed exactly once, then you can set the duration according to this formula: duration = numberOfImages / framerate. This also works for the inverse framerate values like 1/5.
If you set this value to null (default), then the duration of the input audio file will be used when merging images with an audio file.
When merging audio files and video files, the duration of the longest video or audio file is used by default.
-
audio_delay
Float ⋅ default: 0.0
When merging a video and an audio file, and when merging images and an audio file to generate a video, this is the desired delay in seconds for the audio file to start playing. Imagine you merge a video file without sound and an audio file, but you wish the audio to start playing after 5 seconds and not immediately; then this is the parameter to use.
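The timing arithmetic described for framerate, image_durations and duration can be sketched as follows. The function names are hypothetical and the code only mirrors the formulas stated above (duration = numberOfImages / framerate, and duration as the sum of image_durations); it is not how the Robot itself is implemented.

```python
from fractions import Fraction


def seconds_per_image(framerate: str) -> Fraction:
    """Seconds each image stays on screen for a framerate such as "1/5" or "5"."""
    return 1 / Fraction(framerate)


def duration_for_all_images(number_of_images: int, framerate: str) -> float:
    """duration = numberOfImages / framerate, rounded to one decimal digit."""
    return round(float(number_of_images / Fraction(framerate)), 1)


def duration_from_image_durations(image_durations) -> float:
    """Sum of the per-image durations, rounded to one decimal digit."""
    return round(sum(image_durations), 1)


print(seconds_per_image("1/5"))                       # 5 seconds per image
print(duration_for_all_images(3, "1/10"))             # three images at 10 s each
print(duration_from_image_durations([2.4, 5.6, 9]))   # the example from above
```

Fraction accepts both the "1/5" and the "5" spelling, which avoids parsing the string by hand.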
FFmpeg parameters
-
ffmpeg_stack
String ⋅ default: "v3.3.3"
Selects the FFmpeg stack version to use for encoding. These versions reflect real FFmpeg versions.
The current recommendation is to use "v4.3.1". Other valid values can be found here.
-
ffmpeg
Object ⋅ default: {}
A parameter object to be passed to FFmpeg. For available options, see the FFmpeg documentation. If a preset is used, the options specified are merged on top of the ones from the preset.
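One way to picture "merged on top of the ones from the preset" is a dictionary update in which your ffmpeg values win on conflicting keys. The sketch below assumes a shallow, key-by-key merge, and the preset contents shown are hypothetical; the real values of any Transloadit preset are not listed on this page.

```python
def effective_ffmpeg_options(preset_options: dict, ffmpeg: dict) -> dict:
    """Overlay user-supplied ffmpeg options on a preset's options.

    Assumption: a shallow merge where user values replace preset values
    for the same key and all other preset values are kept.
    """
    merged = dict(preset_options)
    merged.update(ffmpeg)
    return merged


# Hypothetical preset contents, for illustration only.
hypothetical_preset = {"codec:v": "libx264", "codec:a": "aac", "b:v": "1200k"}
overrides = {"b:v": "2500k"}

print(effective_ffmpeg_options(hypothetical_preset, overrides))
```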
Demos
- Add an audio track to video footage
- Merge audio into video at a specific time
- Take a scrolling screenshot of a website automatically (by using a URL)
- Encode a zooming effect onto an image
Related blog posts
- A Happy 2014 from Transloadit! January 14, 2014
- Merging Image and Audio Files to Create Videos August 7, 2013
- On Upgrades & Goodbyes August 8, 2014
- Kicking Transloadit Into Gear for the New Year February 1, 2015
- Upgrading Encoding Engines July 31, 2015
- Happy 2016 from Transloadit December 31, 2015
- Raising prices (for new customers) February 7, 2018
- Fine-tuning your video: the audio delay parameter March 12, 2019
- Tutorial: Using /video/merge to develop video slideshows June 14, 2019
- Add real-time video uploading to a site without writing code, with Bubble.is and Transloadit August 2, 2019
- Let's Build: a video from album art October 10, 2021
- Automatically generate music previews from Spotify November 16, 2021
- Let's Build: Reddit video subtitling bot February 10, 2022
- Let's Build: Music Card Generator May 5, 2022