Analyze speech for text XMP metadata

Adobe Premiere Pro and Soundbooth analyze spoken words and generate text metadata. You can edit and search text metadata as you would any other metadata property. You can then navigate to the times at which specific words are spoken, to better align edits, advertising, and subtitles.

For more information, see the video tutorial Using Speech Search to Speed Editing.

Note: Useful results from speech analysis require good audio quality. Background noise significantly reduces accuracy. To remove such noise, use the tools and processes in Soundbooth.

Analyze speech to create text metadata

  1. Select a file or clip.

  2. At the bottom of the Metadata panel, click Analyze Speech, or Analyze (Adobe Premiere Pro).

  3. Set the Language and Quality options, and select Identify Speakers if you want to create separate speech metadata for each person.
    Note: Speech Search can use any of several language-specific and dialect-specific libraries, such as libraries for Spanish and UK English.
  4. Click OK.

    The spoken words appear in the Speech Analysis section.

  5. To retain the speech metadata, save the project.

If you import files with speech metadata into After Effects, each word appears as a layer marker on layers based on these footage items.

Navigate to a specific word in speech metadata

  1. In the Speech Analysis section, select the word.

    Timecode In and Duration indicate the precise location and length of your selection.

  2. To hear the selection, click either Play or Loop Playback. (The latter option repeatedly plays the selected word, with some preroll and postroll.)
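
For readers who work with an exported transcript outside the application, the lookup described above can be sketched in Python. This is only an illustration: the SpokenWord record, the find_word and to_timecode helpers, and the 25 fps frame rate are all hypothetical and do not reflect how Adobe Premiere Pro or Soundbooth store speech metadata internally.

```python
from dataclasses import dataclass


@dataclass
class SpokenWord:
    """Hypothetical record for one analyzed word (not Adobe's format)."""
    text: str
    start: float      # seconds from the start of the clip (Timecode In)
    duration: float   # length of the spoken word in seconds


def find_word(words, query):
    """Return every entry whose text matches query, ignoring case and punctuation."""
    target = query.strip().lower()
    return [w for w in words if w.text.strip(".,!?;:\"'").lower() == target]


def to_timecode(seconds, fps=25):
    """Format seconds as HH:MM:SS:FF at an assumed non-drop frame rate."""
    total_frames = round(seconds * fps)
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours = total_seconds // 3600
    minutes = total_seconds % 3600 // 60
    secs = total_seconds % 60
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"
```

Calling find_word with a list of such records and then passing each match's start value to to_timecode gives the points in the clip where the word is spoken.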

Edit speech metadata

 In the Speech Analysis section, do any of the following:
  • To correct a word, click it, and type.

  • To insert, delete, merge, cut, or copy words, right-click an existing word, and choose a command from the context menu.

Copy text from speech metadata to the clipboard for use in a text editor

 Right-click the transcript, and choose Copy All.

Improve speech analysis with reference scripts

The accuracy of speech-to-text conversion depends on the clarity of the spoken words and the quality of the recorded dialog. Dialog recorded in a noisy environment or with poor microphone placement cannot produce highly accurate results, even with a reference script. You can nevertheless use a reference script to improve speech analysis. A reference script is a text document containing dialog similar to the dialog recorded in your assets.

There are two types of reference scripts:
  • A script that contains similar dialog, but was not necessarily written for the current project. For example, a series of medical training scripts for different products can be combined into a single text document. You can use this text document as a reference script. With this type of reference script, speech analysis produces results more accurate than it does when using only the default language models.

  • A script that matches the recorded dialog. This type of reference script provides the highest accuracy possible. For example, you can use the script that the talent read during the shoot as a reference script. Alternatively, you could use a transcript typed from the assets for the purposes of closed captioning.

Speech Analysis supports reference scripts only in the UTF-8 encoded text format, including Adobe Story scripts, which have the .astx filename extension.
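
If your scripts are spread over several plain-text files, or were saved in an encoding other than UTF-8, you can prepare a single reference script outside the application. The following Python sketch shows one possible approach. The function name build_reference_script and the default source encoding (cp1252) are assumptions made for illustration; set the encoding to whatever your files actually use.

```python
from pathlib import Path


def build_reference_script(sources, destination, source_encoding="cp1252"):
    """Concatenate text files and write the result as one UTF-8 file.

    source_encoding is an assumption about how the input files were saved;
    change it to match your own files.
    """
    parts = []
    for path in sources:
        # Decode each file using the assumed source encoding.
        text = Path(path).read_text(encoding=source_encoding)
        parts.append(text.strip())
    # Separate the individual scripts with a blank line.
    combined = "\n\n".join(parts) + "\n"
    Path(destination).write_text(combined, encoding="utf-8")
    return combined
```

The resulting file can then be added as a reference script like any other UTF-8 text file.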

Note: To make reference scripts available in Soundbooth, first complete the steps below in Adobe Premiere Pro.
  1. From the Reference Script menu in the Analyze Content dialog box, choose Add.

  2. In the displayed dialog box, browse to the reference script text or .astx file, select it, and click OK.

  3. In the Import Script dialog box, type a name for the reference script, and select the language of the script.

    Note: You can view the text of the file in a scrolling window.
  4. Select Script Text Matches Dialog only if the imported script covers the recorded dialog verbatim. For example, if the reference script is the script from which the talent read their lines, select Script Text Matches Dialog. Select this option even if the recorded dialog is shorter than what the script file covers.

  5. Click OK.

    The Import Script dialog box closes, and the reference script is selected in the Reference Script menu.

  6. Click OK.

Improve speech analysis with Adobe Story, OnLocation, and Adobe Premiere Pro

You can use Adobe Story, OnLocation, and Adobe Premiere Pro to create the most accurate speech analysis. Import a script written in Adobe Story into OnLocation. OnLocation produces a list of shot placeholders for each scene. Either record these shots using OnLocation during production, or link the placeholder shots to their respective video files when you import the video files into OnLocation. In either case, OnLocation embeds the text for each shot from the original script into the metadata of the shot.

When you import the clips into Adobe Premiere Pro, it automatically uses the Adobe Story script as a reference script. When Adobe Premiere Pro finds enough matches with the embedded script, Adobe Premiere Pro replaces the analyzed speech text with the embedded script text. Adobe Premiere Pro carries over correct spelling, proper names, and punctuation from the reference script, benefits that standard speech analysis cannot provide.

The closeness of the match between the embedded script text and the recorded dialog determines the accuracy of matched-script text. If 100% accuracy is important, edit and revise the script text first. Ensure that the script matches the recorded dialog before using it as a reference script.
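
Before relying on a script, you may want a rough idea of how closely it matches a transcript of the recorded dialog. The Python sketch below computes a word-level similarity using the standard difflib module. It is not the method Adobe Premiere Pro uses to match scripts; the helper names normalize and match_ratio are hypothetical, and the score is only an informal indicator.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[\w']+", text.lower())


def match_ratio(script_text, transcript_text):
    """Return a word-level similarity between 0.0 and 1.0."""
    script_words = normalize(script_text)
    spoken_words = normalize(transcript_text)
    # autojunk=False keeps frequent words from being ignored in long texts.
    matcher = SequenceMatcher(None, script_words, spoken_words, autojunk=False)
    return matcher.ratio()
```

A value close to 1.0 suggests that the script follows the dialog nearly word for word; a lower value suggests that the script needs revision before you use it as a matching reference script.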