The latest advancements from Google in generative AI are transforming how we approach video creation. With the release of Veo 3.1, a groundbreaking feature has emerged: the ability to interpolate a video from only its first and last frames. When combined with the image generation capabilities of the Gemini 2.5 Flash Image model, affectionately known as ‘Nano Banana’, developers can now craft compelling short videos from a sequence of static images. This interpolation is activated by the lastFrame parameter, which is supported only by the Veo 3.1 model. This article walks through a practical implementation: the Gemini 2.5 Flash Image model generates a visual story as a series of images, and Veo 3.1 then uses the first and last of those images to produce dynamic video content.

Configuring Veo Model Usage

To ensure that this feature is used only when the Veo 3.1 model is active, a robust configuration mechanism is essential. This implementation introduces a configuration flag, is_veo31_used, set to true to signal that the latest Veo model is available. The flag is injected into the application and governs whether the Gemini API call receives the arguments required for interpolation, specifically the lastFrame parameter.

The configuration is managed through a firebase-ai.json file (other values omitted for brevity):

// firebase-ai.json
{
  "is_veo31_used": true
}

This value is then exposed within the Angular application through an InjectionToken, which makes the flag available globally.

// gemini provider

import { InjectionToken, makeEnvironmentProviders } from '@angular/core';
import firebaseConfig from '../../firebase-ai.json';

// Token that exposes the is_veo31_used flag to the rest of the application.
export const IS_VEO31_USED = new InjectionToken<boolean>('IS_VEO31_USED');

export function provideGemini() {
  return makeEnvironmentProviders([
    {
      provide: IS_VEO31_USED,
      useValue: firebaseConfig.is_veo31_used,
    },
  ]);
}

This pattern ensures that the application accurately identifies which Veo model it should interact with, maintaining backward compatibility and feature-specific activation. The provideGemini function is integrated into the application’s main configuration:

export const appConfig: ApplicationConfig = {
  providers: [
    provideGemini(),
  ]
};

This guarantees that the value behind the IS_VEO31_USED token is available throughout the application lifecycle.
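Any class created in an injection context can then read the flag through the token. Here is a minimal consumer sketch (the service name and import path are hypothetical):

// example consumer (hypothetical)

import { inject, Injectable } from '@angular/core';
import { IS_VEO31_USED } from './providers/gemini.provider';

@Injectable({ providedIn: 'root' })
export class SomeFeatureService {
  // Resolves to the is_veo31_used value from firebase-ai.json.
  readonly isVeo31Used = inject(IS_VEO31_USED);
}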

Crafting Visual Stories with Nano Banana

Defining Step Prompts in the Visual Story Service

At the heart of generating a visual story lies the VisualStoryService. This service dynamically constructs a sequence of prompts, each tailored to generate a specific image in a multi-step narrative. It takes user input and generation arguments (like the number of images, style, and transition) to build descriptive prompts for the Gemini 2.5 Flash Image model.
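The article does not show the full shape of VisualStoryGenerateArgs; inferred from how the service uses it, a plausible definition looks like this:

// visual story types (inferred shape, not the article's exact definition)

export interface VisualStoryGenerateArgs {
  userPrompt: string;
  args: {
    numberOfImages: number;
    style: string;       // e.g. 'cinematic'
    transition: string;  // e.g. 'liquid dissolving'
    type: 'story' | 'process' | string;
  };
}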

// Visual Story Service

@Injectable({
  providedIn: 'root'
})
export class VisualStoryService {
  buildStepPrompts(genArgs: VisualStoryGenerateArgs): string[] {
    const { userPrompt, args } = genArgs;
    const currentPrompt = userPrompt.trim();

    if (!currentPrompt) {
      return [];
    }

    const stepPrompts: string[] = [];

    for (let i = 0; i < args.numberOfImages; i++) {
      const storyPrompt = this.buildStoryPrompt({ userPrompt: currentPrompt, args }, i + 1);
      stepPrompts.push(storyPrompt);
    }

    return stepPrompts;
  }

  private buildStoryPrompt(genArgs: VisualStoryGenerateArgs, stepNumber: number): string {
    const { userPrompt, args } = genArgs;
    const { numberOfImages, style, transition, type } = args;
    let fullPrompt = `${userPrompt}, step ${stepNumber} of ${numberOfImages}`;

    // Add context based on type
    switch (type) {
      case 'story':
        fullPrompt += `, narrative sequence, ${style} art style`;
        break;
      case 'process':
        fullPrompt += `, procedural step, instructional illustration`;
        break;
      // ... other visual story types
    }

    if (stepNumber > 1) {
      fullPrompt += `, ${transition} transition from previous step`;
    }

    return fullPrompt;
  }
}

For example, a user prompt like “A wizard making a potion” for a 3-step story in a cinematic art style with a ‘liquid dissolving’ transition would yield prompts such as:

// Example: userPrompt="A wizard making a potion", args = { numberOfImages: 3, type: 'story', style: 'cinematic', transition: 'liquid dissolving' }

Step 1 Prompt: "A wizard making a potion, step 1 of 3, narrative sequence, cinematic art style"
Step 2 Prompt: "A wizard making a potion, step 2 of 3, narrative sequence, cinematic art style, liquid dissolving transition from previous step"
Step 3 Prompt: "A wizard making a potion, step 3 of 3, narrative sequence, cinematic art style, liquid dissolving transition from previous step"

These detailed prompts guide Nano Banana to generate a cohesive visual narrative.
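For completeness, here is how buildStepPrompts could be invoked to reproduce the prompts above (the argument values are illustrative):

const visualStoryService = inject(VisualStoryService);
const prompts = visualStoryService.buildStepPrompts({
  userPrompt: 'A wizard making a potion',
  args: {
    numberOfImages: 3,
    style: 'cinematic',
    transition: 'liquid dissolving',
    type: 'story',
  },
});
// prompts[0] === 'A wizard making a potion, step 1 of 3, narrative sequence, cinematic art style'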

Video Interpolation with Veo 3.1

The GenMediaService handles the crucial task of video generation, particularly leveraging Veo 3.1’s interpolation capabilities. The generateVideoFromFrames method orchestrates the call to the Gemini API, providing the first and last images from the generated sequence. The presence of the lastFrame parameter within the configuration object is what activates Veo 3.1’s unique interpolation feature.
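The request type is not defined in the article; based on how it is consumed below, it presumably resembles:

// request shape (inferred)

export interface GenerateVideoFromFramesRequest {
  prompt: string;
  imageBytes: string;          // base64-encoded first frame
  mimeType: string;
  lastFrameImageBytes: string; // base64-encoded last frame
  lastFrameMimeType: string;
  isVeo31Used?: boolean;
}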

@Injectable({
  providedIn: 'root'
})
export class GenMediaService {
  private readonly geminiService = inject(GeminiService);

  async generateVideoFromFrames(imageParams: GenerateVideoFromFramesRequest) {
    const isVeo31Used = imageParams.isVeo31Used || false;

    // The lastFrame entry in the config is what activates Veo 3.1 interpolation.
    const loadVideoPromise = isVeo31Used
      ? this.geminiService.generateVideo({
          prompt: imageParams.prompt,
          imageBytes: imageParams.imageBytes,
          mimeType: imageParams.mimeType,
          config: {
            aspectRatio: '16:9',
            resolution: '720p',
            lastFrame: {
              imageBytes: imageParams.lastFrameImageBytes,
              mimeType: imageParams.lastFrameMimeType,
            },
          },
        })
      : this.getFallbackVideoUrl(imageParams);

    return await loadVideoPromise;
  }

  // Fallback for older Veo models: first frame only, no resolution option.
  private async getFallbackVideoUrl(imageParams: GenerateVideoFromFramesRequest) {
    return this.geminiService.generateVideo({
      prompt: imageParams.prompt,
      imageBytes: imageParams.imageBytes,
      mimeType: imageParams.mimeType,
      config: {
        aspectRatio: '16:9',
      },
    });
  }
}

When isVeo31Used is false, a fallback mechanism is engaged via getFallbackVideoUrl. This method generates a video using only the first image and a specified aspect ratio, omitting the resolution property for compatibility with older Veo models that may not support it. The VisualStoryService acts as an intermediary, delegating the video generation request:

// Visual Story Service

interpolateVideo(request: GenerateVideoFromFramesRequest): Promise<VideoResponse> {
  return this.genMediaService.generateVideoFromFrames(request);
}
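The GeminiService.generateVideo method itself is not shown in the article. As a rough sketch only, here is what it might look like with the public @google/genai SDK; the model id, polling interval, and response handling are assumptions, and the article's actual implementation (which appears to go through Firebase AI) may differ:

// gemini service (hypothetical sketch, not the article's implementation)

import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env['GEMINI_API_KEY'] });

export async function generateVideo(request: {
  prompt: string;
  imageBytes: string;
  mimeType: string;
  config: Record<string, unknown>;
}) {
  // Kick off the long-running video generation job.
  let operation = await ai.models.generateVideos({
    model: 'veo-3.1-generate-preview', // assumed model id
    prompt: request.prompt,
    image: { imageBytes: request.imageBytes, mimeType: request.mimeType },
    config: request.config,
  });

  // Video generation can take minutes; poll until the operation completes.
  while (!operation.done) {
    await new Promise((resolve) => setTimeout(resolve, 10_000));
    operation = await ai.operations.getVideosOperation({ operation });
  }

  // The completed operation carries a reference to the generated video.
  return { videoUrl: operation.response?.generatedVideos?.[0]?.video?.uri };
}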

The Video Interpolation Component

An Angular component, app-visual-story-video, serves as the user interface for initiating and displaying the interpolated video.

<app-visual-story-video
  [userPrompt]="promptArgs().userPrompt"
  [images]="genmedia()?.images()"
/>

This component is designed to conditionally enable the video interpolation feature.

@Component({
  selector: 'app-visual-story-video',
  imports: [/* ... imported components ... */],
  template: `
    @if (canGenerateVideoFromFirstLastFrames()) {
      <button type="button" (click)="generateVideoFromFrames()">
          Interpolate video
      </button>

      @let videoUrl = videoResponse()?.videoUrl;
      @if (isLoading()) {
        <app-loader />
      } @else if (videoUrl) {
        <app-video-player class="block" [videoUrl]="videoUrl" />
      }
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export default class VisualStoryVideoComponent {
  private readonly visualStoryService = inject(VisualStoryService);
  private readonly isVeo31Used = inject(IS_VEO31_USED);

  images = input<ImageResponse[] | undefined>(undefined);
  userPrompt = input.required<string>();

  isLoading = signal(false);
  videoResponse = signal<VideoResponse | undefined>(undefined);

  firstImage = computed(() => this.images()?.[0]);
  lastImage = computed(() => {
    const numImages = this.images()?.length || 0;
    return numImages < 2 ? undefined : this.images()?.[numImages - 1];
  });

  canGenerateVideoFromFirstLastFrames = computed(() => {
    const hasFirstImage = !!this.firstImage()?.data && !!this.firstImage()?.mimeType;
    const hasLastImage = !!this.lastImage()?.data && !!this.lastImage()?.mimeType;
    return this.isVeo31Used && hasFirstImage && hasLastImage;
  });

  async generateVideoFromFrames(): Promise<void> {
    try {
      this.isLoading.set(true);
      this.videoResponse.set(undefined);

      if (!this.canGenerateVideoFromFirstLastFrames()) {
        return;
      }

      const { data: firstImageData, mimeType: firstImageMimeType } = this.firstImage() || { data: '', mimeType: '' };
      const { data: lastImageData, mimeType: lastImageMimeType } = this.lastImage() || { data: '', mimeType: '' };
      const result = await this.visualStoryService.interpolateVideo({
        prompt: this.userPrompt(),
        imageBytes: firstImageData,
        mimeType: firstImageMimeType,
        lastFrameImageBytes: lastImageData,
        lastFrameMimeType: lastImageMimeType,
        isVeo31Used: this.isVeo31Used
      });
      this.videoResponse.set(result);
    } finally {
      this.isLoading.set(false);
    }
  }
}

The canGenerateVideoFromFirstLastFrames computed signal determines when the “Interpolate video” button should be visible, requiring both the first and last images to be present and the isVeo31Used flag to be true. Upon successful video generation, the videoResponse signal is updated, triggering the app-video-player component to display the newly created dynamic content.
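The ImageResponse and VideoResponse types are likewise not spelled out; judging by their usage, they presumably resemble:

// response shapes (inferred)

export interface ImageResponse {
  data: string;     // base64-encoded image bytes
  mimeType: string; // e.g. 'image/png'
}

export interface VideoResponse {
  videoUrl: string; // playable URL of the generated video
}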
