source link: https://github.com/google-research/frame-interpolation
FILM: Frame Interpolation for Large Scene Motion
Project | Paper | YouTube | Benchmark Scores
TensorFlow 2 implementation of our high-quality frame interpolation neural network. We present a unified single-network approach that doesn't use additional pre-trained networks, like optical flow or depth, and yet achieves state-of-the-art results. We use a multi-scale feature extractor that shares the same convolution weights across the scales. Our model is trainable from frame triplets alone.
FILM: Frame Interpolation for Large Motion
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, Brian Curless
Google Research
Technical Report 2022.
FILM transforms near-duplicate photos into slow-motion footage that looks like it was shot with a video camera.
Web Demo
Try the interpolation model with the Replicate web demo.
Installation
- Get the Frame Interpolation source code
> git clone https://github.com/google-research/frame-interpolation frame_interpolation
- Optionally, pull the recommended Docker base image
> docker pull gcr.io/deeplearning-platform-release/tf2-gpu.2-6:latest
- Install dependencies
> pip3 install -r frame_interpolation/requirements.txt
> apt-get install ffmpeg
Pre-trained Models
- Create a directory where you can keep large files. Ideally, not in this directory.
> mkdir <pretrained_models>
- Download the pre-trained TF2 SavedModels from Google Drive and put them into <pretrained_models>.
The downloaded folder should have the following structure:
pretrained_models/
├── film_net/
│ ├── L1/
│ ├── VGG/
│ ├── Style/
├── vgg/
│ ├── imagenet-vgg-verydeep-19.mat
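Before running any of the commands, it can help to confirm that the download landed in the expected layout. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the list of expected entries is copied from the folder structure above.

```python
from pathlib import Path

# Expected entries, copied from the folder structure listed above.
EXPECTED_ENTRIES = (
    "film_net/L1",
    "film_net/VGG",
    "film_net/Style",
    "vgg/imagenet-vgg-verydeep-19.mat",
)


def missing_entries(pretrained_root):
  """Returns the expected entries that do not exist under pretrained_root.

  Hypothetical helper for illustration; it only checks that the paths exist,
  not that the model files inside them are complete.
  """
  root = Path(pretrained_root)
  return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

Calling `missing_entries("<pretrained_models>")` with your actual directory returns an empty list when all four entries are present.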
Running the Codes
The following instructions run the interpolator on the photos provided in frame_interpolation/photos.
One mid-frame interpolation
To generate an intermediate photo from the input near-duplicate photos, simply run:
> python3 -m frame_interpolation.eval.interpolator_test \
--frame1 frame_interpolation/photos/one.png \
--frame2 frame_interpolation/photos/two.png \
--model_path <pretrained_models>/film_net/Style/saved_model \
--output_frame frame_interpolation/photos/middle.png
This will produce the sub-frame at t=0.5 and save it as 'frame_interpolation/photos/middle.png'.
Many in-between frames interpolation
The command below takes in a set of directories identified by a glob (--pattern). Each directory is expected to contain at least two input frames, with each contiguous frame pair treated as an input to generate in-between frames.
> python3 -m frame_interpolation.eval.interpolator_cli \
--pattern "frame_interpolation/photos" \
--model_path <pretrained_models>/film_net/Style/saved_model \
--times_to_interpolate 6 \
--output_video
You will find the interpolated frames (including the input frames) in 'frame_interpolation/photos/interpolated_frames/', and the interpolated video at 'frame_interpolation/photos/interpolated.mp4'.
The number of frames is determined by --times_to_interpolate, which controls the number of times the frame interpolator is invoked. When the number of frames in a directory is 2, the number of output frames will be 2^times_to_interpolate+1.
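The frame count can be worked out with a few lines of Python. This is a rough sketch rather than the repository's code: both function names are hypothetical. The generalization to more than two input frames is an assumption, namely that every contiguous pair is bisected the same number of times. So is the bisection model itself, where each pass inserts one frame at the midpoint of every neighbouring pair.

```python
from fractions import Fraction


def output_frame_count(num_input_frames, times_to_interpolate):
  """Counts output frames, input frames included.

  Assumes each pass inserts one new frame between every pair of neighbouring
  frames, which gives 2^times_to_interpolate + 1 for two input frames.
  """
  if num_input_frames < 2:
    raise ValueError("At least two input frames are required.")
  return (num_input_frames - 1) * 2**times_to_interpolate + 1


def midpoint_timestamps(times_to_interpolate):
  """Lists the time positions in [0, 1] produced by repeated bisection."""
  times = [Fraction(0), Fraction(1)]
  for _ in range(times_to_interpolate):
    refined = []
    for left, right in zip(times, times[1:]):
      refined += [left, (left + right) / 2]
    times = refined + [times[-1]]
  return times


print(output_frame_count(2, 6))  # 65
# Two passes give ['0', '1/4', '1/2', '3/4', '1'].
print([str(t) for t in midpoint_timestamps(2)])
```

With --times_to_interpolate 6 and two input photos, as in the command above, this gives 65 frames.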
Datasets
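We use Vimeo-90K as our main training dataset. For quantitative evaluations, we rely on commonly used benchmark datasets, specifically Middlebury, UCF101, Vimeo-90K, and Xiph (2K and 4K), matching the evaluation configurations listed under Evaluation on Benchmarks.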
Creating a TFRecord
The training and benchmark evaluation scripts expect the frame triplets in the TFRecord storage format.
We have included scripts that encode the relevant frame triplets into a tf.train.Example data format, and export to a TFRecord file.
Run python3 -m frame_interpolation.datasets.create_<dataset_name>_tfrecord --help for more information.
For example, run the command below to create a TFRecord for the Middlebury-other dataset. Download the images and point --input_dir to the unzipped folder path.
> python3 -m frame_interpolation.datasets.create_middlebury_tfrecord \
--input_dir=<root folder of middlebury-other> \
--output_tfrecord_filepath=<output tfrecord filepath> \
--num_shards=3
The above command will output a TFRecord file with 3 shards as <output tfrecord filepath>@3.
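The `@3` suffix is shorthand for a set of shard files rather than a single file on disk. The sketch below expands such a spec into individual file names. It assumes the common `-SSSSS-of-NNNNN` shard naming pattern, which this README does not state, so check the files actually written to your output location. The helper name `shard_filenames` and the example path are hypothetical.

```python
def shard_filenames(sharded_spec):
  """Expands '<prefix>@<N>' into N names shaped like prefix-SSSSS-of-NNNNN.

  The naming pattern is an assumption made for illustration; it is not taken
  from this README.
  """
  prefix, separator, count = sharded_spec.rpartition("@")
  if not prefix or not separator or not count.isdigit() or int(count) < 1:
    raise ValueError(f"Expected '<prefix>@<num_shards>', got {sharded_spec!r}")
  num_shards = int(count)
  return [f"{prefix}-{i:05d}-of-{num_shards:05d}" for i in range(num_shards)]


# Hypothetical output path, for illustration only.
for name in shard_filenames("/tmp/middlebury.tfrecord@3"):
  print(name)
```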
Training
Below are our training gin configuration files for the different loss functions:
frame_interpolation/training/
├── config/
│ ├── film_net-L1.gin
│ ├── film_net-VGG.gin
│ ├── film_net-Style.gin
To launch a training run, simply pass the configuration filepath of the desired experiment. By default, it uses all visible GPUs for training. To debug or train on a CPU, append --mode cpu.
> python3 -m frame_interpolation.training.train \
--gin_config frame_interpolation/training/config/<config filename>.gin \
--base_folder <base folder for all training runs> \
--label <descriptive label for the run>
- When training finishes, the folder structure will look like this:
<base_folder>/
├── <label>/
│ ├── config.gin
│ ├── eval/
│ ├── train/
│ ├── saved_model/
Build a SavedModel
Optionally, to build a SavedModel from a folder of trained checkpoints, you can use this command:
> python3 -m frame_interpolation.training.build_saved_model_cli \
--base_folder <base folder of training sessions> \
--label <the name of the run>
- By default, a SavedModel is created when the training loop ends, and it will be saved at <base_folder>/<label>/saved_model.
Evaluation on Benchmarks
Below are the evaluation gin configuration files for the benchmarks we have considered:
frame_interpolation/eval/
├── config/
│ ├── middlebury.gin
│ ├── ucf101.gin
│ ├── vimeo_90K.gin
│ ├── xiph_2K.gin
│ ├── xiph_4K.gin
To run an evaluation, simply pass the configuration file of the desired evaluation dataset.
If a GPU is visible, the evaluation runs on it.
> python3 -m frame_interpolation.eval.eval_cli \
--gin_config frame_interpolation/eval/config/<eval_dataset>.gin \
--model_path <pretrained_models>/film_net/L1/saved_model
The above command will produce the PSNR and SSIM scores presented in the paper.
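For readers unfamiliar with the metric, PSNR follows from the mean squared error between a reference frame and the interpolated frame: PSNR = 10 * log10(MAX^2 / MSE). The sketch below is a plain-Python illustration of that standard formula, not the repository's evaluation code. The function name `psnr`, the flat list-of-pixels input, and the default peak value of 255 are assumptions made for the example.

```python
import math


def psnr(reference, prediction, max_value=255.0):
  """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

  Illustrative only: inputs are flat sequences of pixel intensities, and
  max_value is the largest possible intensity (255 for 8-bit images).
  """
  if len(reference) != len(prediction) or not reference:
    raise ValueError("Inputs must be non-empty and of equal length.")
  squared_errors = [(r - p) ** 2 for r, p in zip(reference, prediction)]
  mse = sum(squared_errors) / len(squared_errors)
  if mse == 0:
    return math.inf  # Identical inputs: the error is zero.
  return 10.0 * math.log10(max_value**2 / mse)


print(psnr([0, 0], [255, 255]))  # 0.0, the worst case for 8-bit pixels
```

Higher values mean the interpolated frame is closer to the reference.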
Citation
If you find this implementation useful in your work, please acknowledge it appropriately by citing:
@inproceedings{reda2022film,
title = {Frame Interpolation for Large Motion},
author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
booktitle = {arXiv},
year = {2022}
}
@misc{film-tf,
title = {Tensorflow 2 Implementation of "FILM: Frame Interpolation for Large Scene Motion"},
author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
year = {2022},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/google-research/frame-interpolation}}
}
Contact: Fitsum Reda
Acknowledgments
We would like to thank Richard Tucker, Jason Lai and David Minnen. We would also like to thank Jamie Aspinall for the imagery included in this repository.
Coding style
- 2 spaces for indentation
- 80 character line length
- PEP8 formatting
Disclaimer
This is not an officially supported Google product.