Using new advancements in connecting text and images, we helped a large media production company implement effective video search without the need for metadata.
Our customer is a large media production company with a platform for creating and storing content. The platform already offered ways to search for content, but they all required tags and metadata. With the release of CLIP, we wanted to implement a scalable video-search feature that can search through thousands of videos quickly.
The most challenging aspect of this project was the amount of data. Edisen is an established media production company that stores an enormous number of videos in its system. Processing everything with a complex model like CLIP requires a lot of computation, and searching relies on vector multiplication, which demands a lot of memory.
We delivered two microservices that were easy to deploy in the existing platform: one running the model and post-processing, and one serving as the search engine. To deal with the challenge of scale, we implemented dynamic sampling, storing embeddings only for representative frames, together with approximate nearest neighbor search.
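The two ideas above can be sketched in a few lines. This is an illustrative toy, not the delivered system: the arrays below stand in for CLIP embeddings, the `sample_frames` threshold is a made-up value, and brute-force cosine search stands in for the approximate nearest neighbor index used in production.

```python
import numpy as np

def normalize(v):
    # Unit-normalize rows so dot products equal cosine similarity.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def sample_frames(frame_embeddings, threshold=0.9):
    """Dynamic sampling: keep a frame only when its embedding differs
    enough from the last stored one (cosine similarity below threshold).
    Consecutive near-duplicate frames are dropped, shrinking the index."""
    kept = [0]
    for i in range(1, len(frame_embeddings)):
        sim = float(frame_embeddings[i] @ frame_embeddings[kept[-1]])
        if sim < threshold:
            kept.append(i)
    return kept

def search(query_embedding, index_embeddings, top_k=3):
    # Brute-force cosine search; at scale this is the step an
    # approximate nearest neighbor index would replace.
    scores = index_embeddings @ query_embedding
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]
```

With real CLIP embeddings, the text encoder would produce `query_embedding` and the image encoder would produce `frame_embeddings`; the search logic itself is unchanged.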
The goal of this project was to make accessible satellite data easier to work with by removing unnecessary objects and by increasing the resolution.
Our assignment was to develop an algorithm that segments satellite images into distinct areas based on vegetation without any labeled training data.
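One common way to segment without labels is to cluster a per-pixel vegetation index. The sketch below is only an illustration of that idea, assuming the imagery has near-infrared and red bands; the project's actual algorithm is not specified beyond being unsupervised, and all names here are hypothetical.

```python
import numpy as np

def ndvi(nir, red):
    # Normalized Difference Vegetation Index: high for vegetation,
    # low for bare ground and water. Epsilon avoids division by zero.
    return (nir - red) / (nir + red + 1e-8)

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Tiny 1-D k-means over per-pixel NDVI values. The clusters
    emerge from the data itself, so no labeled training data is needed."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False)
    for _ in range(iters):
        # Assign each pixel to its nearest cluster center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its assigned pixels.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels, centers
```

On real imagery one would cluster richer per-pixel features and more classes, but the label-free principle is the same.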
We helped our customer improve their solution for reading text in images, reducing the error rate from 4% to 0.2%.