Microsoft launches “Florence” AI computer vision model, in preview

Priya Walia

Microsoft Inspire 2023

Microsoft has launched its Florence AI computer vision model, which is now available in public preview. The model was trained on billions of text-image pairs and is integrated into Azure Cognitive Service for Vision, where it is intended to help developers build reliable, cost-effective, market-ready vision applications for a range of sectors. The Vision service also enables developers to create state-of-the-art, responsible computer vision applications.

Microsoft customers can now digitize and analyze their data, connect it to conversational language interactions, and gain insights from image and video content. According to Microsoft, this helps make content more accessible, improve SEO to attract more users, protect users from potentially harmful content, bolster security, and speed up incident response.

Florence supports content discovery through automated captioning, intelligent cropping, categorization, background removal, and image search. Users can also track movement, analyze their surroundings, and receive real-time notifications, with responsible AI practices applied throughout.

Vision Studio’s new features give users an out-of-the-box experience. These include Dense Captions, which automatically generates detailed captions that can be used for design suggestions, accessible alt-text, SEO optimization, and intelligent photo curation for digital content. In addition, Image Retrieval improves search recommendations and advertisements by matching natural language queries to images based on the measured similarity between image and text.
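
For readers curious how matching text to images by similarity can work in principle, here is a minimal sketch. It assumes that both the query and the images have already been converted into numeric vectors in a shared space; the three-dimensional vectors, file names, and the function names `cosine_similarity` and `rank_images` are all hypothetical and chosen for illustration. This is not Microsoft's implementation or the Vision service API, only a common way to rank items by cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_images(query_vector, image_vectors):
    """Sort image names by similarity to the query, most similar first."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in image_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, hand-written vectors standing in for model output.
IMAGE_VECTORS = {
    "dog_on_beach.jpg": [0.9, 0.1, 0.0],
    "city_skyline.jpg": [0.1, 0.9, 0.2],
    "bowl_of_pasta.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical vector for a text query such as "a dog playing in the sand".
QUERY_VECTOR = [0.8, 0.2, 0.1]

if __name__ == "__main__":
    for name, score in rank_images(QUERY_VECTOR, IMAGE_VECTORS):
        print(f"{name}: {score:.3f}")
```

In a real system the vectors would come from a vision-language model rather than being typed by hand, and a large image collection would typically use an approximate nearest-neighbor index instead of scoring every image.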

Reddit has announced that it is integrating Vision Services to generate captions for hundreds of millions of images on its platform, a move welcomed by Tiffany Ong, Product Manager of Consumer Products at the company.

Microsoft is using Vision Services across its Microsoft 365 applications, including Teams, PowerPoint, Outlook, Word, Designer, and OneDrive, as well as in Microsoft Datacenters.

Teams uses segmentation capabilities to make virtual meetings more efficient. PowerPoint, Outlook, and Word use image captioning to generate automatic alt-text, making the applications more accessible. Designer and OneDrive have improved image tagging, search, and background generation, which makes images easier to discover and edit. Vision Services are also being used to bolster security and reliability at Microsoft Datacenters.