Investor POV

Primed for reinvention: Video intelligence industry moves to the cloud

Arjun Mehta, Apoorva Goyal, Alex Debayo-Doherty | February 27, 2025

With an estimated one billion cameras in operation globally, generating hundreds of petabytes of data daily, video intelligence systems contain a data goldmine that businesses are just beginning to harness. From massive enterprises to neighborhood storefronts, the ability to transform raw footage into actionable intelligence is reimagining how organizations safeguard people and property, manage operations, and make critical decisions.

Breakthroughs in cloud computing and AI, especially computer vision, are now bridging the gap between siloed, on-prem solutions and advanced, unified security ecosystems. The shift echoes the early days of the cloud: concerns over data privacy and reliability are giving way to the undeniable benefits of scalability, data interoperability, and real-time analytics. The result? A new wave of opportunity for those ready to capitalize on the future of video intelligence.

As computer vision systems become increasingly sophisticated at securing the built environment and augmenting corporate decision-making, the physical security ecosystem — including access control, emergency mass notifications, fire safety, visitor identity management, and more — is modernizing as well. These systems are starting to talk to each other and unify workflows in the cloud, enhancing overall security effectiveness.

In many ways, the video intelligence and video security ecosystems are where AWS was in 2007: introducing a radically new cloud service that initially faced skepticism about reliability at scale. In the years since, AWS has seen rapid adoption. Similarly, enterprises are starting to move their video data to the cloud for real-time AI analysis and deeper integration with other software and cloud-enabled hardware. This shift is unlocking previously impossible video-based workflow automation and business intelligence, and it hints that a ubiquitous video data cloud paradigm is coming.

Insight Partners believes video intelligence is on the cusp of a major cloud transformation. As enterprises migrate video workflows off on-prem systems, they’ll unlock better data interoperability with other cloud software, advanced AI computer vision, and lower-cost cloud hosting with cloud providers they already trust. This shift marks a profound reinvention, turning passive video data into powerful, scalable intelligence.

The industry

Video intelligence is an umbrella term for software and hardware that offers video-enabled visibility and analytics into a customer’s physical security and business operations. It’s a large and growing category with a $60B+ global TAM as of 2024, growing at around a 10% compound annual growth rate (CAGR). The broader physical security ecosystem has a reported TAM of over $120B, growing at a mid-single-digit CAGR.
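To make the growth rate concrete, here is a minimal sketch of how a constant CAGR compounds. It assumes the roughly $60B 2024 base and the roughly 10% rate quoted above hold steady, which is an illustrative simplification rather than a forecast from this article. The function name `project_market_size` is hypothetical.

```python
def project_market_size(base_billions: float, cagr: float, years: int) -> float:
    """Compound a starting market size forward at a constant annual growth rate."""
    return base_billions * (1 + cagr) ** years


# Illustrative inputs taken from the text: ~$60B in 2024, ~10% CAGR.
# Holding the rate constant is an assumption, not a prediction.
for year_offset in range(6):
    size = project_market_size(60.0, 0.10, year_offset)
    print(f"{2024 + year_offset}: ${size:.1f}B")
```

Under these assumptions, the category would approach $97B after five years of compounding. Real growth would vary year to year.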

As seen during COVID-19, Fortune 500 companies continued investing in video security systems without significant cuts. Video intelligence is largely recession-resistant and offers meaningful expansion opportunities, reflected in sticky, multi-year contracts with significant revenue growth potential per site, seat, or software module. Recently, cloud-native video software has empowered enterprises to secure their physical assets more cost-effectively and optimize front-of-house and back-office workflows in parallel.

Historically, on-premises (“on-prem”) deployments required cameras, servers, and storage hardware orchestrated by an on-prem client, and scaling them demanded significant capex. More recently, two new implementation methods have emerged, making video intelligence solutions more scalable and interoperable:

  • Camera-to-cloud: Edge cameras (devices that can process and analyze data on the device itself) that stream video feeds directly to the cloud.
  • Edge-to-cloud bridges: Networking devices that create secure, high-speed connections between edge computing devices and centralized cloud platforms, promoting seamless integrations between already deployed legacy camera fleets and modern cloud-based video systems.

Figure: What is video intelligence
Note: For illustrative purposes only. This slide represents a simplified depiction of a complex process.

The different stages of video intelligence implementation have evolved as follows:

On-prem recording and storage

Legacy implementation is capital-intensive, difficult to scale, and only on-prem.

Camera-to-cloud

Popularized by Verkada with proprietary cloud-enabled hardware, camera-to-cloud is more scalable, but it can require expensive hardware that isn’t always compatible with already deployed systems and cameras.

Cloud NVR (Backward compatible)

Connects existing cameras to the cloud, enabling modern software and analytics to interface with legacy hardware. Additionally, with the recent rise of vision language models running more computationally intensive inference, cloud NVRs equipped with additional computational power for VLM workloads on the edge have become attractive. Without this compute, running continuous inference on the edge has been difficult. This illustrates that sometimes bringing more compute on-site is key to unlocking more advanced analytics.

Major players driving growth and modernization

In the last few years, meaningful support from major tech players and investors has driven the category’s modernization and growth, including:

  • Amazon: Launched new hardware to help bring modern ML models to legacy hardware
  • NVIDIA: Launched AI Blueprint, a technology enabling real-time video analysis through computer vision, which allows companies to analyze video content automatically, generate reports, and monitor safety compliance. NVIDIA Metropolis offers improved dev tools and a microservice framework for Vision AI apps

How system integrators affect the industry

Part of what defines the video intelligence industry is that channel partners or system integrators (SIs), like Convergint and Securitas, are the gatekeepers to distribution. Based on our conversations with industry experts, most video intelligence system sales go through these channels, sometimes to the tune of more than 90%*.

The physical security SI ecosystem had an $8.8B global TAM in 2023, growing at a 9% CAGR. These SIs tailor security solutions to each enterprise’s specific needs, so enterprises demand adept integrators that can bridge digital and physical system deployment, especially in highly regulated end-markets like healthcare and government. As system complexity grows, SIs are offering more outsourced managed security services so enterprises can focus on their core operations and entrust physical and digital security to experts.

SIs are entrenched in the physical security ecosystem by:

  1. Building and maintaining strong relationships with chief security officers (CSOs) responsible for physical security strategy and procurement.
  2. Offering deep expertise and support for a range of physical security deployments — on-premises, hybrid-cloud, and mobile — to help customers determine the most suitable IT configuration.
  3. Maintaining enduring and sticky relationships with end customers through ongoing maintenance work and system expansion scoping.

“Most traditional corporate security folks will not listen to anyone other than their integrator… it is difficult to get a seat at the table,” one expert noted.

“This means that significant distribution advantages accrue to players with scale and pre-existing channel relationships. New entrants must fight for channel partner distribution. They have to invest in acquiring and educating both channel partners and end customers on price, product quality, and support — resulting in a ‘double customer acquisition cost (CAC).’”

Selling directly to end customers at scale is difficult. Verkada initially sold direct but, given the challenges of doing so at scale, progressively moved to a hybrid direct-and-channel model, marked by the 2023 launch of the Verkada Premier Services Network and its partnership with the large system integrator Convergint, Verkada’s National Partner of the Year in 2023.

An expert noted, “It’s about cementing those partnerships with your channel partners and integrators because customers take what they say as gospel. If integrators are confident in the product and can relay that to their customers, there’s a higher likelihood it will be taken seriously, rather than it coming directly from an OEM or a SaaS provider.”

All of this is to say: It can be crucial to gain the acceptance of the channel to unlock growth at scale.

Chronology

Recent breakthroughs in AI technology for video and image innovation create new opportunities in this sector. Before 2010, classical computer vision techniques focused on manual image feature extraction, which wasn’t scalable.

From 2014 to the present, image and video deep learning started to take hold, punctuated by the development and release of TensorFlow from Google Brain and PyTorch from Facebook AI, which allowed image and video deep learning models to move from the lab into production environments with large-scale datasets.

Figure: Deep learning innovation ready for production environments
Source: Insight Partners Market Research.

Modern image and video models can be trained to identify anything a customer wants to be detected. This new level of video event understanding enables automated workflow triggers, customizable models for each business, and large efficiency gains from process automation.

Modern Vision Language Models (VLMs) enable users to “chat” with video content. These models, often Large Language Models (LLMs) fine-tuned with domain-specific images, can discern key objects and their relationships within a camera’s field of view, allowing for more robust image and video queries beyond traditional motion detection systems and primitive classification models inspired by AlexNet and early Convolutional Neural Networks (CNNs).

The boom in computational power and scalability has enabled a wide range of computer vision use cases, categorized below into video security and video business operations:

Video security

  • Weapon detection: Automatic alerts to law enforcement
  • Intrusion alert: Unauthorized entry and exit real-time alerts
  • Access control and ID verification: Face ID, video-enabled access permissions
  • Crime prevention and investigation: Evidence collection and crime deterrence
  • Crowd monitoring: Monitor crowds for bad actors (e.g., stadiums, concerts)
  • Regulatory compliance: Legal compliance, HIPAA compliance

Video business operations

  • Workplace safety: Detect hazards and unsafe behavior
  • Retail: Task compliance, customer experience
  • Restaurants: Kitchen efficiency, health and hygiene compliance
  • Parking lot admin: License plate recognition for automatic entry/exit
  • Field service: Work feedback, back-office workflow triggers
  • Manufacturing: Defect detection
  • Agriculture: Crop yield optimization, risk ID
  • Traffic safety: Traffic enforcement and fine collection

The investment landscape

The video intelligence landscape is large, fragmented, and ripe for consolidation. We’ve categorized it into the following buckets:

  1. Video 1.0 players: Camera, compute, storage, and other hardware OEMs.
  2. System integrators: Services vendors that offer physical security system deployment, maintenance, and managed monitoring services.
  3. Video 2.0 players: Hybrid-cloud hardware and software vendors.
  4. Video 3.0 players: Cloud-native vendors offering the latest AI video event understanding technology in the cloud or on the edge.

To make the distinction clear, Video 2.0 vendors are the more established, scaled video intelligence incumbents focused on providing a software and hardware computer vision solution. They either deliver proprietary hardware bundled with their software or combine commodity hardware with their video analytics software.

Figure: Video intelligence market map
Note: For illustrative purposes only. For a complete list of Insight’s portfolio companies, please visit https://www.insightpartners.com/portfolio/

Video 3.0 vendors are next-gen, AI-native disruptors that use specific hardware modalities or state-of-the-art (SOTA) computer vision AI to bring more intelligence to video systems.

Video 2.0

Video 2.0 vendors have scale and distribution advantages due to existing relationships with channel partners. They can operate both on-premises and in the cloud, allowing enterprise customers to transition their video analytics workloads to the cloud gradually.

While Video 2.0 vendors sometimes require proprietary hardware and software bundles at a high price point, most leverage the ONVIF security standard, which defines how IP-based security products communicate. This allows newer video management software and cameras to interface with already deployed older cameras, making good on end customers’ existing capex investments while still allowing them to modernize their core systems.

The Video 2.0 players offer a wide range of camera modalities from mobile deployment like LVT to proprietary hardware suites like Verkada to third-party hardware like Stealth Monitoring. On the software side, they enable remote access to live or archived video footage in hybrid-cloud environments and video analytics in their software suite.

Established Video 2.0 vendors offer attractive investment opportunities by leveraging their strong distribution advantages and long-standing channel partnerships. They now benefit from increasingly scalable system architectures — like camera-to-cloud hardware and edge-to-cloud bridges — allowing for more seamless camera fleet expansion at existing or new sites.

These vendors will be the most suitable platform anchor assets — meaning they are the core foundation for future acquisitions and growth. Given their scale and distribution advantages, they are empowered to acquire or develop new AI video analytics functionality as customers demand more intelligent systems as AI penetrates the enterprise.

Video 3.0

Video 3.0 vendors share many benefits with Video 2.0 vendors but also generally offer the latest AI video event understanding tech to differentiate themselves from Video 2.0 vendors.

The Video 3.0 ecosystem has exploded with new entrants in the last few years due to the generality and impressive performance of AI models in the visual domain. As previously mentioned, Video 3.0 vendors face challenges in gaining acceptance from channel partners for distribution. They must endure the aforementioned “double CAC” when educating channel partners and end customers on their products. Additionally, it’s hard to differentiate, build brand awareness, and activate customers when many new entrants may have a similar pitch.

Across the Video 3.0 segment of the video intelligence ecosystem, we see three approaches: horizontal vendors, vertical-specific vendors, and tooling vendors.

For vertical-specific vendors, truck and rail yards, aviation, traffic safety, restaurants and retail, construction, manufacturing, parking payments, agriculture, logistics, and field services represent some of the most significant opportunities. Each of these categories requires end-to-end video security functionality, vertical-specific workflows built into the platform to enable downstream workflow triggers (e.g., when you see this license plate, update the yard inventory accordingly), compliance and reporting, and project management tasks.
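The license plate example can be sketched as a simple event handler. This is a minimal illustration under stated assumptions, not any vendor’s implementation: `PlateEvent` and `YardInventory` are hypothetical names, the plate values are made up, and the in-memory dictionary stands in for a real yard management system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PlateEvent:
    """Hypothetical detection event emitted by a video analytics system."""
    plate: str
    gate: str  # "entry" or "exit"
    seen_at: datetime


@dataclass
class YardInventory:
    """In-memory stand-in for a yard management system (an assumption)."""
    on_site: dict[str, datetime] = field(default_factory=dict)

    def handle(self, event: PlateEvent) -> None:
        # Entry adds the vehicle to inventory; exit removes it if present.
        if event.gate == "entry":
            self.on_site[event.plate] = event.seen_at
        elif event.gate == "exit":
            self.on_site.pop(event.plate, None)


inventory = YardInventory()
now = datetime.now(timezone.utc)
inventory.handle(PlateEvent("ABC1234", "entry", now))
inventory.handle(PlateEvent("XYZ9876", "entry", now))
inventory.handle(PlateEvent("ABC1234", "exit", now))
print(sorted(inventory.on_site))  # ['XYZ9876']
```

The point of the sketch is that the video system only needs to emit structured events. The vertical-specific value comes from what the downstream handler does with them.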

Additionally, a horizontal Video 3.0 player with versatile vision-language model (VLM) technology that adapts well across verticals could be a promising growth investment. Ideally, in the early days, they have a capital-efficient direct sales motion to get to end customers quickly to demonstrate product ROI and superior tech quality compared to incumbent vendors. As they reach scale, they’ll need to build channel relationships, but ideally, their brand reputation and product quality ease that transition.

These players can also capitalize on the more powerful next-generation compute resources and cameras that can run at the edge to do advanced AI analytics of real-time video footage (Intenseye, Coram AI), enabling exciting new use cases in emergency response, crowd monitoring, facility supervision, and environment, health, and safety (EHS).

There is a lot of opportunity at the tooling layer to augment existing video software. Some emerging trends include no-code computer vision model builders, video semantic search, synthetic data platforms for vision model post-training, and video data integration into end-to-end threat visibility platforms that unify video intelligence, access control, cybersecurity, and more. This means quicker deployment, easier customization, and richer insights across security and operations, paving the way for a more tangible business impact from video software.

The opportunity

Several company archetypes are already exploring these opportunities.

Video 2.0 consolidation anchors

Video 2.0 consolidation anchors offer hardware and software in a solution bundle and typically have scaled revenue, a diverse geographic footprint, and a clear path to achieving a high blended gross margin from economies of scale. They tend to operate at or near cash flow profitability and serve customers across multiple verticals. Their hybrid-cloud approach supports enterprises that have outstanding capex commitments and want to progressively move video analytics workloads from on-prem to the cloud. They are well-positioned for M&A to expand vertically and geographically, re-platforming their acquisitions to their VMS and analytics suite in the process.

Vertical integration of Video 2.0 players

Vertical-specific Video 2.0 platforms often pair video intelligence software and hardware with a focus on expanding into vertical software adjacencies to do more for the customer in their key end markets. These companies can easily branch into complementary verticals once they have sufficient scale in the core business — like moving from retail to quick service restaurants (QSRs) — and generally use M&A for product expansion. This integrated approach accelerates both market penetration and product depth within their chosen domains.

High-growth, differentiated, vertical-specific Video 3.0 players

Emerging, high-growth Video 3.0 players offer AI-native, vertical-specific computer vision platforms with some combination of differentiated camera modalities (Aquabyte) and vertical-specific software adjacencies woven into their product suite, positioning them as breakout leaders in their vertical niche (e.g., Focal Systems in retail, Obvio in traffic safety).

High-growth horizontal Video 3.0 players

Horizontal Video 3.0 players offer flexible vision-language model (VLM) technology that generalizes effectively across verticals. Ideally, they go to market with a capital-efficient direct sales motion with capex paid for upfront to allow them to efficiently reach end customers, demonstrate clear product ROI, and showcase their technological superiority over incumbents.

Picks-and-shovels tooling vendors

Finally, we see an opportunity in the tooling layer that enhances existing video software. These players enhance computer vision systems by introducing video semantic search (Prompt AI), synthetic data for training (Tonic), and no-code model builders (Landing AI, Viso AI).

Alongside the company profiles mentioned, the following Insight portfolio companies are innovating in video intelligence application layer software and infrastructure, as well as in visitor management and access control — areas intimately connected to the video intelligence ecosystem.

Intenseye provides workplace safety management solutions powered by AI-driven video analytics. Their platform enables organizations to proactively identify, monitor, and mitigate workplace hazards by analyzing real-time video footage from existing CCTV systems. Intenseye automates the detection of unsafe behaviors, compliance violations, and environmental risks, helping teams enforce safety protocols, reduce incident rates, support compliance, and promote continuous improvement in workplace safety.

iLobby offers a comprehensive facility and visitor management platform, FacilityOS, designed to enhance safety, security, and compliance within complex enterprises. The platform streamlines visitor registration, contractor credentialing, and emergency evacuations through automation and real-time monitoring. By integrating with existing business tools and video security systems, iLobby enables organizations to efficiently manage facility access, track deliveries, and maintain regulatory compliance.

Landing AI specializes in artificial intelligence solutions, focusing on computer vision applications for industrial sectors. Their flagship platform, LandingLens, enables manufacturers to develop, deploy, and scale AI-powered visual inspection systems, enhancing defect detection and quality control processes. Founded in 2017 by AI pioneer Andrew Ng, the company serves biotech, pharmaceuticals, automotive, agriculture, electronics, and general manufacturing industries.

RapidSOS provides an intelligent safety platform that securely links life-saving data from over 500 million connected devices directly to 911 and first responders. Their solutions enhance emergency response by delivering critical information — such as real-time location, health profiles, and incident specifics — to emergency services during crises.

RapidSOS aims to improve situational awareness and response times by partnering with technology companies and public safety agencies, ultimately saving lives. RapidSOS does not directly employ computer vision tech, but it collaborates with partners like KamiCare and Local Security, which use computer vision to enhance emergency response services.

Tonic.ai provides a data synthesis platform that enables developers to create realistic, de-identified datasets for software development, testing, and AI model training. Tonic allows teams to work with high-fidelity data without compromising privacy or compliance. They specialize in generating synthetic data for structured and unstructured datasets, including text and images, which can support computer vision model training and fine-tuning efforts.

The considerations

Here are Insight’s key considerations for founders aiming to differentiate their business in the video intelligence space.

Leverage distribution relationships

Build strong ties with system integrators and form specific hardware partnerships, an advantage that typically benefits incumbents but can help accelerate GTM velocity for new entrants who have a better product and can gain the channel’s trust.

Differentiate by hardware modality

Offer unique hardware solutions (e.g., LVT’s portable security systems or Aquabyte’s aquaculture computer vision for fish farmers). Vendors can offer camera hardware configurations to capture video footage in ways a generic video security vendor couldn’t. Additionally, leverage edge compute resources to bring more compute power on-site to do more real-time AI analytics for your customer than a typical cloud-connected camera or VMS can handle.

Specialize by segment

Provide specific, industry-tailored computer vision software modules to stand out from horizontal video intelligence vendors via R&D build-out or M&A. Vertical players often expand into new segments over time, but if you are trying to scale the core business in a vertical, doing more for the customer by offering adjacent vertically focused software modules will help win deals and retain customers longer.

M&A considerations

The video intelligence ecosystem is a massive and fragmented category that presents compelling consolidation opportunities via vertical integration, vertical expansion, and AI-enabled computer vision software M&A.

  • Vertical software integration: Integrate with physical security, compliance, location-based marketing, visitor management, payments, HR tech, construction, and logistics software.
  • Specialty vendor tuck-ins: Use acquisitions to scale channel partner distribution and expand camera footprint.
  • AI-focused M&A: Pursue M&A in no-code computer vision model builders, video semantic search, and improved video event understanding for vertical-specific vision models.

The resources

If you want to learn more, here are some resources our team recommends:

  • AI Physec Today Podcast: A podcast covering the latest advancements, challenges, and use cases of AI in physical security and surveillance.
  • Intel, What is Computer Vision?: A high-level introduction to computer vision, explaining its core principles, applications, and how AI enhances its capabilities.
  • IBM, What is Computer Vision?: Breaks down the fundamentals of computer vision, highlighting IBM’s innovations and real-world use cases across industries.

Editor’s note: Insight Partners has invested in Intenseye, iLobby, Landing AI, RapidSOS, and Tonic.ai.

*Quotes in this article have been sourced from industry conversations and given with permission.