When Split-Second Decisions Go Live to Millions
My third week at Citizen, I watched Lily make an impossible choice.
3:47 PM. Someone hit "Go Live" in Brooklyn. A robbery was happening. Right now.
3:47:03 PM. The video appeared on Lily's monitor. Shaky camera. Shouting voices. Someone being chased.
3:47:08 PM. Lily had to decide: Does this stay live for hundreds of thousands of New Yorkers, or do we cut the feed?
Five seconds. That's all she had. No preview. No context. No buffer. Just live video and the knowledge that whatever decision she made would instantly reach millions of people.
I watched her hand hover over the "BLOCK" button. She wasn't sure, and you could tell.
"This is insane," I thought. "Nobody should have to make these calls this fast."
The Impossible Job I Didn't Know Existed
Citizen's core promise was unfiltered, real-time awareness of what was happening in your city. Users could broadcast active crimes, accidents, fires—anything happening live.
But "unfiltered" created a terrifying problem:
Sometimes "real-time" meant broadcasting active violence, victims in distress, or graphic crime scenes directly to people's phones during lunch.
Every inappropriate broadcast was a potential lawsuit, regulatory violation, or PR nightmare. The legal exposure was enormous.
So Lily and her team had developed this superhuman skill: Make split-second judgments about traumatic content under massive legal and business pressure. Do this hundreds of times per shift. Never make a mistake.
And they thought this was just... part of the gig.
Watching the Psychology Break Down
I started shadowing the moderation team during live events. What I saw was brilliant people doing impossible work—and slowly burning out from the pressure.
How the system worked:
- User hits "Go Live"
- Video instantly appears on operator screens — and live in the app.
- 2-3 seconds to decide:
  - Do we block it?
  - Is this wrong to show?
  - Is this even relevant video?
  - What incident is this video for?
  - Will we get sued for showing this?
- No second chances — it's immediately live to millions, and once it's blocked it's gone forever
- No context—just raw video and pure reaction
Decision fatigue set in fast. During high-volume events, operators became afraid to approve anything borderline. A huge number of legitimate emergency broadcasts from superfans of the app who wanted to help were getting killed unnecessarily, because blocking was safer than making a huge mistake.
"I can't relax during my whole shift," one moderator told me. "Every time that a stupid video pops up, it makes my heart race."
"Every live video feels like a potential disaster," the operations manager said. "I'm just waiting for the thing that's going to get us in trouble."
They weren't just filtering videos—they were protecting users, communities, and the company from harm. But the tools gave them no time to think.
The Breakthrough: Real-Time vs. Thoughtful
Watching Lily's hand hovering nervously over that button, I had a realization:
"Real-time" to the user didn't have to mean "instantaneous" to our moderation system.
What if we could give operators the time to make thoughtful decisions while users still experienced seamless, "live" broadcasting?
The insight: What if we added a 10- or 15-second buffer that felt natural to users, like video loading, but gave operators the breathing room they desperately needed?
Users would experience: Smooth, (essentially) immediate broadcasting
Operators would experience: Time to assess context and make informed decisions
The company would experience: Dramatically reduced legal risk
Operators would get a chance to watch, think, breathe, and verify whether a broadcast seemed legit.
The key: Design the delay so that, to users, it felt like technology, not moderation.
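To make the idea concrete, here is a minimal sketch of a review buffer, assuming a chunked video pipeline. The names and the exact 15-second window are illustrative, not Citizen's production code: chunks reach operators the moment they arrive, viewers only receive chunks once they age past the buffer, and a block discards everything still held.

```typescript
// Minimal sketch of the review-buffer idea (illustrative names, assumed pipeline).
// Operators see chunks immediately; viewers only receive chunks once they age
// past the buffer window, and a block discards everything still buffered.

const BUFFER_MS = 15_000; // assumed review window

interface Segment {
  capturedAt: number; // ms timestamp when the chunk was recorded
  data: Uint8Array;   // encoded video payload
}

class BufferedBroadcast {
  private pending: Segment[] = [];
  private blocked = false;

  // Called as chunks arrive from the broadcaster; the operator feed is immediate.
  ingest(segment: Segment, notifyOperator: (s: Segment) => void): void {
    if (this.blocked) return;
    this.pending.push(segment);
    notifyOperator(segment);
  }

  // Called on a timer; releases only chunks older than the buffer window.
  releaseDue(now: number, sendToViewers: (s: Segment) => void): void {
    if (this.blocked) return; // nothing leaves the buffer after a block
    while (this.pending.length && now - this.pending[0].capturedAt >= BUFFER_MS) {
      sendToViewers(this.pending.shift()!);
    }
  }

  // Operator action: viewers never see what was still buffered.
  block(): void {
    this.blocked = true;
    this.pending = [];
  }
}
```

Users see a stream that starts a few seconds behind reality, which reads as normal video latency; operators get the entire window to decide.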
Building Tools That Support Human Judgment
I embedded with the team during actual emergencies to understand what they really needed. Most inappropriate content had predictable warning signs in the first few seconds:
- Audio cues often indicated violence before visuals
- Camera movement patterns suggested chaos or danger
- Location data could predict higher-risk situations
- User history showed patterns of problematic broadcasting
- Connection to incident details meant less hunting for context
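To make that concrete, here's a hedged sketch of how those early signals and incident details might be bundled into the context shown alongside each stream. The field names are hypothetical, not the production schema:

```typescript
// Illustrative shape of the per-stream context the grid surfaces next to each
// video card. Field names are hypothetical, not Citizen's actual schema.

interface StreamContext {
  incidentId: string;          // the incident this broadcast is attached to
  incidentType: string;        // e.g. "robbery", "structure fire"
  location: { lat: number; lng: number; label: string };
  broadcaster: {
    userId: string;
    priorBroadcasts: number;   // history helps judge reliability
    priorBlocks: number;       // past problematic broadcasts
  };
  earlySignals: {
    loudOrDistressedAudio: boolean; // audio cues often precede visible violence
    erraticCamera: boolean;         // rapid movement suggests chaos or danger
  };
}

// One line of card copy, so the operator reads context at a glance
// instead of hunting for it during the review window.
function cardSummary(ctx: StreamContext): string {
  return `${ctx.incidentType} near ${ctx.location.label} · ` +
    `${ctx.broadcaster.priorBroadcasts} prior broadcasts, ${ctx.broadcaster.priorBlocks} blocked`;
}
```

The point isn't the exact fields; it's that an operator should never have to hunt for this information during the review window.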
So I rebuilt the moderation interface around four key innovations:
Grid-Based Workflow: Instead of one-at-a-time decisions, operators could see multiple video streams simultaneously with full incident context—location, user info, and incident type—integrated directly into each video card.
Reversible Blocking System: Originally, blocking a video killed it forever—gone from the grid, no take-backs. I redesigned it so blocked videos stay visually reduced in the grid, letting operators reverse their decision if they realized they made a mistake.
Verify Button: A new approval system where operators could put their stamp on legitimate content, showing they'd actively reviewed it and confirmed it was relevant and worth broadcasting.
15-Second Buffer with Countdown: Visual indicator showing operators exactly how much buffer time remained before auto-approval. Videos automatically went live with a green checkmark if not blocked—no more panicked split-second decisions. And of course, they could still block if things went sideways — 15 seconds before users would see anything they shouldn't.
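Putting the buffer, verify, and reversible-block behaviors together, a minimal sketch of the per-video moderation state might look like this (names and details are illustrative, not the actual implementation):

```typescript
// Hedged sketch of per-video moderation state: a video starts PENDING with a
// countdown, auto-approves when the buffer expires, can be verified explicitly,
// and a block is reversible because the card stays in the grid.

type ModerationState = "PENDING" | "AUTO_APPROVED" | "VERIFIED" | "BLOCKED";

class VideoCard {
  state: ModerationState = "PENDING";

  constructor(public readonly bufferExpiresAt: number) {}

  // Remaining buffer time drives the visible countdown on the card.
  secondsRemaining(now: number): number {
    return Math.max(0, Math.ceil((this.bufferExpiresAt - now) / 1000));
  }

  // Called on each tick: untouched videos go live automatically (green checkmark).
  tick(now: number): void {
    if (this.state === "PENDING" && now >= this.bufferExpiresAt) {
      this.state = "AUTO_APPROVED";
    }
  }

  // Operator explicitly vouches for legitimate content.
  verify(): void {
    if (this.state !== "BLOCKED") this.state = "VERIFIED";
  }

  // Blocking keeps the card in the grid, visually reduced, so it can be undone.
  block(): void {
    this.state = "BLOCKED";
  }

  // Reversing a block returns the video to review (or straight to auto-approval
  // on the next tick if the buffer has already expired).
  unblock(): void {
    if (this.state === "BLOCKED") this.state = "PENDING";
  }
}
```

The unblock path matters as much as the block path: knowing a decision is reversible is what lets an operator act quickly without the old fear of a permanent mistake.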
The insight: Give operators comprehensive information, reversible decisions, and enough time to think. Uncertainty creates stress — simple context and a little control reduced both.
Building Tools That Actually Work Under Pressure
The technical challenge was straightforward but critical: Build reliable tools that don't crash during emergencies while handling hundreds of simultaneous live video streams.
What we actually built:
- Rock-solid frontend architecture that replaced the crashing prototype systems
- 15-second buffer system with visual countdown indicators
- Grid-based interface showing multiple streams with full incident context
- Reversible blocking so operators could correct mistakes
- Verify system for operators to approve legitimate content
The foundation: Rebuilt on modern, scalable architecture as part of the broader Citizen internal tools transformation—no more crashes when operators needed the tools most.
The Moment We Knew It Worked
A few months after deployment, there was a very public, very sensitive incident in the city. Someone was threatening to jump, and it was drawing massive attention—both from people on the ground and from Citizen users watching the situation unfold.
Multiple people started broadcasting live video from the scene.
Lily was monitoring the streams in our new grid interface. She could see the incident context right there with each video. The 15-second buffer gave her time to watch what was happening. Users were providing valuable real-time information—witness perspectives that wouldn't come through police radio alone.
Then the situation escalated.
Lily saw what was about to happen before users did. Thanks to the buffer, she had those crucial seconds to assess and act. She blocked the stream just in time—protecting hundreds of thousands of people from witnessing something traumatic.
The users never knew. To them, it looked like the broadcaster simply stopped streaming at a certain point. The official incident updates were delicately worded and carefully maintained, so people still understood how the situation played out and felt informed. The broadcast had provided valuable information right up until the critical moment, and our operator ended it seamlessly before it showed anything inappropriate.
Lily made the right call at exactly the right time—calm, informed, confident. No shaking hands. No panic. Just professional judgment supported by tools that actually worked.
That's when I knew we'd solved the right problem.
The Real Transformation
Months later, I asked Lily about that day with the sensitive incident. "I actually had time to think through what I was seeing," she said. "Before, I would have been terrified I'd make the wrong call in those few seconds. This time, I could watch, assess, and act when it was right."
The change wasn't dramatic or obvious. No celebration, no fanfare. Just operators doing incredibly difficult work, finally supported by tools that worked as hard as they did.
The transformation was quiet but profound: from reactive panic to thoughtful protection. From fighting broken software to focusing on their real mission—keeping users informed, and more importantly, safe during the moments that mattered most.
That's what good tools should do. Not replace human judgment, but give the people doing impossible work exactly what they need to do it well.
The Bigger Picture
This video moderation system was part of the complete emergency response suite I rebuilt at Citizen:
- Emergency incident command center - Real-time crisis coordination
- Emergency audio processing - Real-time police radio analysis
Together, these systems transformed Citizen's operations from prototype-stage experiments into production-ready tools serving nearly 500,000 users across NYC during real emergencies.
But the real victory: Taking people doing impossible work and finally giving them the tools they deserved to do it well.
Sometimes "real-time" to users means giving the people behind the scenes enough time to make the right decisions. The best technology makes impossible jobs possible—not by automating away human judgment, but by supporting it.