How Interview Copilots Work

What it actually takes to build an invisible AI interview overlay — capture, transcribe, route, render — explained from inside Cloak's codebase.

Four jobs, four hard problems

Strip away the marketing and an interview copilot is four pipelines that have to run in lockstep with sub-second latency: capture, transcribe, route, and render. Each one has a non-obvious failure mode.

1. Capture

You need both sides of the call. On macOS that means:

  • System audio: what the interviewer is saying, coming out of your speakers or headphones. Captured via a CoreAudio tap (macOS 14.4+) or ScreenCaptureKit audio.
  • Microphone: your own voice, for context and for telling when the interviewer is speaking and when you are.
  • Screen: the editor / question / slides currently visible.

Failure mode: Electron-based overlays often go through web audio APIs that fight with the meeting tool. Cloak uses the Rust cpal crate for microphone and a native CoreAudio tap for system audio, so neither competes with Zoom for the device.
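To make the two-stream idea concrete, here is a minimal sketch of how segments from the two sources could be folded into one speaker-labelled timeline. It is not Cloak's code: `Segment` and `merge_streams` are hypothetical names, and it assumes each stream already arrives as time-sorted `(start_seconds, text)` pairs.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float  # seconds since capture began
    speaker: str  # "interviewer" (system audio) or "you" (microphone)
    text: str


def merge_streams(system_audio, microphone):
    """Interleave two time-sorted streams into one speaker-labelled timeline."""
    from_system = [Segment(t, "interviewer", s) for t, s in system_audio]
    from_mic = [Segment(t, "you", s) for t, s in microphone]
    # heapq.merge keeps the result ordered by start time without a full re-sort.
    return list(heapq.merge(from_system, from_mic, key=lambda seg: seg.start))


timeline = merge_streams(
    [(0.0, "Tell me about a time you disagreed with a teammate."),
     (9.5, "What happened next?")],
    [(4.2, "Sure, on my last project...")],
)
for seg in timeline:
    print(f"{seg.start:5.1f}s {seg.speaker}: {seg.text}")
```

Labelling by source rather than by voice is the design choice here: whatever comes out of the speakers is treated as the interviewer, whatever goes into the microphone is treated as you.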

2. Transcribe

Latency budget for live transcription is ~250 ms per segment. Anything slower and the answer arrives after you've started talking.

Real-world choices:

  • OpenAI-hosted Whisper: accurate, ~400–600 ms latency on short clips. Default in Cloak.
  • Groq-hosted Whisper: same accuracy, ~150 ms latency. Best option when available.
  • ElevenLabs Scribe: strong diarization, ~250 ms.
  • Local whisper.cpp: slower (1–2 s) but offline.
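The trade-off in that list can be written down as a small lookup. This is an illustrative sketch only: the `SttBackend` type, the backend identifiers, and `pick_backends` are made up for the example, and the latency numbers are simply the upper ends of the rough figures quoted above, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SttBackend:
    name: str
    worst_case_ms: int  # upper end of the latency range quoted in the list
    offline: bool


BACKENDS = [
    SttBackend("openai-whisper", 600, offline=False),
    SttBackend("groq-whisper", 150, offline=False),
    SttBackend("elevenlabs-scribe", 250, offline=False),
    SttBackend("whisper.cpp", 2000, offline=True),
]


def pick_backends(budget_ms, require_offline=False):
    """Return the backends that fit the latency budget, fastest first."""
    fits = [
        b for b in BACKENDS
        if b.worst_case_ms <= budget_ms and (b.offline or not require_offline)
    ]
    return sorted(fits, key=lambda b: b.worst_case_ms)


live = [b.name for b in pick_backends(250)]
offline = [b.name for b in pick_backends(2000, require_offline=True)]
print(live, offline)
```

With a 250 ms budget only the Groq and ElevenLabs entries qualify, which is the point of the list: the budget, not accuracy, is usually what rules a backend out.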

Cloak's STT layer is pluggable so you can pick per-task. The transcript pipeline also runs a rolling de-duplication pass (the kind of thing where "I think I think" becomes "I think") because real-world STT stutters on word boundaries.
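A de-duplication pass of this kind can be sketched in a few lines. This is an assumption about the general technique, not Cloak's implementation: `collapse_repeats` is a hypothetical helper that drops any run of up to `max_ngram` words that immediately repeats the words just emitted.

```python
def collapse_repeats(text, max_ngram=4):
    """Drop word runs that immediately repeat the previously kept words."""
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        skipped = False
        # Try the longest repeat first so "I think I think" collapses as a pair.
        longest = min(max_ngram, len(out), len(words) - i)
        for n in range(longest, 0, -1):
            tail = [w.lower() for w in out[-n:]]
            upcoming = [w.lower() for w in words[i:i + n]]
            if tail == upcoming:
                i += n  # skip the duplicate run
                skipped = True
                break
        if not skipped:
            out.append(words[i])
            i += 1
    return " ".join(out)


print(collapse_repeats("I think I think we should should cache it"))
```

The sketch is deliberately blunt. It also collapses legitimate repeats such as "very very", and it treats "think," and "think" as different words, so a production pass would need punctuation handling and probably a timing check before dropping anything.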

3. Route

"Route" is the unsexy heart of an interview copilot. The user pressed a hotkey. What does the system send to the model?

  • The last N seconds of transcript (typically 30–60).
  • An optional screenshot encoded as base64.
  • A system prompt corresponding to the active persona.
  • Resume and job description (JD) as a tool-injected context block (Pro feature).
  • An intent classifier output: was that a question, a follow-up, or thinking aloud?
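Put together, the list above amounts to assembling one request object per hotkey press. The sketch below shows the shape of that step under stated assumptions: `TranscriptLine`, `build_request`, the dictionary keys, and the 45-second default window are all hypothetical, and the resume/JD block is left out for brevity.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptLine:
    end: float  # time the line finished, in seconds
    text: str


def build_request(lines, now, persona_prompt, window_s=45.0,
                  screenshot_png=None, intent=None):
    """Assemble the payload for one hotkey press."""
    # Keep only transcript lines that ended inside the rolling window.
    recent = " ".join(l.text for l in lines if now - l.end <= window_s)
    request = {"system": persona_prompt, "transcript": recent}
    if screenshot_png is not None:
        # Raw image bytes are base64-encoded so they can travel as text.
        request["screenshot_b64"] = base64.b64encode(screenshot_png).decode("ascii")
    if intent is not None:
        request["intent"] = intent
    return request


lines = [
    TranscriptLine(10.0, "Welcome, thanks for joining."),
    TranscriptLine(70.0, "Let's talk about caching."),
    TranscriptLine(95.0, "How would you invalidate it?"),
]
req = build_request(lines, now=100.0, persona_prompt="You are a calm coach.",
                    screenshot_png=b"\x89PNG", intent="question")
print(req["transcript"])
```

Here the greeting from 90 seconds ago falls outside the window and is dropped, so the model only sees the caching exchange.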

Get the routing wrong and the model answers the wrong question. Cloak's intent classifier is a fast small-model call that runs in parallel with the main model call. If the intent comes back as "thinking aloud", the main call is cancelled.
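The run-in-parallel-then-cancel pattern looks roughly like this. It is a sketch with stand-ins, not Cloak's code: `classify_intent` and `generate_answer` are hypothetical coroutines that fake the two model calls with sleeps, and the classifier here returns only two of the three intents.

```python
import asyncio


async def classify_intent(transcript):
    # Stand-in for the fast small-model call.
    await asyncio.sleep(0.01)
    return "question" if transcript.rstrip().endswith("?") else "thinking_aloud"


async def generate_answer(transcript):
    # Stand-in for the slower main-model call.
    await asyncio.sleep(0.2)
    return f"Answer to: {transcript}"


async def route(transcript):
    # Start the main call immediately so it overlaps with classification.
    answer_task = asyncio.create_task(generate_answer(transcript))
    intent = await classify_intent(transcript)
    if intent == "thinking_aloud":
        answer_task.cancel()
        try:
            await answer_task  # let the cancellation finish cleanly
        except asyncio.CancelledError:
            pass
        return None
    return await answer_task


print(asyncio.run(route("How would you shard this table?")))
print(asyncio.run(route("Hmm, let me think")))
```

Starting the main call before the classifier returns means a real question pays no extra latency for classification; the cost is a wasted partial request whenever the user was only thinking aloud.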

4. Render

The overlay window has to be:

  • Always on top.
  • Non-capturable by screen sharing (this is the whole point).
  • Streaming tokens at ~30 fps so reading feels live.
  • Resizable based on content without flicker or scroll-jump.
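The ~30 fps requirement usually means coalescing tokens rather than painting each one as it arrives. A minimal sketch of that idea follows, assuming a hypothetical `TokenCoalescer` class that is handed timestamps by the caller; in a real UI the frame clock would come from the rendering loop instead.

```python
class TokenCoalescer:
    """Buffers streamed tokens and releases them at most once per frame."""

    def __init__(self, fps=30.0):
        self.frame_s = 1.0 / fps
        self.buffer = []
        self.last_flush = None

    def push(self, token, now):
        """Add a token; return text to paint if a frame is due, else None."""
        self.buffer.append(token)
        if self.last_flush is not None and now - self.last_flush < self.frame_s:
            return None  # too soon since the last paint, keep buffering
        return self.flush(now)

    def flush(self, now):
        """Release everything buffered so far (also call at end of stream)."""
        text = "".join(self.buffer)
        self.buffer.clear()
        self.last_flush = now
        return text


coalescer = TokenCoalescer(fps=30)
painted = [
    coalescer.push("Use", 0.000),
    coalescer.push(" a", 0.010),
    coalescer.push(" hash", 0.020),
    coalescer.push(" map", 0.040),
]
print(painted)
```

Tokens that arrive within the same ~33 ms frame are painted together, so a fast model cannot trigger more layout passes than the display can usefully show.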

On macOS the non-capturable property is a real OS-level guarantee. You set sharingType: .none on an NSPanel via tauri-nspanel and the window server itself filters the surface out of every capture API. There is no CSS / JavaScript way to fake this on Windows or Linux — which is why Cloak is macOS only on purpose.

The streaming token render is harder than it looks. Markdown reflow during stream causes layout thrash; framer-motion exit animations during the same period cause the window to "remember" a stale measured size. Cloak's useWindow hook measures the live content root with a ResizeObserver and dispatches Tauri IPC resize calls clamped to OVERLAY_MAX_HEIGHT so the overlay never grows past the user's screen.
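The clamping step itself is small. The sketch below shows the arithmetic only, under assumptions: the value given to `OVERLAY_MAX_HEIGHT` is invented, and `OVERLAY_MIN_HEIGHT`, `next_window_height`, and the one-pixel tolerance are hypothetical additions for illustration.

```python
OVERLAY_MIN_HEIGHT = 80   # hypothetical floor, in pixels
OVERLAY_MAX_HEIGHT = 640  # name from the text above; the value is an assumption


def next_window_height(measured, current,
                       min_h=OVERLAY_MIN_HEIGHT, max_h=OVERLAY_MAX_HEIGHT,
                       tolerance=1):
    """Return the height to request, or None if no resize is needed."""
    # Clamp the measured content height into the allowed range.
    target = max(min_h, min(round(measured), max_h))
    # Skip sub-pixel jitter so streaming does not issue a resize per token.
    if abs(target - current) <= tolerance:
        return None
    return target


print(next_window_height(1200.4, current=300))  # content taller than the cap
print(next_window_height(300.6, current=300))   # jitter, no resize
print(next_window_height(10, current=300))      # content shorter than the floor
```

Returning `None` for tiny differences is one way to keep the number of resize calls down while markdown is still reflowing; whether Cloak does the same is not stated in this post.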

The thing you can't engineer around

All this delivers you an answer. It cannot deliver you composure. Interview copilots that promise "auto-answer" are lying — there is no way for an external tool to inhabit your voice, your body language, or your willingness to admit you don't know something. The best overlays keep you sharp and let your real ability come through faster, not pretend to replace it.

Built into Cloak

Every architectural choice above is in Cloak's source on GitHub. If you want to see exactly how an interview copilot is built — read the source. If you want to use one — download Cloak.

How to install Cloak

macOS · 4 quick steps

  1. Extract the ZIP

    Open Cloak.zip from your Downloads folder. Double-clicking it will extract automatically.

  2. Move to Applications

    Drag Cloak.app into your /Applications folder.

  3. macOS security check

    macOS may warn that it can't verify the developer. This is normal for unsigned indie apps; it's not malware.

    "Cloak.app" can't be opened. Apple cannot check it for malicious software. This item is on the disk image.

    If you see this, use the fix in Step 4 below. It removes the quarantine flag instantly.

  4. One-line fix (if blocked)

    Open Terminal (press ⌘ Space, type "Terminal"), paste this command and hit Return:

    $ xattr -cr /Applications/Cloak.app

    This clears the extended attributes on the app bundle, including the quarantine attribute macOS attaches to downloaded files. Cloak is open source, so you can inspect the code any time.

Need help? Open an issue on GitHub.