Content Team Explainer

Video Relevancy Flow

How a TikTok video moves from raw sync data to a format, relevance verdict, content label, production owner, and recreation score for Oncourse.

Pipeline

The current code runs enrichment in a shared helper used by automated sync, manual classification, single-account sync, and backfill scripts.

01 Raw TikTok data arrives

Apify/TikTok sync stores caption, hashtags, duration, sound, cover, web video URL, stats, carousel flag, subtitles, slideshow image links, and raw metadata.

02 Media is prepared

When possible, video and cover assets are copied to storage. Hosted video is sent to Gemini; otherwise the thumbnail is used as visual context.
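
The media choice above can be sketched as a small fallback. This is a minimal illustration, not the shared helper itself: the function name, the argument names, and the tuple return shape are all assumptions made for the example.

```python
from typing import Optional, Tuple


def pick_visual_context(
    hosted_video_url: Optional[str],
    cover_url: Optional[str],
) -> Optional[Tuple[str, str]]:
    """Choose what visual context to hand to Gemini (hypothetical helper)."""
    # Prefer the hosted video when the copy to storage succeeded.
    if hosted_video_url:
        return ("video", hosted_video_url)
    # Otherwise fall back to the thumbnail / cover image.
    if cover_url:
        return ("thumbnail", cover_url)
    # Assumption: with no media at all, the caller relies on metadata only.
    return None
```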

03 Format is detected

Deterministic signals run first: slideshow/photo mode becomes CAROUSEL_SLIDESHOW; subtitles become UGC_VOICEOVER. If the format is still unknown, Gemini runs the legacy format analysis.
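
A minimal sketch of the deterministic step, assuming two boolean inputs derived from the synced data. The function and parameter names are hypothetical, and checking the slideshow signal before subtitles is an assumption about precedence when both are present.

```python
from typing import Optional


def detect_format(is_slideshow: bool, has_subtitles: bool) -> Optional[str]:
    """Return a format from deterministic signals, or None if still unknown."""
    # Slideshow / photo mode maps straight to the carousel format.
    if is_slideshow:
        return "CAROUSEL_SLIDESHOW"
    # Subtitles are treated as evidence of a voiceover.
    if has_subtitles:
        return "UGC_VOICEOVER"
    # No deterministic signal: the caller would fall back to Gemini.
    return None
```

A `None` result is the point where the legacy format prompt would take over; UGC_REACTION and OTHER can only come from that fallback.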

04 Strategy prompt runs

Gemini gets the Oncourse app context, video metadata, format hint, and media. It returns relevance, label, owner, score, reasons, hook, script, CTA, and remake plan.

05 Fields are saved

Relevant videos keep labels, score, reasons, and proposal. Irrelevant videos keep a relevance reason but clear content/production labels and use score 0.
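
The save rule can be illustrated roughly as follows. The field names (`relevant`, `relevance_reason`, `content_label`, `production_label`, `score`) are assumptions for the sketch and may not match the real schema.

```python
from typing import Any, Dict


def apply_relevance_gate(result: Dict[str, Any]) -> Dict[str, Any]:
    """Return the fields to save for one video (hypothetical field names)."""
    saved = dict(result)  # copy so the model output is not mutated
    if not saved.get("relevant", False):
        # Irrelevant videos keep their relevance reason but lose labels
        # and are stored with a score of 0.
        saved["content_label"] = None
        saved["production_label"] = None
        saved["score"] = 0
    return saved
```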

Relevance Gate

This is the heart of the system. The model is allowed to say yes when the idea can credibly become Oncourse content, but it must not invent a med-prep bridge.

Relevant When

  • Direct exam prep, study strategy, resource comparison, weak-topic review, or clinical reasoning.
  • Med-student routines, rotations, burnout, overwhelm, time pressure, confidence, or productivity with a real study-performance bridge.
  • The bridge is evidenced by caption, hashtags, visible text, spoken content, or clear visuals.

Not Relevant When

  • Generic motivation, religion, lifestyle, entertainment, fashion, food, shopping, or creator personality.
  • Premed admissions or AMCAS content.
  • Med-school identity or humor with no academic, prep, productivity, or study-system payoff.

Recreation Score

The score is 0-100. Gemini can return an explicit score; otherwise code computes it from five 0-20 rubric values.

Formula

audienceFit + featureFit + adaptationEase + (20 - executionRisk) + (20 - creatorDependence)

Audience Fit: How directly this speaks to Oncourse users.
Feature Fit: How clearly the idea maps to actual Oncourse features.
Adaptation Ease: How easily the idea can become Oncourse content.
Execution Risk: Penalty. 0 is easy; 20 is awkward or risky.
Creator Dependence: Penalty. 0 is generic; 20 needs a specific creator persona.
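
A minimal sketch of the scoring fallback, using the rubric names from the formula as dictionary keys. The function name and the clamping of out-of-range values are assumptions; the source only states the ranges and the formula.

```python
from typing import Mapping, Optional

RUBRIC_KEYS = (
    "audienceFit",
    "featureFit",
    "adaptationEase",
    "executionRisk",
    "creatorDependence",
)


def clamp(value: int, low: int, high: int) -> int:
    return max(low, min(high, value))


def recreation_score(
    rubric: Mapping[str, int],
    explicit_score: Optional[int] = None,
) -> int:
    """Use Gemini's explicit score if present, else compute from the rubric."""
    if explicit_score is not None:
        return clamp(explicit_score, 0, 100)  # assumption: clamp to 0-100
    # All five values are required; a missing key raises KeyError.
    r = {key: clamp(rubric[key], 0, 20) for key in RUBRIC_KEYS}
    return (
        r["audienceFit"]
        + r["featureFit"]
        + r["adaptationEase"]
        + (20 - r["executionRisk"])       # penalty: lower risk scores higher
        + (20 - r["creatorDependence"])   # penalty: generic ideas score higher
    )


example = {
    "audienceFit": 18,
    "featureFit": 15,
    "adaptationEase": 16,
    "executionRisk": 5,
    "creatorDependence": 8,
}
print(recreation_score(example))  # 18 + 15 + 16 + 15 + 12 = 76
```

Because the two penalties are subtracted from 20, a post with perfect fit, zero risk, and zero creator dependence reaches 100, and the worst case is 0.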

Labels

These labels help the team decide how to remake or learn from a post.

Format

CAROUSEL_SLIDESHOW, UGC_VOICEOVER, UGC_REACTION, or OTHER. Current heuristic only guarantees slideshow and subtitle voiceover; reaction/other may come from Gemini fallback.

Content Label

REACTION, RESOURCES, DAY_IN_LIFE, EXPLAINER, SKIT, or OTHER. Only saved when the post is relevant.

Production Label

TEAM_RECREATE for resource roundups, study-system explainers, slideshows, and feature-led ideas. CREATOR_RECREATE for day-in-life, skits, face-led stories, and personality-led reactions.

What The Team Can Tweak

These are the highest-leverage prompt knobs. Changing them shifts what gets marked relevant and how recreation ideas are ranked.

Add Good Patterns

Add examples of content you want the system to keep: specific student pains, creator formats, resource angles, or study-system hooks.

Add Rejection Patterns

Add examples of false positives: vague motivation, med identity without payoff, lifestyle posts, or trends that do not help content strategy.

Reweight Scoring Taste

Push toward easier team-made posts by penalizing creator dependence and execution risk more strongly in the prompt language.

Prompt Popouts

Open these to review the exact decision language your content team can edit or comment on.

Main Strategy Prompt

Used for relevance, content label, production label, scoring, hook/script/CTA, and recreate proposal.

Legacy Format Prompt

Used only when deterministic format signals do not produce a format or forced analysis asks Gemini to classify format.