Scoring Methodology
How we calculate demographic scores for entertainment titles.
How It Works
WhatSphere analyzes movies, TV shows, and video games to answer a simple question: who is this made for?
Every title gets placed on two axes — gender (male-leaning to female-leaning) and age (youth-oriented to mature-audience). These create four quadrants: Young Men, Young Women, Older Men, and Older Women. Each quadrant gets a score from 0 to 100 indicating how strongly the title appeals to that group.
Scores are based on real data: the cast's demographics, the genre, content signals like violence and romance levels, the maturity rating, and how the title was marketed. We don't guess — we measure multiple factors and weight them to produce a score.
Beyond demographics, we also track racial audience representation (how a title's cast compares to its country of origin), adaptation faithfulness (how closely it follows its source material), and an advocacy classification that measures whether representation appears to be a primary goal of the production.
A score of 50 is always the baseline: it means balanced or average. Higher means more appeal to that group (or more representation). Lower means less. No score is inherently good or bad; each one simply describes who the title is made for.
Technical Breakdown
Gender Axis (0-100)
The Gender Axis scores content from 0 (strongly male-targeted) to 100 (strongly female-targeted). Five weighted factors determine the raw score:
- Genre Seed (30%) — Each genre has a baseline gender score derived from audience research. Action = 25 (male-leaning), Romance = 75 (female-leaning), Drama = 50 (balanced).
- Cast Gender Ratio (20%) — Proportion of male vs female cast members, weighted by billing position (leads count more than background).
- Lead Demographics (25%) — Gender of the protagonist(s) and primary antagonist(s). A female-led action film pulls the score toward female.
- Content Signals (15%) — Romance centrality, violence level, emotional tone, and thematic focus. High romance pulls female; high combat pulls male.
- Marketing Positioning (10%) — Trailer tone, poster imagery, tagline language, and time slot placement as indicators of intended audience.
A 1.2x stretch is applied to push scores away from the center, improving discrimination between titles that would otherwise cluster around 50.
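As a rough illustration, the weighted sum and stretch could be sketched as follows. The weights come from the list above; the factor key names, the assumption that every factor is already expressed on a 0-100 scale, and the clamped form of the stretch (borrowed from the quadrant formula) are illustrative assumptions rather than the production implementation.

```python
# Weights from the factor list above; they sum to 1.0.
# The key names are hypothetical labels chosen for this sketch.
GENDER_WEIGHTS = {
    "genre_seed": 0.30,
    "cast_gender_ratio": 0.20,
    "lead_demographics": 0.25,
    "content_signals": 0.15,
    "marketing_positioning": 0.10,
}


def stretch(raw, factor):
    """Push a 0-100 score away from 50 by `factor`, then clamp to 0-100.

    Assumption: the axis stretch has the same clamped form as the
    quadrant stretch, only with a different multiplier.
    """
    return max(0.0, min(100.0, 50.0 + (raw - 50.0) * factor))


def gender_axis_score(factors):
    """Combine five 0-100 factor scores into a stretched gender axis score."""
    raw = sum(GENDER_WEIGHTS[name] * factors[name] for name in GENDER_WEIGHTS)
    return stretch(raw, 1.2)


# Hypothetical factor values for a female-led action film.
example = {
    "genre_seed": 25,
    "cast_gender_ratio": 40,
    "lead_demographics": 70,
    "content_signals": 35,
    "marketing_positioning": 30,
}
score = gender_axis_score(example)
```

With these made-up inputs the raw weighted sum is 41.25, and the 1.2x stretch moves it to about 39.5, slightly further from the center.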
Age Axis (0-100)
The Age Axis scores content from 0 (youth-oriented, ages 13-34) to 100 (mature-audience, ages 35+). Five weighted factors:
- Genre Seed (30%) — Animated content seeds young; political dramas seed mature.
- Maturity Rating (20%) — MPAA/TV ratings mapped to an age score. G/PG = 15, PG-13 = 35, R = 65, TV-MA = 75.
- Content Complexity (20%) — Pacing, dialogue density, narrative structure, and thematic depth. Slower, denser content scores older.
- Platform (15%) — Distribution channel age demographics. YouTube = 25, Disney+ = 30, HBO = 65, PBS = 75.
- Violence-Age Profile (15%) — Genre-aware: fantasy violence in animation seeds young, realistic violence in crime drama seeds old.
A 1.2x stretch is applied, matching the gender axis treatment.
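The age axis can be sketched the same way, with the rating and platform lookups taken from the values listed above. The function signature, the key names, the 0-100 scale for the remaining factors, and the clamped form of the 1.2x stretch are assumptions made for illustration only.

```python
# Lookup values quoted from the factor list above.
MATURITY_RATING_SCORES = {"G": 15, "PG": 15, "PG-13": 35, "R": 65, "TV-MA": 75}
PLATFORM_SCORES = {"YouTube": 25, "Disney+": 30, "HBO": 65, "PBS": 75}

# Weights from the factor list above; key names are hypothetical.
AGE_WEIGHTS = {
    "genre_seed": 0.30,
    "maturity_rating": 0.20,
    "content_complexity": 0.20,
    "platform": 0.15,
    "violence_age_profile": 0.15,
}


def age_axis_score(genre_seed, rating, complexity, platform, violence_age):
    """Combine the five age factors into a stretched 0-100 age axis score.

    Ratings or platforms missing from the lookup tables raise KeyError,
    since the source does not say how they are scored.
    """
    factors = {
        "genre_seed": genre_seed,
        "maturity_rating": MATURITY_RATING_SCORES[rating],
        "content_complexity": complexity,
        "platform": PLATFORM_SCORES[platform],
        "violence_age_profile": violence_age,
    }
    raw = sum(AGE_WEIGHTS[name] * factors[name] for name in AGE_WEIGHTS)
    # Assumed form of the 1.2x stretch: scale around 50, clamp to 0-100.
    return max(0.0, min(100.0, 50.0 + (raw - 50.0) * 1.2))


# Hypothetical R-rated crime drama distributed on HBO.
age_score = age_axis_score(70, "R", 60, "HBO", 70)
```

For this invented example the raw weighted sum is 66.25, which the stretch raises to about 69.5.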
Quadrant Scores
The four quadrant scores (Young Men, Young Women, Older Men, Older Women) are derived from the gender and age axis scores using a 1.8x stretch formula:
- Young Men = stretch(avg(100 - gender, 100 - age))
- Young Women = stretch(avg(gender, 100 - age))
- Older Men = stretch(avg(100 - gender, age))
- Older Women = stretch(avg(gender, age))
The stretch formula is clamp(50 + (raw - 50) × 1.8, 0, 100). It spreads quadrant scores across the full 0-100 range rather than letting them cluster in the 30-70 band.
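A minimal sketch of the quadrant derivation, following the four formulas and the stretch given above. The function and dictionary key names are chosen for this example only.

```python
def stretch(raw, factor=1.8):
    """clamp(50 + (raw - 50) * factor, 0, 100), as described above."""
    return max(0.0, min(100.0, 50.0 + (raw - 50.0) * factor))


def quadrant_scores(gender, age):
    """Derive the four quadrant scores from the 0-100 gender and age axes.

    gender: 0 = strongly male-targeted, 100 = strongly female-targeted
    age:    0 = youth-oriented,         100 = mature-audience
    """
    male, female = 100 - gender, gender
    young, older = 100 - age, age
    return {
        "Young Men": stretch((male + young) / 2),
        "Young Women": stretch((female + young) / 2),
        "Older Men": stretch((male + older) / 2),
        "Older Women": stretch((female + older) / 2),
    }


# Hypothetical male-leaning, mature-leaning title.
quadrants = quadrant_scores(gender=30, age=70)
```

For a title at gender 30 and age 70, the raw averages are 50, 30, 70, and 50, which the stretch turns into roughly 50 (Young Men), 14 (Young Women), 86 (Older Men), and 50 (Older Women).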
Racial Audience Scores
Each title receives audience scores for 9 racial/ethnic categories (White, Black, Hispanic, Asian, South Asian, Middle Eastern, Indigenous, Mixed, Other). These use a sigmoid deviation model:
- Start with a baseline score derived from the production origin (e.g., US productions have different racial demographics than Korean productions).
- Deviate based on cast representation relative to that baseline using a sigmoid curve.
- Stronger deviations (for example, a Black-led film from a majority-White production origin) result in higher scores for that demographic.
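The exact curve is not specified here, so the following is only one possible shape of a sigmoid deviation model. It assumes the baseline is the expected cast share for a category in the production origin, that the deviation is the difference between actual and expected share, and that a logistic curve centered on 50 is used. The function name, the share inputs, and the steepness value are all hypothetical.

```python
import math


def representation_score(cast_share, baseline_share, steepness=8.0):
    """Map a cast-share deviation to a 0-100 score with a logistic curve.

    cast_share:     fraction of (billing-weighted) cast in the category, 0-1
    baseline_share: expected fraction for the production origin, 0-1
    steepness:      hypothetical tuning constant, not taken from the source

    A cast that matches the baseline scores exactly 50; larger positive
    deviations approach 100 and larger negative deviations approach 0.
    """
    deviation = cast_share - baseline_share
    return 100.0 / (1.0 + math.exp(-steepness * deviation))


# Hypothetical shares: category is 13% of the origin baseline.
matching = representation_score(0.13, 0.13)
over = representation_score(0.60, 0.13)
under = representation_score(0.02, 0.13)
```

The logistic form keeps 50 as the balanced baseline and flattens out at the extremes, so very large deviations do not overshoot the 0-100 range.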
Adaptation Faithfulness (0-100)
For titles adapted from source material (books, comics, games, etc.), faithfulness is scored based on:
- Plot fidelity to source material
- Character accuracy (appearance, personality, background)
- Thematic preservation
- Narrative structure alignment
Non-adaptation titles (originals) receive NULL for faithfulness — they are not scored on this dimension. Documentaries and unscripted content also receive NULL.
Advocacy Classification
The advocacy score (0-100) uses a weighted-sum formula to classify content along a purpose spectrum:
- LGBTQ Representation (20%) — Centrality and treatment of LGBTQ characters and themes.
- Disability Representation (5%) — Presence and treatment of disabled characters.
- Demographic Swaps (25%) — Race and gender swaps from source material (adaptations only).
- Advocacy Organization Involvement (10%) — Participation of advocacy groups (GLAAD, Color of Change, etc.) in production.
- Creator Statements (15%) — Explicit statements from creators about representation goals.
- Source Faithfulness (25%) — Inverse relationship: unfaithful adaptations that add representation signals score higher.
The numeric score maps to four tiers:
- Entertainment (0-20): Content made primarily to entertain with no significant advocacy signals.
- Mixed (21-40): Some representation present but not central to the production.
- Message-Leaning (41-65): Representation is intentional and noticeable, influencing story choices.
- Advocacy-Driven (66-100): Representation is a primary goal of the production, with multiple strong signals.
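The weighted sum and tier mapping above could be sketched as follows. The weights and tier boundaries are the ones listed; the signal key names, the 0-100 scale for each signal, treating missing signals as 0, and placing fractional scores between tiers (such as 20.5) in the higher tier are assumptions of this sketch.

```python
# Weights from the list above; they sum to 1.0. Key names are hypothetical.
ADVOCACY_WEIGHTS = {
    "lgbtq_representation": 0.20,
    "disability_representation": 0.05,
    "demographic_swaps": 0.25,
    "advocacy_org_involvement": 0.10,
    "creator_statements": 0.15,
    "source_faithfulness": 0.25,
}

# (inclusive upper bound, tier name), in ascending order.
ADVOCACY_TIERS = [
    (20, "Entertainment"),
    (40, "Mixed"),
    (65, "Message-Leaning"),
    (100, "Advocacy-Driven"),
]


def advocacy_score(signals):
    """Weighted sum of 0-100 signal scores; missing signals count as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in ADVOCACY_WEIGHTS.items())


def advocacy_tier(score):
    """Return the tier whose upper bound is the first one >= score."""
    if not 0 <= score <= 100:
        raise ValueError("advocacy score must be between 0 and 100")
    for upper, tier in ADVOCACY_TIERS:
        if score <= upper:
            return tier


# Hypothetical signal values for an adaptation.
signals = {
    "lgbtq_representation": 50,
    "demographic_swaps": 40,
    "creator_statements": 60,
    "source_faithfulness": 30,
}
total = advocacy_score(signals)
tier = advocacy_tier(total)
```

These invented signals sum to 36.5, which falls in the Mixed tier.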
Audience Breadth
Each title is classified by how broadly it appeals across demographics:
- Universal — All four quadrant scores above 60 with less than 20-point spread.
- Mass Market — All scores above 45 with less than 30-point spread.
- Mainstream — Average score at or above 50 with less than 40-point spread.
- Targeted — High variance: some quadrants above 60, others below 40.
- Niche — Everything else (typically one very high quadrant with others low).
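One way to read these rules as code is shown below. The sketch assumes the rules are checked in the order listed with the first match winning, that "spread" means the highest quadrant score minus the lowest, and that "some quadrants above 60" means at least two, so a title with a single high quadrant falls through to Niche as the Niche description suggests. The function name is hypothetical.

```python
def audience_breadth(quadrants):
    """Classify a title from its four quadrant scores (0-100 each)."""
    scores = list(quadrants.values())
    lowest, highest = min(scores), max(scores)
    spread = highest - lowest          # assumed definition of "spread"
    average = sum(scores) / len(scores)

    if lowest > 60 and spread < 20:
        return "Universal"
    if lowest > 45 and spread < 30:
        return "Mass Market"
    if average >= 50 and spread < 40:
        return "Mainstream"
    # Assumption: "some quadrants above 60" is read as two or more.
    if sum(s > 60 for s in scores) >= 2 and lowest < 40:
        return "Targeted"
    return "Niche"


# Hypothetical quadrant scores for illustration.
broad = audience_breadth(
    {"Young Men": 70, "Young Women": 72, "Older Men": 65, "Older Women": 68}
)
split = audience_breadth(
    {"Young Men": 86, "Young Women": 14, "Older Men": 70, "Older Women": 30}
)
narrow = audience_breadth(
    {"Young Men": 90, "Young Women": 20, "Older Men": 30, "Older Women": 25}
)
```

Here `broad` is classified as Universal, `split` as Targeted, and `narrow` as Niche. Under a looser reading where one quadrant above 60 is enough, `narrow` would be Targeted instead.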
Data Confidence
Every title has a data confidence score indicating how complete its underlying data is. Titles with more cast data, confirmed demographics, and verified genre classifications have higher confidence. Titles with a confidence below 0.50 have their scores flagged as preliminary.
Questions or Corrections?
If you believe a score is incorrect, you can submit a correction with supporting evidence. We review all submissions and update scores when warranted.