Finnish news media is more centrist than you think - Aktagon – AI engineering for healthcare, finance & compliance

We built an automated pipeline that reads Finnish news articles and scores each one for political bias. Not sentiment. Not topic. Framing — the editorial choices that shape how a reader interprets a story.

257 articles. Seven sources. Every score comes with a written explanation citing specific quotes and word choices from the article.

The results

Source	Type	Articles	Mean score	Leaning
Suomen Uutiset	Party media	50	+2.64	Right-leaning
Iltalehti	Tabloid	41	-0.10	Center
Ilta-Sanomat	Tabloid	50	-0.15	Center
MTV Uutiset	Commercial broadcast	23	-0.27	Center
Helsingin Sanomat	Broadsheet	47	-0.28	Center
Uusi Suomi	Online media	13	-0.31	Center
Yle Uutiset	Public broadcaster	33	-0.52	Center

Six of seven sources score between -0.52 and -0.10. Finnish mainstream media is remarkably centrist.

One source is not. Suomen Uutiset, the Perussuomalaiset party organ, scores +2.64, roughly three points away from the mainstream cluster (2.74 from the nearest source, 3.16 from the farthest). This is not a surprise. It confirms what most media professionals already suspect. The value is in the measurement.

What the scores mean

The scale runs from -5.0 (strong left) to +5.0 (strong right). Zero means factual reporting with no detectable framing. The classifier looks at word choices, quote selection, narrative structure, and which perspectives are included or excluded.

Writing about immigration is not bias. Framing immigration as a humanitarian crisis is left-leaning. Framing it as a fiscal burden is right-leaning. The distinction is always in the framing, never in the topic.

Three examples from the data:

Score 0.0 — Yle: “Pieni kuutti rantautui Lauttasaaren pursiseuralle Helsingissä” A baby seal rested at a Helsinki sailing club. No political framing, no policy references, no ideological signals. Pure factual reporting. The classifier sees nothing to score.

Score -1.5 — Yle: “Valko-Venäjä ottaa mallia Moskovasta: vapaaehtoisesta lapset” The article frames LGBTQ+ legislation as a rights violation through loaded language: “oikeuksia nakerretaan” (rights are being eroded). Only activist critics are quoted. Human rights organizations’ opposition is presented without counterweight. Sympathetic minority framing — a left-leaning editorial pattern.

Score +2.5 — Suomen Uutiset: “Jopa 1,57 miljardia Suomesta ulkomaille: Maahanmuuttajien rahalähetykset” Immigrant remittances framed as economic drain: “tukina saatu raha ei jää hyödyttämään kansantaloutta” (money received as benefits does not stay to benefit the national economy) links welfare receipts to capital flight. The phrase “rinnakkaistalouksiin” (parallel economies) associates money transfers with shadow economies. A brief humanitarian disclaimer doesn’t reframe the cost-burden structure.

Every article in the dataset has reasoning like this. Not just a number. An auditable explanation.
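To make "not just a number" concrete, here is a minimal sketch of what one scored record could look like. The class name, field names, and validation rules are our illustration, not the pipeline's actual schema; only the -5.0 to +5.0 scale and the requirement for written reasoning come from the text above.

```python
from dataclasses import dataclass

# Scale bounds as described above: -5.0 (strong left) to +5.0 (strong right).
SCALE_MIN, SCALE_MAX = -5.0, 5.0


@dataclass(frozen=True)
class BiasScore:
    """One classified article. Field names are hypothetical."""
    source: str
    headline: str
    score: float
    reasoning: str

    def __post_init__(self) -> None:
        # Reject scores outside the documented scale.
        if not SCALE_MIN <= self.score <= SCALE_MAX:
            raise ValueError(f"score {self.score} is outside the scale")
        # A score without an explanation cannot be audited.
        if not self.reasoning.strip():
            raise ValueError("reasoning must not be empty")


seal_story = BiasScore(
    source="Yle",
    headline="Pieni kuutti rantautui Lauttasaaren pursiseuralle Helsingissä",
    score=0.0,
    reasoning="No political framing, no policy references, no ideological signals.",
)
```

Rejecting records at construction time means a number can never reach the dataset without its explanation attached.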

What we found

The mainstream is narrow. Six sources spanning public broadcaster, broadsheet, tabloids, commercial TV, and online media — all within 0.42 points of each other. Different business models, different audiences, similar editorial center of gravity.
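The spread can be checked directly from the results table. The short sketch below copies the mean scores from that table and recomputes the width of the mainstream cluster and the distance from Suomen Uutiset to its nearest and farthest mainstream source. Variable names are ours.

```python
# Mean scores copied from the results table.
MEAN_SCORES = {
    "Suomen Uutiset": 2.64,
    "Iltalehti": -0.10,
    "Ilta-Sanomat": -0.15,
    "MTV Uutiset": -0.27,
    "Helsingin Sanomat": -0.28,
    "Uusi Suomi": -0.31,
    "Yle Uutiset": -0.52,
}

PARTY_MEDIA = "Suomen Uutiset"

# Everything except the party organ counts as mainstream here.
mainstream = [s for name, s in MEAN_SCORES.items() if name != PARTY_MEDIA]

# Width of the mainstream cluster.
spread = round(max(mainstream) - min(mainstream), 2)

# Distance from the party organ to the nearest and farthest mainstream source.
gap_nearest = round(MEAN_SCORES[PARTY_MEDIA] - max(mainstream), 2)
gap_farthest = round(MEAN_SCORES[PARTY_MEDIA] - min(mainstream), 2)

print(spread, gap_nearest, gap_farthest)  # 0.42 2.74 3.16
```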

Tabloids are not biased. Ilta-Sanomat and Iltalehti score closest to zero. Sensationalism and political bias are different things. The tabloids are loud. They are not partisan.

Yle skews mildly left. At -0.52, Yle is the leftmost mainstream source. The signal is in framing choices: humanitarian immigration narratives, welfare-state defense language, environmental progress framing. Not advocacy. Not overt. But consistent enough to measure.

Party media is measurably different. Suomen Uutiset functions as an uncritical platform for Perussuomalaiset politicians. Articles routinely present a single party member’s position without counterpoint. Immigration is framed through cost and security. Media institutions are treated as ideological opponents. The +2.64 mean reflects structural editorial patterns, not occasional opinion pieces.

The classifier is not neutral

The analysis uses an LLM (Claude) as the classifier. LLMs have a measurable political bias of their own.

Multiple peer-reviewed studies have tested this directly. Sakhawat et al. (2025) placed Claude in the libertarian-left quadrant alongside 96.3% of tested models. Rozado (2024, PLoS ONE) confirmed the same using 15 political orientation tests. Closed-source models show statistically higher cultural progressivism than open-weights models.

In downstream news labeling, this manifests as a “center-shift” — neutral articles get classified as mildly left-leaning — and asymmetric detection: far-left framing is identified at 19.2% accuracy versus 2.0% for far-right.

The classifier has a blind spot. We know exactly where it is.

How we compensate

Calibration gate. Before every batch run, 20 articles from a known right-leaning source (Suomen Uutiset) and a known center source (Yle) are scored. If the separation between their means is below 1.0, the rubric is rejected and rewritten. The current rubric achieves 3.16 separation.
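A minimal sketch of such a gate, assuming separation is the right-leaning mean minus the center mean. Function names are hypothetical and the sample scores are made up for illustration; only the 1.0 threshold comes from the description above.

```python
from statistics import mean
from typing import Sequence

MIN_SEPARATION = 1.0  # threshold stated above


def calibration_separation(right_scores: Sequence[float],
                           center_scores: Sequence[float]) -> float:
    """Signed gap between the two calibration means (assumed definition)."""
    return mean(right_scores) - mean(center_scores)


def passes_gate(right_scores: Sequence[float],
                center_scores: Sequence[float],
                threshold: float = MIN_SEPARATION) -> bool:
    """True if the rubric separates the two calibration sets well enough."""
    return calibration_separation(right_scores, center_scores) >= threshold


# Made-up calibration scores, not real pipeline output.
right_sample = [2.5, 3.0, 2.0]
center_sample = [-0.5, 0.0, -1.0]

if not passes_gate(right_sample, center_sample):
    raise SystemExit("rubric rejected: rewrite before running the batch")
```

Using a signed difference rather than an absolute one means a rubric that places the right-leaning source to the left of the center source also fails the gate.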

Rubric versioning. The classification prompt is hashed. Every score is stamped with the rubric version that produced it. When the prompt changes, old scores are never mixed with new ones.
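One way to implement this, sketched under assumptions: we use SHA-256 from Python's standard library and a 12-character prefix as the version tag. The text above does not say which hash function or tag length the pipeline uses, and the function names and record layout are hypothetical.

```python
import hashlib
from typing import Iterable


def rubric_version(prompt: str) -> str:
    """Short, stable identifier derived from the exact prompt text."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]


def stamp(record: dict, prompt: str) -> dict:
    """Return a copy of the score record tagged with the rubric version."""
    return {**record, "rubric_version": rubric_version(prompt)}


def single_rubric(records: Iterable[dict]) -> bool:
    """True if every record was produced by the same rubric version."""
    return len({r["rubric_version"] for r in records}) <= 1


# Placeholder prompts, not the real rubric.
old = stamp({"score": -0.5}, "rubric text, first draft")
new = stamp({"score": -0.3}, "rubric text, second draft")
```

Any edit to the prompt, even a single character, yields a different tag, so `single_rubric` can guard every aggregate before scores are averaged.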

Per-article reasoning. Every score includes a written justification citing specific quotes and framing choices from the article. The reasoning is auditable. A media professional can read the explanation and decide whether the score is justified.

17-point evaluation criteria. The rubric is validated against specific accuracy, consistency, and Finnish political context requirements. NATO is consensus. EU skepticism is bimodal. Kokoomus is center-right, not right. These distinctions are built into the scoring framework.

How it works

The pipeline scrapes articles from RSS feeds and rendered web pages, extracts body text, filters for political content, and classifies each article using structured LLM output.

Data flows through three stages: raw scrape (bronze), enriched and deduplicated articles (silver), classified articles with scores and reasoning (golden). Every article is traceable from source to final score.
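The three stages can be sketched as plain functions over an immutable record. This is a simplified illustration: the `Article` fields are hypothetical, deduplication by URL is our assumption, and enrichment is reduced to trimming whitespace.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Article:
    url: str          # kept through every stage, so scores stay traceable
    source: str
    body: str
    stage: str = "bronze"
    score: Optional[float] = None
    reasoning: Optional[str] = None


def to_silver(raw: Iterable[Article]) -> List[Article]:
    """Drop empty bodies and duplicate URLs, then mark as silver."""
    seen = set()
    silver = []
    for article in raw:
        body = article.body.strip()
        if not body or article.url in seen:
            continue
        seen.add(article.url)
        silver.append(replace(article, body=body, stage="silver"))
    return silver


def to_golden(article: Article, score: float, reasoning: str) -> Article:
    """Attach the classifier output to a silver article."""
    if article.stage != "silver":
        raise ValueError("only silver articles can be classified")
    return replace(article, stage="golden", score=score, reasoning=reasoning)
```

Because each stage returns new records instead of mutating old ones, the bronze data stays available for reprocessing when the rubric changes.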

Seven sources are configured with per-source extraction strategies. Paywalled sites use headless browser rendering with targeted CSS selectors. Public sites use direct HTTP fetches.
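A per-source configuration along these lines might look like the sketch below. The source names, feed URLs, and CSS selector are placeholders, not the real configuration, and the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

FETCH_STRATEGIES = ("http", "headless")


@dataclass(frozen=True)
class SourceConfig:
    name: str
    feed_url: str
    strategy: str                        # "http" or "headless"
    body_selector: Optional[str] = None  # CSS selector for the article body

    def __post_init__(self) -> None:
        if self.strategy not in FETCH_STRATEGIES:
            raise ValueError(f"unknown strategy: {self.strategy}")
        # Headless rendering only helps if we know where the body text lives.
        if self.strategy == "headless" and not self.body_selector:
            raise ValueError("headless sources need a body_selector")


# Placeholder entries for illustration only.
SOURCES = [
    SourceConfig("example-public", "https://example.com/rss.xml", "http"),
    SourceConfig(
        "example-paywalled",
        "https://example.org/feed.xml",
        "headless",
        body_selector="article .body-text",
    ),
]

needs_browser = [s.name for s in SOURCES if s.strategy == "headless"]
```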

What’s next

Temporal analysis. Daily scrapes over 30 days. Does bias shift during election cycles? Do sources converge or diverge around major events?

Open-weights classifier. Fine-tuning Gemma to replace Claude as the scoring model. The goal is to take Claude’s inherent political bias out of the equation. The current pipeline generates training data with every run.

Multi-model validation. Running the same articles through multiple models and measuring agreement. If three models with different political tendencies converge on a score, that score is more trustworthy than any single model’s output.
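One simple way to measure agreement is the spread between the highest and lowest score an article receives. The sketch below assumes that metric and a tolerance of 1.0 point; both are our choices for illustration, and the model names and scores are invented.

```python
from statistics import mean
from typing import Mapping


def consensus(scores_by_model: Mapping[str, float],
              tolerance: float = 1.0) -> dict:
    """Summarise how closely several models agree on one article."""
    if len(scores_by_model) < 2:
        raise ValueError("need at least two models to measure agreement")
    values = list(scores_by_model.values())
    spread = max(values) - min(values)
    return {
        "mean": round(mean(values), 2),
        "spread": round(spread, 2),
        "agreed": spread <= tolerance,
    }


# Invented scores for a single article.
result = consensus({"model_a": 2.0, "model_b": 2.5, "model_c": 3.0})
```

Articles where `agreed` is false are the ones worth a human read, since the models disagree about the framing.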

If you run a newsroom and want to see your publication’s analysis, get in touch.