My talk at the Artificial Intelligence, Manipulation, and Media Literacy Summit in Cyprus

I attended the Artificial Intelligence, Manipulation, and Media Literacy Summit in Cyprus yesterday. 

Session 3 followed a question-and-answer format, which I prefer to delivering a straight talk.

I asked Claude.ai to format my answers. I should note that this formatting misses the anecdotes and the livelier moments; those are the benefits of attending in person. Still, for those reading here, this is what I talked about. I was last in Cyprus in 2017, and it was nice to catch up with some friends from that time. It was also lovely to meet colleagues from Eastern Mediterranean University in Cyprus, with whom we will stay in touch, and to listen to Greek colleagues whose work I knew from fact-checking circles.

1- Beyond misinformation detection, how are AI systems themselves reshaping what counts as credible knowledge — and who gets to define it?

The deeper issue isn’t that AI spreads misinformation — it’s that AI systems are quietly becoming the arbiters of epistemological legitimacy. When a language model decides what counts as a “mainstream” view versus a “fringe” one, it isn’t doing neutral curation — it’s encoding a particular hierarchy of credibility that reflects the institutional sources dominant in its training data.

Think about what “credible” meant before: peer review, editorial gatekeeping, source triangulation. AI systems replicate those signals structurally without the accountability mechanisms that came with them. The result is a new epistemic authority that is both more pervasive and less legible than any previous one.

A useful framing for the audience: Foucault’s “regime of truth” — every society has mechanisms for distinguishing true from false, and those mechanisms are never innocent. AI is becoming a new such mechanism, but one that is privately owned, opaque, and globally scaled.

Who defines credibility? Effectively, the organizations with the infrastructure to train frontier models: a handful of US and Chinese tech corporations. This is not a technical question — it’s a geopolitical and democratic question.

2- Large language models are trained on massively uneven data — certain languages, cultures, and worldviews dominate. How does this structural imbalance affect whose reality gets normalized in AI-mediated information environments?

This is where my epistemic colonialism framework lands directly. The data asymmetry isn’t accidental — it’s structural. English accounts for roughly 45–50% of Common Crawl; Turkish, Greek, and Arabic together are a fraction of that. This means:

  • Concepts that exist in those languages but lack clean English equivalents tend to get flattened or lost
  • Historical narratives about the same events (say, the Ottoman period, or Greek-Turkish relations) are represented through whichever version had more English-language web presence
  • Value frameworks — what counts as justice, family, community, authority — are implicitly calibrated toward WEIRD (Western, Educated, Industrialized, Rich, Democratic) norms

For this specific audience (Türkiye–Greece), a sharp example: ask any major LLM about contested historical events in the region. You’ll notice the answers tend to hover close to the NATO-aligned, Western academic consensus — not because that’s necessarily wrong, but because that’s what the training data over-represents. It’s normalization by data weight, not by argument.

The Te Hiku Media case from New Zealand is always powerful here — a Māori community that refused to let their language data be absorbed into general-purpose AI training precisely because they understood this dynamic. Data sovereignty as epistemic sovereignty.

3- We talk a lot about disinformation as deliberate manipulation — but what about the more subtle erosion of shared epistemic ground through personalized AI outputs? Can we still have a “common information environment”?

This is arguably the most undertheorized threat in public discourse. Misinformation we can at least point to and contest. But epistemic fragmentation through personalization is harder to name — there’s no single lie, just a slow divergence in what different communities consider obvious.

The classic worry was filter bubbles in social media. AI goes further: it doesn’t just curate existing content, it generates tailored content. Two people in the same city asking the same question may receive answers with subtly different framings, different emphases, different implied values — and neither will know this happened.

Can we still have a common information environment? I’d argue we can’t assume one anymore — we have to build it deliberately. This is actually a classical Habermasian problem: the public sphere requires shared communicative conditions, and those conditions are now being privatized and individualized at the infrastructure level.

What does this mean practically? It suggests that shared media literacy — not just individual critical thinking — needs to become a civic infrastructure priority. Common reference points (public broadcasting, shared curricula, civic deliberation) matter more, not less, in an AI-mediated information landscape.

4- Authoritarian actors are increasingly attempting to shape AI training data to embed their narratives at the infrastructural level. Are media literacy and fact-checking sufficient to counter this kind of systemic, pre-emptive manipulation?

This is the question where my AI Grooming framework is most directly applicable, so let me name it explicitly.

AI Grooming — the systematic, pre-emptive manipulation of AI training data by authoritarian actors to embed preferred narratives at the infrastructural level — represents a qualitatively different threat than classic disinformation. Classic disinformation is post-hoc: a false claim is made, fact-checkers can identify it, platforms can label or remove it. AI Grooming is pre-emptive: the manipulation happens upstream, in the data that shapes the model’s baseline assumptions, before any specific claim is made.

The Pravda Network is a clear case study: hundreds of low-quality sites generating AI-assisted content that pollutes the training corpus of future models. By the time a model is trained on this, fact-checking individual outputs is like trying to filter a river downstream when the contamination is at the source.

Media literacy and fact-checking are necessary but insufficient for this. They were designed for a world where manipulation was identifiable and discrete. Structural, infrastructural manipulation requires structural responses:

  • Auditing regimes for training data (analogous to environmental impact assessments)
  • Data provenance standards — knowing what sources shaped a model
  • Public and civil society participation in AI governance, not just regulatory compliance
  • Investment in non-commercial, linguistically diverse corpora (this is where the Greek and Turkish language communities have a direct stake)

5- Trust in media has traditionally been built through institutional accountability — editorial standards, transparency, public interest obligations. How do we rebuild or reimagine these foundations when the primary information intermediary is an opaque AI system?

The traditional accountability architecture of media trust rested on identifiable actors: editors who could be named, standards bodies that published guidelines, ombudsmen who handled complaints. The accountability chain was imperfect, but it was legible.

AI intermediaries break this chain in at least three ways:

  1. Attribution opacity: who is responsible for a specific AI output?
  2. Process opacity: what data, weights, and fine-tuning produced this response?
  3. Consistency opacity: the same question asked differently may yield different answers — there’s no stable “editorial position” to hold accountable

So how do we rebuild? A few possible directions worth raising:

  • Algorithmic transparency requirements: not full model disclosure (which raises its own problems), but meaningful disclosure of training data sources, content policies, and known failure modes — analogous to nutritional labeling
  • Independent audit infrastructure: third-party technical audits of AI systems used in news, public information, and civic contexts
  • Platform liability reforms: extending accountability frameworks to AI systems as information intermediaries
  • Re-centering public media: public broadcasters, despite their flaws, have public interest mandates and democratic accountability structures that commercial AI firms lack. In both Türkiye and Greece, the question of what public media could mean in an AI era is live and urgent.

 

