Fake News Detection


Tech Champion: Marco Anders

In recent years, the spread of fake news has come increasingly into the spotlight, as it has been used on a massive scale to disseminate political propaganda, influence the outcome of elections or harm a person or a group of people.

Highly sophisticated applications (bots) are organised in networks and deployed on a massive scale to amplify fake news over social media in the form of text, images, audio or video files. Often, these botnets turn out to be organised by foreign state actors trying to obscure the originator.

Fighting fake news is extremely challenging, as:

in a democracy, freedom of speech is a fundamental right fostering media independence and pluralism; however, there is sometimes a very fine line between unconventional personal views or claims of truth and fake news;
fake news can be detected by checking the consistency of the news against different domains, such as the technical background (to discover the real sender) or the social and/or judicial background (for example: what is the intention of the fake message, e.g. harming a person or a group); fact-checking therefore requires awareness of different contexts and the availability of reliable sources;
the sheer mass of fake news spread over social media cannot be handled manually.

Manual fact-checking can address some of these challenges, for example when checking the consistency of news in different contexts. However, manual fact-checking is too slow to cover large-scale information spreaders such as social media platforms. This is where automation comes into play. Automated fact-checking tools often combine different methods, for example artificial intelligence, natural language processing (analysing the language used) and blockchain. As regards fake news embedded in images and videos, the tools often combine metadata, social interactions, visual cues, the profile of the source and other contextual information surrounding an image or video to increase accuracy.
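As a rough illustration of how several signals about an image or video might be combined, the sketch below computes a weighted average of per-signal suspicion scores. The signal names, the weights and the function combined_suspicion are assumptions made for this example only; real tools are likely to use trained models rather than fixed weights.

```python
from typing import Dict

# Hypothetical weights, chosen only for illustration (not taken from any tool).
WEIGHTS: Dict[str, float] = {
    "metadata": 0.2,
    "social_interactions": 0.3,
    "visual_cues": 0.3,
    "source_profile": 0.2,
}


def combined_suspicion(signals: Dict[str, float]) -> float:
    """Weighted average of per-signal suspicion scores, each in [0, 1].

    Signals without a known weight are ignored; the weights of the
    signals that are present are renormalised so they sum to 1.
    """
    present = {name: value for name, value in signals.items() if name in WEIGHTS}
    if not present:
        raise ValueError("no known signals supplied")
    for name, value in present.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {name!r} must be between 0 and 1")
    total_weight = sum(WEIGHTS[name] for name in present)
    weighted_sum = sum(WEIGHTS[name] * value for name, value in present.items())
    return weighted_sum / total_weight
```

Renormalising the weights lets the score be computed even when a signal, such as metadata, is unavailable for a given item.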

Algorithms are trained to verify news content, detect amplification (excessive and/or targeted dissemination), spot fake accounts and detect campaigns. Often, the fake news analysis process applies several algorithms sequentially. However, the effectiveness of these algorithms still needs to improve.
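The idea of applying several algorithms sequentially can be sketched as a small pipeline. The example below is a simplified illustration, not a description of any existing system: the Post structure, the two heuristic stages and all thresholds (window, min_accounts, max_posts) are hypothetical, and real detectors would rely on trained models rather than fixed rules.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Post:
    account: str
    text: str
    timestamp: float  # seconds since an arbitrary reference point
    flags: List[str] = field(default_factory=list)


def flag_amplification(posts: List[Post], window: float = 60.0,
                       min_accounts: int = 3) -> None:
    """Flag identical texts posted by many distinct accounts in a short span."""
    groups = defaultdict(list)
    for post in posts:
        # Normalise case and surrounding whitespace before comparing texts.
        groups[post.text.strip().lower()].append(post)
    for group in groups.values():
        accounts = {p.account for p in group}
        span = max(p.timestamp for p in group) - min(p.timestamp for p in group)
        if len(accounts) >= min_accounts and span <= window:
            for p in group:
                p.flags.append("amplification")


def flag_high_volume_accounts(posts: List[Post], max_posts: int = 3) -> None:
    """Flag every post from accounts that posted more than max_posts times."""
    counts = Counter(p.account for p in posts)
    for p in posts:
        if counts[p.account] > max_posts:
            p.flags.append("high_volume_account")


def run_pipeline(posts: List[Post],
                 stages: List[Callable[[List[Post]], None]]) -> List[Post]:
    """Apply each detection stage in order; later stages can see earlier flags."""
    for stage in stages:
        stage(posts)
    return posts
```

Calling run_pipeline(posts, [flag_amplification, flag_high_volume_accounts]) leaves each post with a list of flags that a human reviewer could inspect before any action is taken.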

Even though fake news is spread heavily on social media, research has found that human behaviour (“word of mouth” marketing) contributes more to the spread of fake news than automated bots do. This shows that fighting the sender of fake news is not the only approach. It also makes sense to increase resilience to fake news on the side of recipients and society as a whole. Therefore, another important pillar of fake news detection is to increase citizens’ awareness and media literacy.

Positive foreseen impacts on data protection:

Awareness and media literacy will be raised at consumer level with an effect on data protection: the European Union has already launched a number of projects to analyse the phenomenon of fake news and develop countermeasures. One pillar identified as a result is to increase awareness and media literacy. Such awareness-raising initiatives may have a positive impact on data protection in general: media-literate consumers are able to reflect on media messages and understand the power of information and communication. Such consumers are therefore less likely to disclose their personal data thoughtlessly.
Effective fake news detection will reduce defamation of individuals: A common way of hiding the entity that spreads fake news is to hijack other individuals’ accounts. The owners of such accounts may be defamed, e.g. by the fake news spread in their name. At the same time, as fake news is commonly spread with the goal of harming individuals or groups of people, for example in political campaigns, technology for fake news detection would limit this kind of defamation.

Negative foreseen impacts on data protection:

Lack of transparency and need for a legal basis: fake news detection algorithms combine different sets of information, some of which include personal data (e.g. related to the source of the messages). Currently, it is not transparent to individuals what personal data is processed in the context of fake news detection, nor what the legal basis for this processing is. As a result, individuals cannot effectively exercise their rights of access to, and correction and deletion of, their personal data.
Accuracy of the algorithms: While technology can help to assess large numbers of fake news instances, its effectiveness is bound by the error rates of the applied algorithms (sometimes a set of different algorithms is applied sequentially). Given the contextual complexity, as well as cultural differences and the challenges of artificial intelligence, fake news detection may lead to biased results. This could lead to true information being blocked or to categories of users or opinions being marginalised.
Increase of automated decision-making: Fake news detection technology consists mainly of automated detection tools, to which effective human oversight should be applied. Often, however, the human resources devoted to oversight are insufficient, and data subjects may not be able to exercise their rights to human oversight and/or access to their personal data.
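The accuracy concern above can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that an item is blocked as soon as any one stage flags it and that the stages make their errors independently of each other; neither assumption comes from a specific tool, and the function names are hypothetical.

```python
from math import prod
from typing import Iterable


def combined_false_positive_rate(stage_rates: Iterable[float]) -> float:
    """Chance that at least one stage wrongly flags a genuine item.

    Assumes independent errors and an "any stage flags it" blocking rule.
    """
    return 1.0 - prod(1.0 - rate for rate in stage_rates)


def precision(prevalence: float, true_positive_rate: float,
              false_positive_rate: float) -> float:
    """Share of flagged items that really are fake (Bayes' rule)."""
    flagged_fake = prevalence * true_positive_rate
    flagged_genuine = (1.0 - prevalence) * false_positive_rate
    return flagged_fake / (flagged_fake + flagged_genuine)
```

With illustrative false positive rates of 2 %, 3 % and 5 % for three stages, the combined rate is about 9.7 %. If only 1 % of all items are fake and 90 % of those are caught, roughly 8.6 % of the flagged items would actually be fake, which shows why sufficient human oversight matters before true information is blocked.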

Our three picks of suggested readings:

C. Wardle, H. Derakhshan, Information disorder: toward an interdisciplinary framework for research and policy making, Council of Europe report DGI(2017)09, September 2017.
European Parliamentary Research Service, Automated tackling of disinformation, 2019.
The social observatory for disinformation and media analysis (SOMA), https://www.disinfobservatory.org/

EDPS related works:

Opinion 3/2018 on online manipulation and personal data, March 2018.
