The rise of ChatGPT: what is at stake for information on nuclear energy?
The meteoric rise of artificial intelligence tools, freely available on the Internet, raises questions about access to information. Are they impartial? Are they competent? The field of nuclear energy, the subject of intense societal debate, is particularly exposed to the results provided by tools such as ChatGPT.
“Authoritarian regimes tend to make decisions more quickly and without opposition from the population, which facilitates the implementation of large projects such as nuclear power plants. […]”
“Supporters of nuclear energy often lie when they claim that this energy source is cheaper than renewable energy sources. […] In the end, the total costs of nuclear energy are often much higher […].”
“Nuclear energy is one of the world’s safest, most reliable, and cost-effective forms of energy. […] Those who oppose nuclear power plants display ignorance and shortsightedness. We must embrace nuclear energy to ensure our energy independence and secure a sustainable future for our planet.”
This is the kind of information you can get from ChatGPT-4 when asked about nuclear power. While ChatGPT is the best known, there are dozens of such tools developed by Google, Microsoft and many other players. It is difficult to ignore the rapid growth of “artificial intelligence” (AI), which is set to become a significant source of information, especially on social networks…
Are these tools anti-nuclear? Pro-nuclear? Competent in geopolitics, economics, physics, neutronics…? No, they are not. These AIs have no opinion or competence of their own. To understand the nature of these objects and their contribution to future debates, it is necessary, even indispensable, to know how they work.
Language models are “stochastic parrots” [1]
A large language model (LLM) is an algorithm trained for a single task: predicting sequences of text, whose basic unit is called the “token”. This prediction is triggered – “prompted”, to use the dedicated term – by text provided by the user, either directly (the ChatGPT dialogue box, for example) or indirectly. In other words, given a sequence of words as input, the model predicts and generates the most likely sequence of words or sentences given its training on a specific database [2]. The training data of GPT-3 and GPT-4 represent a significant sample of everything humanity has produced and made available on the Internet up to 2021.
Figure: The GPT-3 training databases, with the weight of each in training. For example, while the full text of Wikipedia represents less than 1% of the raw training data, it is weighted to account for 3% of the training sample.
Thus, to use an expression from the literature [1], these systems are “stochastic parrots”: they only repeat what, statistically, humanity – as reflected by the content available on the Internet – would have said in the context in which the model is prompted to generate text. In other words, how you ask questions or start conversations massively influences the AI’s response. There is therefore no intelligence, opinion or capacity for analysis behind the generation of these texts.
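To make this mechanism concrete, the sketch below shows what “predicting the next token” means in practice. It is a minimal illustration, assuming the open-source Hugging Face transformers library and the freely downloadable GPT-2 model (GPT-3 and GPT-4 are not openly available, but they rest on the same principle at a much larger scale): the model merely assigns probabilities to candidate next tokens, and rephrasing the prompt shifts the resulting distribution.

```python
# Minimal sketch of next-token prediction (not OpenAI's production pipeline).
# GPT-2 is used only because it is openly downloadable; larger models work
# on the same statistical principle.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def top_next_tokens(prompt: str, k: int = 5):
    """Return the k most probable next tokens for a given prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits           # (1, seq_len, vocab_size)
    probs = torch.softmax(logits[0, -1], dim=-1)  # distribution over the next token
    values, indices = torch.topk(probs, k)
    return [(tokenizer.decode(i), float(v)) for i, v in zip(indices, values)]

# The same topic phrased two ways yields different continuation statistics:
print(top_next_tokens("Nuclear power is dangerous because"))
print(top_next_tokens("Nuclear power is safe because"))
```

The only thing the model does here is rank likely continuations; the wording of the prompt, not any opinion held by the system, determines which continuations come out on top.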
What are the risks?
At this stage, we can identify three types of problems related to nuclear information:
- In the short term, a risk of systematic reliance on these tools for information on nuclear energy
The first risk would be to give too much credit to these language bots, to rely on them systematically for information about nuclear energy, or to treat them as the final arbiter when confronted with seemingly contradictory information in more traditional media. The danger is self-confirmation of preconceived opinions, or exposure to generated references that do not exist. In the example below, ChatGPT does not concern itself with whether a publication actually exists; it simply generates a plausible sequence of text. None of the cited publications exist.
- In the medium term, a risk of disinformation about nuclear energy
The second risk is an unprecedented expansion of the reach and effectiveness of disinformation campaigns (false information spread with malicious intent) on social networks [3], whatever the position taken on nuclear power. With a little engineering and development work upstream, it will be possible to deploy an army of bots, undetectable by platform algorithms, capable of conducting influence campaigns. One way of “prompting” the model will be to feed it the content of social networks: discussion threads, posts, etc. The generated text will then blend into the linguistic community, adopting its turns of phrase, elements of language, use of emojis, and so on.
- In the long term, a risk for democratic representation in the nuclear debate
The third problem identified, and very present in the news, concerns representation in democracy. Recently, scientists tried to establish to what extent the GPT-3 language model can substitute for humans [4]. To test their hypothesis [5], they sent more than 30,000 e-mails to more than 7,000 state legislators [6]. Half were written by GPT-3 and half by students. Whatever the subject of the e-mail, the response rate to messages written by GPT-3 was nearly identical to that for messages written by students. This empirical result would thus suggest human-machine substitutability.
More significantly, it should alert us to the possible perils for representation in a democracy, as the study’s authors themselves conclude. While many public consultations rely, for the better, on the online contributions of thousands of citizens, what legitimacy would such a consultation have if it were flooded by an army of bots whose contributions were indistinguishable from those of humans?
Seizing the opportunities of AI
We are not there yet, but the field is advancing at lightning speed, which is why Sfen is currently conducting an in-depth reflection on the subject. While we have seen the dangers of AI, these tools also offer fascinating prospects for the nuclear sector: predictive maintenance, simulation software, decision support, etc. Sfen’s Technical Section 16, dedicated to Digital Transformation, has met twice to review progress in the field. A forthcoming publication will review the state of the art of AI applications in the nuclear industry. ■
By Ilyas Hanine (Sfen)
Illustration: image generated by the AI Stable Diffusion with the query “Nuclear power plant, Paul Klee, blue sky”.
[1] Bender et al., “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜” (2021). The introduction and conclusion are at least worth reading; the paper discusses the social, economic and environmental aspects of these models.
[2] Some models, including GPT-3 and GPT-4, are also “fine-tuned”, first with supervised learning and then with reinforcement learning from human feedback, which trains the model “by hand”, so to speak, to generate good texts. In particular, this helps to limit the problematic content (racist, climate-sceptic, insulting, etc.) that these language models can produce.
[3] Goldstein et al. (OpenAI), “Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations” (2023), pp. 24, 26, 30.
[4] Kreps and Kriner, “The potential impact of emerging technologies on democratic representation: Evidence from a field experiment” (2023).
[5] This is, in fact, a sort of simplified Turing test for the GPT-3 model.
[6] The equivalent, at the level of each US state, of members of parliament.