Volume 59 Issue 04 May 2026
Happening Now

A Contract of Trust: Artificial Intelligence Usage for SIAM Journal Submissions

Irresponsible usage of artificial intelligence for scientific literature is growing, often leading to fabricated references and erroneous results. Image courtesy of SIAM. 

The scholarly ecosystem is based on a contract of trust between authors, editors, referees, and readers of scientific publications. Although it may appear as though the diligent SIAM editors and referees are checking every detail of every submission, they are not and have never been expected to do so. Here, for instance, is some guidance that the SIAM Journal on Mathematics of Data Science gives to its referees:

A referee is expected to read the paper with sufficient care to be confident that it is mathematically sound. Nevertheless, it is not necessary to check every detail. The author has final responsibility for the content.

It is impossible for a referee to check every line of every proof, every algorithm, every experiment, and every reference. They spend just a few hours evaluating work that is not their own and potentially outside their exact specialty; in contrast, authors are experts in their own work and have spent months or years developing it. Consequently, significant trust is placed in SIAM authors for the content of their submissions. There is an expectation that they will meet or exceed the usual standards of responsible scholarship, and, in general, they do!

However, there are rising concerns as researchers more frequently incorporate artificial intelligence (AI) tools into their work. Although there is no doubt that AI can be beneficial to scientific research endeavors, AI misuse can result in poor scholarship, which inevitably hurts other researchers and negatively impacts the profession as a whole. Not only does it demand more time from referees and editors, but any mistakes that make it past review (as they invariably do) spread confusion. It is hard enough to read a mathematical paper, and it is even harder when the paper contains mistakes. This leads to wasted time and effort for readers and can ultimately damage the reputation of the field [6, 9, 12, 13].

A particular issue is the rise of fabricated references. These have been both documented in published scientific works [6, 9] and observed in submissions to SIAM journals. A recent report [2] elucidated the hazards of hallucinated citations:

The appearance of AI-generated hallucinated citations in peer reviewed literature represents a fundamental challenge to the integrity of scientific discourse. Citations serve as the evidentiary foundation of scholarly work, establishing what prior research has demonstrated, enabling reproducibility, and situating new contributions within existing knowledge. When citations are fabricated, these epistemic functions collapse. Readers cannot verify claims attributed to nonexistent sources. Future researchers waste time searching for papers that were never published. The citation graph that structures scientific knowledge becomes contaminated with false linkages.

Fabricated references are just one way that AI is eroding the trust in scientific literature, and it is incumbent upon all of us to take responsibility for maintaining the integrity of our work and our field.

Author Usage of AI Tools

SIAM authors (all of whom must be human) may use AI tools, provided that they properly disclose their usage per the SIAM Publications AI Policy. AI tools can be extremely useful for a variety of tasks; SIAM authors have successfully used AI tools for finding related literature, developing proofs and algorithms, coding, running experiments, and getting feedback on drafts. AI tools can be a powerful aid to research and scholarship, hence we do not discourage their use per se, but we insist upon responsible use and appropriate disclosures.

AI and Poor Scholarship

There are several common failure modes that researchers should be aware of when utilizing AI. AI tools can generate impressive text, code, and even mathematical proofs, but because their outputs generally appear highly plausible, thanks to their training on vast amounts of data, significant underlying errors may be difficult to detect. Therefore, we encourage both users of AI tools and those who review their work, such as research supervisors, to be aware of these failure modes and to check for them carefully. Here, we mention a few of the most common failure modes, but this list is not exhaustive.

  • Fabricated references: This may mean references that do not exist, have incorrect titles or author lists, and so on. It can also mean citing a result that is not actually in the cited paper [2, 3].
  • Unattributed appropriation of scientific ideas: AI tools contain vast troves of unattributed information, and they can easily produce ideas, including key mathematical and algorithmic innovations, that are not properly credited to original sources [1, 8].
  • Mathematical and coding errors: AI tools can produce mathematical content and code that appear correct but contain subtle errors that can fool all but the most knowledgeable experts. This is difficult to study precisely, but see [7] for discussion.
  • Inaccurate figures: AI tools may generate artificial data or improperly graph real data, leading to figures that are inaccurate or misleading [4, 10, 11].

Referee Usage of AI Tools

Referees play a crucial role in our scholarly community; their critical feedback enables editors to identify the most significant scientific contributions to our journals. Additionally, referee feedback can be invaluable to authors by offering them significant ways to improve their work. Referee use of AI tools is another area of growing concern [5].

Because of the confidential nature of referee work, SIAM currently prohibits its referees from using AI tools. SIAM is exploring how AI tools might support referees in the future. We would stress, however, that such tools may only be used to assist referees in their work. They might be used for tasks such as replicating computational experiments, filling in steps in a proof, or finding related works. A key ingredient in the scholarly contract is that referee reports are based on the expertise and judgement of the referees themselves.

Penalties for Irresponsible AI Usage

Because we operate on a contract of trust, poor scholarship in any part of a manuscript makes everything suspect. How can we trust the mathematical content if some of the references are fabricated? How do we make sense of the results if the figures are nonsensical? Even if the primary content appears to be correct, poor scholarship in other parts of the manuscript can undermine confidence in the work as a whole.

For this reason, SIAM editors and referees may reject an article for poor scholarship, even if there is no specific technical error. SIAM may impose additional penalties, such as a ban on future submissions to SIAM publications. We highlight two areas that have already resulted in author integrity investigations at SIAM:

  1. Increasing numbers of fabricated references are appearing in SIAM submissions. We expect that we’ve only found the tip of the iceberg, as articles with fraudulent references have thus far only come to light by happenstance, such as a referee finding their own name attached to a reference they didn’t write. Fabricated references are a serious violation of academic integrity. If we discover that submitted content (including work that has already been published) contains fabricated references, the consequences will include banning the authors from submitting to SIAM publications for a minimum of one year.
  2. Substandard submissions are increasing, apparently due to the ease of creating mathematical papers with the help of AI. Substandard submissions include papers that fail to clarify their mathematical contributions, are very far outside the journal scope, have limited mathematical substance, have pages of equations without context, contain incoherent or incorrect arguments, or have insufficient or inappropriate references. We certainly understand that even excellent authors have occasional weak papers, and we do not intend to penalize authors for a single substandard submission. However, authors who repeatedly submit low-quality work at a high rate will be subject to a ban from submitting to SIAM publications for a minimum of one year.

Key Takeaways

Responsible use of AI is essential to maintaining the integrity of our journals and our scientific work. We ask that SIAM authors continue to maintain their usual highest standards in their research, extending this to the new domain of research assisted by AI. Further, we encourage conversations on this topic with coauthors and colleagues, sharing best practices and staying alert to potentially problematic usage of AI.

For those who want to explore the broader impact that irresponsible use of AI tools has on scientific literature, we recommend reading through the references provided.

References 
[1] Ananya. (2025). What counts as plagiarism? AI-generated papers pose new risks. Nature. Retrieved from https://www.nature.com/articles/d41586-025-02616-5
[2] Ansari, S. (2026). Compound deception in elite peer review: A failure mode taxonomy of 100 fabricated citations at NeurIPS 2025. Preprint, arXiv:2602.05930. 
[3] Bienz, A., Pearson, C., & de Gonzalo, S.G. (2026). The case of the mysterious citations. Preprint, arXiv:2602.05867. 
[4] Bik, E. (2024). The rat with the big balls and the enormous penis – how Frontiers published a paper with botched AI-generated images. Science Integrity Digest. Retrieved from https://scienceintegritydigest.com/2024/02/15/the-rat-with-the-big-balls-and-enormous-penis-how-frontiers-published-a-paper-with-botched-ai-generated-images/
[5] Chawla, D.S. (2024). Is ChatGPT corrupting peer review? Telltale words hint at AI use. Nature. Retrieved from https://www.nature.com/articles/d41586-024-01051-2
[6] Conroy, G. (2023). Scientific sleuths spot dishonest ChatGPT use in papers. Nature. Retrieved from https://www.nature.com/articles/d41586-023-02477-w
[7] Guo, D., Liu, J., Fan, Z., He, Z., Li, H., Li, Y., Wang, Y., & Fung, Y.R. (2025). Mathematical proof as a litmus test: Revealing failure modes of advanced large reasoning models. Preprint, arXiv:2506.17114.
[8] Gupta, T. & Pruthi, D. (2025). All that glitters is not novel: Plagiarism in AI generated research. In Proceedings of the 63rd annual meeting of the Association for Computational Linguistics (Vol. 1: Long papers) (pp. 25721-25738). Vienna, Austria. 
[9] Jacobs, P. (2025). One-fifth of computer science papers may include AI content. Science, 389(6760). Retrieved from https://www.science.org/content/article/one-fifth-computer-science-papers-may-include-ai-content
[10] Kwon, D. (2024). AI-generated images threaten science — here’s how researchers hope to spot them. Nature. Retrieved from https://www.nature.com/articles/d41586-024-03542-8
[11] Landymore, F. (2025). GPT-5 launch demo plagued with catastrophically dumb errors. Futurism. Retrieved from https://futurism.com/gpt-5-demo-dumb-errors
[12] Liang, W., Zhang, Y., Wu, Z., Lepp, H., Ji, W., Zhao, X., … Zou, J. (2025). Quantifying large language model usage in scientific papers. Nat. Hum. Behav., 9, 2599-2609. 
[13] Stokel-Walker, C. (2024). AI chatbots have thoroughly infiltrated scientific publishing. Scientific American. Retrieved from https://www.scientificamerican.com/article/chatbots-have-thoroughly-infiltrated-scientific-publishing/

About the Author