I-DEEL: Inter-Disciplinary Ecology and Evolution Lab

Around meta-analysis (15): emerging Large Language Model (LLM) tools

30/1/2024

by Malgorzata (Losia) Lagisz

Systematic reviews (and meta-analyses based on a systematic review of the literature) are extremely time-consuming. Anyone who has conducted one in a rigorous and robust way can attest to this fact. Not surprisingly, researchers across disciplines have been looking to use computer algorithms and software to automate and accelerate systematic reviews of academic literature.
Such efforts have brought some success. Algorithms based on text mining, artificial intelligence (AI), and more specifically machine learning, are now integrated into some of the popular software dedicated to literature screening (e.g., Rayyan, Abstrackr, ASReview) and even data extraction (e.g., RobotReviewer). Another group of algorithms can suggest relevant evidence based on similarities among documents (e.g., ConnectedPapers, and the recommendation systems built into major literature search platforms). However, these tools perform well only in a limited set of scenarios and applications, require a substantial initial investment in training and expertise, and many are not freely accessible. For a recent scoping review of the diverse types of automation tools, their applications, and their drawbacks, see Khalil et al. (2022).
It is tempting to think that the recent generation of AI models and software offers better performance and new capabilities. In particular, Large Language Models (LLMs) are trained on large datasets of written language (think ChatGPT and similar models). They can be operated using prompts in conversational language, rather than technical programming languages, which makes them user-friendly. Why not ask them to find relevant studies, highlight or summarise relevant information, or do the screening for you?
Unfortunately, generic LLMs, like ChatGPT, are less than ideal for systematic reviews. They tend to hallucinate (invent evidence), are not accurate, and require expert knowledge and careful set-up to provide useful output (Qureshi et al. 2023). Among the many likely reasons for the poor performance, one is their probabilistic nature (making decisions based on the probabilities of patterns), and another is that ChatGPT models are not trained specifically on academic literature. They were also not really designed to do work for scientists.

Are there LLMs tailored to academic literature and the requirements of researchers? In recent months, such tools have been emerging rapidly (Sanderson 2023). Since they are new, there are no rigorous published assessments of their performance. I tried a few out, but cannot provide any concrete data or recommendations yet. I think we should not aim to fully automate any systematic review steps but, instead, we can use such new tools as "another reviewer" or an alternative approach that supplements and strengthens our existing workflows.
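To make the "another reviewer" idea concrete, below is a minimal sketch of how a general-purpose LLM could be prompted to give an independent screening decision on a single title and abstract. This is an illustration, not a recommendation: it assumes the OpenAI Python client and an API key, and the model name, review topic, and prompt wording are placeholders invented for this example. Any decisions produced this way should be logged and compared against a human screener, never accepted automatically.

```python
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Placeholder screening prompt -- the review topic and decision labels
# are illustrative and should be replaced with your own criteria.
SCREENING_PROMPT = (
    "You are screening studies for a systematic review on thermal "
    "tolerance in ectotherms. Based on the title and abstract below, "
    "answer with exactly one word: INCLUDE, EXCLUDE, or UNSURE.\n\n"
    "Title: {title}\nAbstract: {abstract}"
)

def screen_record(title: str, abstract: str) -> str:
    """Ask the model for an independent screening decision on one record."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        temperature=0,          # reduce run-to-run variability
        messages=[{
            "role": "user",
            "content": SCREENING_PROMPT.format(title=title, abstract=abstract),
        }],
    )
    return response.choices[0].message.content.strip()

# Example call; in practice, loop over all records and record
# agreement and disagreement with the human screeners.
print(screen_record("Example title", "Example abstract text..."))
```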
If you are interested in LLM-based tools that look potentially useful in systematic review workflows (and in testing how much you can trust them!), here is my short list of suggestions:
  • Elicit (elicit.com)
  • Scite (scite.ai)
  • Typeset (typeset.io)
  • Consensus (consensus.app)

References:
  • Khalil H, Ameen D, Zarnegar A. Tools to support the automation of systematic reviews: a scoping review. J Clin Epidemiol. 2022;144:22-42. doi: 10.1016/j.jclinepi.2021.12.005
  • Qureshi R, Shaughnessy D, Gill KAR, et al. Are ChatGPT and large language models "the answer" to bringing us closer to systematic review automation? Syst Rev. 2023;12:72. doi: 10.1186/s13643-023-02243-z
  • Sanderson K. AI science search engines are exploding in number - are they any good? Nature. 2023;616(7958):639-640. doi: 10.1038/d41586-023-01273-w