SLR-Magic: Automating Systematic Literature Review Workflows with Large Language Models

A Google Apps Script tool based on large language models that automates the screening and data extraction stages of systematic literature reviews, improving research efficiency and reducing human bias.

Tags: Systematic Literature Review · Large Language Models · Academic Research · Automated Screening · Google Apps Script · Evidence Synthesis · Research Methodology
Published 2026-04-29 03:36 · Recent activity 2026-04-29 03:54 · Estimated read: 10 min

Section 01

[Introduction] SLR-Magic: A Large Language Model-Driven Tool for Automating Systematic Literature Reviews

SLR-Magic is a Google Apps Script tool built on large language models, designed to automate the two most tedious stages of the systematic literature review (SLR) workflow: screening and data extraction. By integrating directly into the Google Workspace environment, it tackles the time-consuming, labor-intensive, and subjectively biased nature of traditional manual SLR work, improving research efficiency and reducing human bias. Its core design philosophy is to augment researchers' capabilities rather than replace them, promoting a model of human-machine collaboration in academic research.


Section 02

[Background] Methodological Pain Points of Systematic Literature Reviews

The systematic literature review (SLR) is the gold standard for evidence synthesis in scientific research, requiring strict protocols for comprehensive retrieval, screening, appraisal, and synthesis. Traditional manual workflows, however, face significant challenges: a typical SLR project involves processing thousands of documents, with multiple researchers evaluating them independently and resolving discrepancies. The process is not only time-consuming and labor-intensive but also prone to fatigue, divergent subjective judgments, and cognitive biases, which undermine both efficiency and the reliability of conclusions.


Section 03

[Methodology] SLR-Magic's Automated Solutions and Core Modules

SLR-Magic focuses on the two most time-consuming stages in the SLR workflow:

Automated Screening: Reads each document's title and abstract and makes a preliminary judgment against user-defined inclusion and exclusion criteria. Large language models understand the semantics of the criteria and can recognize the same concept expressed in different terminology;

Automated Data Extraction: Automatically extracts key information such as research design, sample characteristics, and main findings from screened documents, replacing the tedious manual work of filling out data extraction forms. The tool seamlessly integrates with Google Workspace, requiring no complex software installation or server configuration.
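The screening step described above can be sketched as a prompt-construction function. This is an illustrative sketch, not SLR-Magic's actual API: the function name, criteria format, and JSON reply schema are all assumptions.

```javascript
// Hypothetical sketch of the screening step: build a prompt that asks an
// LLM to apply user-defined inclusion/exclusion criteria to one record.
// Function and field names are illustrative, not SLR-Magic's actual API.
function buildScreeningPrompt(criteria, title, abstract) {
  return [
    "You are screening records for a systematic literature review.",
    "Apply ONLY the following inclusion/exclusion criteria:",
    ...criteria.map((c, i) => `${i + 1}. ${c}`),
    "Judge ONLY from the text below; do not use outside knowledge.",
    `Title: ${title}`,
    `Abstract: ${abstract}`,
    'Reply as JSON: {"decision": "include" | "exclude" | "unsure",',
    ' "confidence": <number between 0 and 1>, "reason": "<one sentence>"}',
  ].join("\n");
}
```

Instructing the model to answer only from the supplied title and abstract, and to return a constrained JSON shape, keeps decisions auditable and machine-parseable downstream.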


Section 04

[Technology] SLR-Magic's Technical Architecture and Quality Control

Google Apps Script Integration

The platform was chosen because it integrates natively with Google Sheets (a document-management tool researchers already use), requires no additional infrastructure, and uses OAuth authentication to keep data secure.
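A concrete touchpoint with Sheets is the 2-D array that Apps Script's `Range.getValues()` returns. A minimal sketch, assuming a header row of column names (the helper name `rowsToRecords` is my own, not part of SLR-Magic):

```javascript
// Hypothetical helper: turn the 2-D array returned by Apps Script's
// Range.getValues() into record objects keyed by the header row.
// Pure JavaScript, so it can be unit-tested outside Apps Script; inside
// Apps Script you would feed it, e.g.,
//   SpreadsheetApp.getActiveSheet().getDataRange().getValues()
function rowsToRecords(values) {
  const [header, ...rows] = values;
  return rows.map(row =>
    Object.fromEntries(header.map((key, i) => [key, row[i]]))
  );
}
```

Keeping the Sheets-specific wiring at the edges and the transformation logic pure is what makes this kind of tool testable despite living inside Apps Script.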

Multi-Model Support

It is compatible with models like Google Gemini and Alibaba Qwen3. Users can choose based on data privacy, cost, and performance needs (e.g., using local models for sensitive medical data, cloud APIs for general research).
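The privacy-versus-cost trade-off described above can be enforced in code rather than left to convention. The sketch below is hypothetical: the registry entries and endpoint URLs are placeholders, not SLR-Magic's real configuration.

```javascript
// Hypothetical model registry; endpoints are placeholders, not real URLs.
const MODEL_REGISTRY = {
  "gemini":      { provider: "google",  endpoint: "https://example.invalid/gemini", cloud: true },
  "qwen3":       { provider: "alibaba", endpoint: "https://example.invalid/qwen",   cloud: true },
  "local-model": { provider: "local",   endpoint: "http://localhost:8080/v1",       cloud: false },
};

// Pick a model while honoring a sensitivity constraint: records flagged
// as sensitive (e.g., medical data) must never be sent to a cloud endpoint.
function pickModel(name, sensitive) {
  const m = MODEL_REGISTRY[name];
  if (!m) throw new Error(`unknown model: ${name}`);
  if (sensitive && m.cloud) {
    throw new Error(`${name} is cloud-hosted; sensitive data requires a local model`);
  }
  return m;
}
```

Failing loudly when a sensitive dataset is paired with a cloud model turns a policy into a guarantee.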

Prompt Engineering and Quality Control

The core challenge is designing reliable prompts: converting complex criteria into model-understandable instructions, handling boundary ambiguity, and designing output formats that facilitate analysis. Quality control includes confidence scoring (low-confidence cases for manual review), random sampling validation, and auditable decision logs to ensure accuracy is not compromised.
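The quality-control loop above (confidence scoring plus random sampling validation) can be sketched as a triage function. Names and thresholds here are illustrative assumptions, not SLR-Magic's actual implementation; the injectable `rng` parameter exists only to make the sampling deterministic in tests.

```javascript
// Hypothetical quality-control triage: decisions below the confidence
// threshold go to manual review; a random sample of the auto-accepted
// decisions is drawn for spot-check validation.
function triage(decisions, threshold = 0.8, sampleRate = 0.1, rng = Math.random) {
  const manual = decisions.filter(d => d.confidence < threshold);
  const auto = decisions.filter(d => d.confidence >= threshold);
  const audit = auto.filter(() => rng() < sampleRate); // spot-check sample
  return { manual, auto, audit };
}
```

Logging all three buckets per run would give exactly the auditable decision trail the text calls for.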


Section 05

[Value] Advantages of Automating to Eliminate Human Bias

Fatigue and Consistency

AI does not tire, and it applies exactly the same evaluation criteria to the first document and the 1,000th, avoiding the judgment drift that attention decay causes during manual screening.

Cognitive Bias

AI is not affected by human psychological factors such as confirmation bias (tendency to support one's own views), anchoring effect (influenced by early judgments), and halo effect (inappropriate weighting of well-known authors/institutions).

Reproducibility

Automated tools standardize the screening and extraction processes, solving the problem of inconsistent conclusions in traditional SLRs due to differences in execution details, and improving research reproducibility and transparency.


Section 06

[Practice] Best Practices for Human-Machine Collaboration

SLR-Magic's design philosophy is to enhance rather than replace researchers. Best practices include:

  • Layered Screening: AI handles clearly relevant/irrelevant documents, leaving boundary cases for human judgment;
  • Iterative Calibration: Calibrate the model with labeled documents before formal operation, adjusting prompts to match human judgments;
  • Dual Validation: Randomly sample AI results for review to assess consistency levels;
  • Transparent Reporting: Clearly state the tool's scope of use and validation methods in the paper.
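For the dual-validation practice, a standard way to quantify AI-versus-human consistency on the sampled records is Cohen's kappa, which corrects raw agreement for chance. This is a minimal sketch of that metric, not code from SLR-Magic.

```javascript
// Cohen's kappa: chance-corrected agreement between two label lists,
// e.g., AI screening decisions vs. a human reviewer's labels on the
// same sampled records. Illustrative sketch, not SLR-Magic code.
function cohensKappa(a, b) {
  if (a.length !== b.length || a.length === 0) throw new Error("label lists must match");
  const n = a.length;
  let agree = 0;
  const countA = {}, countB = {};
  for (let i = 0; i < n; i++) {
    if (a[i] === b[i]) agree++;
    countA[a[i]] = (countA[a[i]] || 0) + 1;
    countB[b[i]] = (countB[b[i]] || 0) + 1;
  }
  const po = agree / n; // observed agreement
  let pe = 0;           // agreement expected by chance
  for (const label of Object.keys(countA)) {
    pe += (countA[label] / n) * ((countB[label] || 0) / n);
  }
  if (pe === 1) return po === 1 ? 1 : 0; // degenerate: a single shared label
  return (po - pe) / (1 - pe);
}
```

A common rule of thumb treats kappa above roughly 0.8 as strong agreement; reporting the observed value alongside the sampling scheme supports the transparent-reporting practice above.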

Section 07

[Discussion] Limitations of the Tool and Ethical Considerations

Model Hallucination Risk

Large language models may "hallucinate" information that does not exist or misread the content. Mitigations: require the model to judge only on the basis of the provided text, build in validation steps, and maintain human oversight.
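One cheap validation step is to reject any model reply that does not conform to the expected output schema, routing it to human review instead of trusting it. A minimal sketch, assuming the JSON reply format used elsewhere in this article (field names are illustrative):

```javascript
// Hypothetical validation step: accept a model reply only if it parses
// as JSON and its fields fall within the allowed values; anything else
// is flagged for human review rather than silently trusted.
const ALLOWED_DECISIONS = new Set(["include", "exclude", "unsure"]);

function validateReply(raw) {
  try {
    const parsed = JSON.parse(raw);
    if (ALLOWED_DECISIONS.has(parsed.decision) &&
        typeof parsed.confidence === "number" &&
        parsed.confidence >= 0 && parsed.confidence <= 1) {
      return { ok: true, value: parsed };
    }
  } catch (_) { /* malformed JSON falls through to human review */ }
  return { ok: false, value: null }; // route to manual review
}
```

Schema validation cannot catch a plausible-but-wrong judgment, which is why the text pairs it with sampling validation and human supervision.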

Training Data Bias

Models learn from large-scale text corpora that contain human biases, and they may perform better on literature in certain languages or fields, so results must be interpreted critically.

Over-reliance Risk

Convenience should not lead researchers to over-rely on the tool and neglect in-depth engagement with the original documents. The tool should instead free up time for tasks that require human judgment (e.g., critical appraisal of research quality).


Section 08

[Conclusion] Future Directions and Impact on the Research Ecosystem

Future Development Directions

Planned feature expansions include automatic quality assessment (e.g., with the AMSTAR tool), evidence-synthesis assistance (identifying heterogeneity across studies), and update monitoring (tracking how newly published documents affect existing conclusions).

Impact on the Research Ecosystem

The tool may democratize evidence synthesis, allowing small teams to undertake review work previously feasible only for large institutions. At the same time, the risk of "review proliferation" warrants vigilance, and the academic community will need quality standards and peer-review mechanisms to match.

Conclusion

SLR-Magic is a worthwhile experiment in AI-assisted academic research: it addresses real SLR pain points and improves both efficiency and quality. Researchers should stay clear-eyed about its limitations, hold to the principle of human-machine collaboration, and keep the technology in service of scientific progress. It is an open-source project worth trying.