SciToolAgent: a knowledge-graph-driven scientific agent for multitool integration

Birhane, A., Kasirzadeh, A., Leslie, D. & Wachter, S. Science in the age of large language models. Nat. Rev. Phys. 5, 277–280 (2023).

Schick, T. et al. Toolformer: language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36, 68539–68551 (2023).

Yang, R. et al. GPT4Tools: teaching large language model to use tools via self-instruction. Adv. Neural Inf. Process. Syst. 36, 71995–72007 (2024).

Guo, T. et al. What can large language models do in chemistry? A comprehensive benchmark on eight tasks. Adv. Neural Inf. Process. Syst. 36, 59662–59688 (2023).

Zhao, W. X. et al. A survey of large language models. Preprint at (2023).

Min, B. et al. Recent advances in natural language processing via large pre-trained language models: a survey. ACM Comput. Surveys 56, 1–40 (2023).

Wang, L. et al. A survey on large language model based autonomous agents. Front. Comput. Sci. 18, 186345 (2024).

Ramos, M. C., Collison, C. J. & White, A. D. A review of large language models and autonomous agents in chemistry. Chem. Sci. 16, 2514–2572 (2025).

Boiko, D. A., MacKnight, R., Kline, B. & Gomes, G. Autonomous chemical research with large language models. Nature 624, 570–578 (2023).

Janakarajan, N., Erdmann, T., Swaminathan, S., Laino, T. & Born, J. Language models in molecular discovery. In Drug Development Supported by Informatics (eds Satoh, H. et al.) 121–141 (Springer, 2024).

Bran, A. M. et al. Augmenting large language models with chemistry tools. Nat. Mach. Intell. 6, 525–535 (2024).

McNaughton, A. D. et al. CACTUS: chemistry agent connecting tool usage to science. ACS Omega 9, 46563–46573 (2024).

Jin, Q., Yang, Y., Chen, Q. & Lu, Z. GeneGPT: augmenting large language models with domain tools for improved access to biomedical information. Bioinformatics 40, btae075 (2024).

Huang, K. et al. CRISPR-GPT: an LLM agent for automated design of gene-editing experiments. Preprint at (2024).

Liu, H. & Wang, H. GenoTEX: a benchmark for evaluating LLM-based exploration of gene expression data in alignment with bioinformaticians. Preprint at (2024).

Ghafarollahi, A. & Buehler, M. J. ProtAgents: protein discovery via large language model multi-agent collaborations combining physics and machine learning. Digital Discovery 3, 1389–1409 (2024).

Jia, S., Zhang, C. & Fung, V. LLMatDesign: autonomous materials discovery with large language models. Preprint at (2024).

Kang, Y. & Kim, J. ChatMOF: an artificial intelligence system for predicting and generating metal–organic frameworks using large language models. Nat. Commun. 15, 4705 (2024).

Wu, H. et al. ChatEDA: a large language model powered autonomous agent for EDA. IEEE Trans. Comput.-Aided Design Integr. Circuits Syst. 43, 3184–3197 (2024).

Ni, B. & Buehler, M. J. MechAgents: large language model multi-agent collaborations can solve mechanics problems, generate new data, and integrate knowledge. Extreme Mech. Lett. 67, 102131 (2024).

Yao, S. et al. ReAct: synergizing reasoning and acting in language models. In International Conference on Learning Representations (2023).

He, J. et al. Control risk for potential misuse of artificial intelligence in science. Preprint at (2023).

Liu, X. et al. ToolNet: connecting large language models with massive tools via tool graph. Preprint at (2024).

Hao, S., Liu, T., Wang, Z. & Hu, Z. ToolkenGPT: augmenting frozen language models with massive tools via tool embeddings. Adv. Neural Inf. Process. Syst. 36, 45870–45894 (2024).

Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. & Yao, S. Reflexion: language agents with verbal reinforcement learning. Adv. Neural Inf. Process. Syst. 36, 8634–8652 (2024).

Ingraham, J. B. et al. Illuminating protein space with a programmable generative model. Nature 623, 1070–1078 (2023).

Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

Atilgan, A. R. et al. Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys. J. 80, 505–515 (2001).

Bakan, A., Meireles, L. M. & Bahar, I. ProDy: protein dynamics inferred from theory and experiments. Bioinformatics 27, 1575–1577 (2011).

Cock, P. J. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422 (2009).

Schwaller, P. et al. Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Central Science 5, 1572–1583 (2019).

Pei, Q. et al. BioT5+: towards generalized biological understanding with IUPAC integration and multi-task tuning. In Findings of the Association for Computational Linguistics: ACL 2024, 1216–1240 (Association for Computational Linguistics, 2024).

Papadatos, G. et al. SureChEMBL: a large-scale, chemically annotated patent document database. Nucleic Acids Res. 44, D1220–D1228 (2016).

Kim, S. et al. PubChem substance and compound databases. Nucleic Acids Res. 44, D1202–D1213 (2016).

Bobbitt, N. S. et al. MOFX-DB: an online database of computational adsorption data for nanoporous materials. J. Chem. Eng. Data 68, 483–498 (2023).

Nandy, A. et al. MOFSimplify, machine learning models with extracted stability data of three thousand metal–organic frameworks. Sci. Data 9, 74 (2022).

Dubbeldam, D., Calero, S., Ellis, D. E. & Snurr, R. Q. RASPA: molecular simulation software for adsorption and diffusion in flexible nanoporous materials. Molecular Simulation 42, 81–101 (2016).

BLAST: basic local alignment search tool. NIH (2024).

RDKit: open-source cheminformatics software. RDKit (2024).

Bajusz, D., Rácz, A. & Héberger, K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J. Cheminformatics 7, 1–13 (2015).

Smith, T. F. & Waterman, M. S. Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197 (1981).

Hu, E. J. et al. LoRA: low-rank adaptation of large language models. In International Conference on Learning Representations (2022).

Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).

Yu, J. Dataset for the paper “SciToolAgent: A knowledge graph-driven scientific agent for multi-tool integration”. Zenodo (2025).

Yu, J. & Ding, K. HICAI-ZJU/SciToolAgent: V1.0.1. Zenodo (2025).