Autonomous chemical research with large language models

  • Brown, T. et al. in Advances in Neural Information Processing Systems Vol. 33 (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, 2020).

  • Thoppilan, R. et al. LaMDA: language models for dialog applications. Preprint (2022).

  • Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint (2023).

  • Hoffmann, J. et al. Training compute-optimal large language models. In Advances in Neural Information Processing Systems 30016–30030 (NeurIPS, 2022).

  • Chowdhery, A. et al. PaLM: scaling language modeling with pathways. J. Mach. Learn. Res. 24, 1–113 (2022).

  • Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

  • Luo, R. et al. BioGPT: generative pre-trained transformer for biomedical text generation and mining. Brief Bioinform. 23, bbac409 (2022).

  • Irwin, R., Dimitriadis, S., He, J. & Bjerrum, E. J. Chemformer: a pre-trained transformer for computational chemistry. Mach. Learn. Sci. Technol. 3, 015022 (2022).

  • Kim, H., Na, J. & Lee, W. B. Generative chemical transformer: neural machine learning of molecular geometric structures from chemical language via attention. J. Chem. Inf. Model. 61, 5804–5814 (2021).

  • Jablonka, K. M., Schwaller, P., Ortega-Guerrero, A. & Smit, B. Leveraging large language models for predictive chemistry. Preprint (2023).

  • Xu, F. F., Alon, U., Neubig, G. & Hellendoorn, V. J. A systematic evaluation of large language models of code. In Proc. 6th ACM SIGPLAN International Symposium on Machine Programming 1–10 (ACM, 2022).

  • Nijkamp, E. et al. CodeGen: an open large language model for code with multi-turn program synthesis. In Proc. 11th International Conference on Learning Representations (ICLR, 2022).

  • Kaplan, J. et al. Scaling laws for neural language models. Preprint (2020).

  • OpenAI. GPT-4 Technical Report (OpenAI, 2023).

  • Ziegler, D. M. et al. Fine-tuning language models from human preferences. Preprint (2019).

  • Ouyang, L. et al. Training language models to follow instructions with human feedback. In Advances in Neural Information Processing Systems 27730–27744 (NeurIPS, 2022).

  • Granda, J. M., Donina, L., Dragone, V., Long, D.-L. & Cronin, L. Controlling an organic synthesis robot with machine learning to search for new reactivity. Nature 559, 377–381 (2018).

  • Caramelli, D. et al. Discovering new chemistry with an autonomous robotic platform driven by a reactivity-seeking neural network. ACS Cent. Sci. 7, 1821–1830 (2021).

  • Angello, N. H. et al. Closed-loop optimization of general reaction conditions for heteroaryl Suzuki–Miyaura coupling. Science 378, 399–405 (2022).

  • Adamo, A. et al. On-demand continuous-flow production of pharmaceuticals in a compact, reconfigurable system. Science 352, 61–67 (2016).

  • Coley, C. W. et al. A robotic platform for flow synthesis of organic compounds informed by AI planning. Science 365, eaax1566 (2019).

  • Burger, B. et al. A mobile robotic chemist. Nature 583, 237–241 (2020).

  • Auto-GPT: the heart of the open-source agent ecosystem. GitHub (2023).

  • BabyAGI. GitHub (2023).

  • Chase, H. LangChain. GitHub (2023).

  • Bran, A. M., Cox, S., White, A. D. & Schwaller, P. ChemCrow: augmenting large-language models with chemistry tools. Preprint (2023).

  • Liu, P. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 55, 195 (2021).

  • Bai, Y. et al. Constitutional AI: harmlessness from AI feedback. Preprint (2022).

  • Falcon LLM. TII (2023).

  • Open LLM Leaderboard. Hugging Face (2023).

  • Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55, 248 (2023).

  • Reaxys (2023).

  • SciFinder (2023).

  • Yao, S. et al. ReAct: synergizing reasoning and acting in language models. In Proc. 11th International Conference on Learning Representations (ICLR, 2022).

  • Wei, J. et al. Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems 24824–24837 (NeurIPS, 2022).

  • Long, J. Large language model guided tree-of-thought. Preprint (2023).

  • Opentrons Python Protocol API. Opentrons (2023).

  • Tu, Z. et al. Approximate nearest neighbor search and lightweight dense vector reranking in multi-stage retrieval architectures. In Proc. 2020 ACM SIGIR on International Conference on Theory of Information Retrieval 97–100 (ACM, 2020).

  • Lin, J. et al. Pyserini: a Python toolkit for reproducible information retrieval research with sparse and dense representations. In Proc. 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2356–2362 (ACM, 2021).

  • Qadrud-Din, J. et al. Transformer based language models for similar text retrieval and ranking. Preprint (2020).

  • Paper QA. GitHub (2023).

  • Robertson, S. & Zaragoza, H. The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retrieval 3, 333–389 (2009).

  • Data mining. In Mining of Massive Datasets (Cambridge Univ. Press, 2011).

  • Johnson, J., Douze, M. & Jégou, H. Billion-scale similarity search with GPUs. IEEE Trans. Big Data 7, 535–547 (2021).

  • Vechtomova, O. & Wang, Y. A study of the effect of term proximity on query expansion. J. Inf. Sci. 32, 324–333 (2006).

  • Running experiments. Emerald Cloud Lab (2023).

  • Sanchez-Garcia, R. et al. CoPriNet: graph neural networks provide accurate and rapid compound price prediction for molecule prioritisation. Digital Discov. 2, 103–111 (2023).

  • Bubeck, S. et al. Sparks of artificial general intelligence: early experiments with GPT-4. Preprint (2023).

  • Ramos, M. C., Michtavy, S. S., Porosoff, M. D. & White, A. D. Bayesian optimization of catalysts with in-context learning. Preprint (2023).

  • Perera, D. et al. A platform for automated nanomole-scale reaction screening and micromole-scale synthesis in flow. Science 359, 429–434 (2018).

  • Ahneman, D. T., Estrada, J. G., Lin, S., Dreher, S. D. & Doyle, A. G. Predicting reaction performance in C–N cross-coupling using machine learning. Science 360, 186–190 (2018).

  • Hickman, R. et al. Atlas: a brain for self-driving laboratories. Preprint (2023).
