[Selected publications] [Research summary in one slide] [Google Scholar]

Journal (*equal contribution, #corresponding)

  1. A scalable sparse neural network framework for rare cell type annotation of single-cell transcriptome data.
    Y Cheng, X Fan, J Zhang, Y Li#. Communications Biology, 2023. [Full text]
  2. AcrNET: Predicting anti-CRISPR with Deep Learning.
    Y Li, Y Wei, S Xu, Q Tan, L Zong, J Wang, Y Wang, J Chen, L Hong, Y Li#. Bioinformatics, 2023. [Full text]
  3. Con-AAE: Contrastive Cycle Adversarial Autoencoders for Single-cell Multi-omics Alignment and Integration.
    X Wang, Z Hu, T Yu, Y Wang, R Wang, Y Wei, J Shu, J Ma, Y Li#. Bioinformatics, 2023. [Full text]
  4. The High-dimensional Space of Human Diseases Built from Diagnosis Records and Mapped to Genetic Loci.
    G Jia*, Y Li*, X Zhong, K Wang, M Pividori, R Alomairy, A Esposito, H Ltaief, C Terao, M Akiyama, K Matsuda, D Keyes, H Im, T Gojobori, Y Kamatani, M Kubo, N Cox, X Gao#, A Rzhetsky#. Nature Computational Science, 2023. [Full text]
  5. Deep autoencoder for interpretable tissue-adaptive deconvolution and cell-type-specific gene analysis.
    Y Chen*, Y Wang*, Y Chen, Y Cheng, Y Wei, Y Li, J Wang, Y Wei, TF Chan#, Y Li#. Nature Communications, 2022. [Full text]
  6. The highly conserved RNA-binding specificity of nucleocapsid protein facilitates the identification of drugs with broad anti-coronavirus activity.
    S Fan, W Sun, L Fan, N Wu, W Sun, H Ma, S Chen, Z Li, Y Li, J Zhang, J Yan. Computational and Structural Biotechnology Journal, 2022.
  7. Self-supervised contrastive learning for integrative single cell RNA-seq data analysis.
    W Han*, Y Cheng*, J Chen*, H Zhong, Z Hu, S Chen, L Zong, L Hong, TF Chan, I King, X Gao#, Y Li#. Briefings in Bioinformatics, 2022. [Full text]
  8. Deep learning identifies and quantifies recombination hotspot determinants.
    Y Li*,#, S Chen*, T Rapakoulia, H Kuwahara, KY Yip, X Gao#. Bioinformatics, 2022. [Full text]
  9. Protein-RNA interaction prediction with deep learning: Structure matters.
    J Wei*, S Chen*, L Zong*, X Gao#, Y Li#. Briefings in Bioinformatics, 2022. [Full text]
  10. DeepCellState: an autoencoder-based framework for predicting cell type specific transcriptional states induced by drug treatment.
    R Umarov, Y Li, E Arner. PLOS Computational Biology, 2021. [Full text]
  11. NERO: A Biomedical Named-entity (Recognition) Ontology with a Large, Annotated Corpus Reveals Meaningful Associations Through Text Embedding.
    K Wang, R Stevens, H Alachram, Y Li, L Soldatova, R King, S Ananiadou, M Li, F Christopoulou, J Ambite, S Garg, U Hermjakob, D Marcu, E Sheng, T Beibbarth, E Wingender, A Galstyan, X Gao, B Chambers, B Khomtchouk, J Evans, A Schoene, W Pan, J Mathew, A Rzhetsky. npj Systems Biology and Applications, 2021. [Full text]
  12. ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation.
    R Umarov, Y Li, T Arakawa, S Takizawa, X Gao, E Arner. PLOS Computational Biology, 2021. [Full text]
  13. Structural and functional studies of the pyroptosis-related human Pannexin1 channel.
    S Zhang, B Yuan, J Lam, J Zhou, X Zhou, G Mandujano, X Tian, Y Liu, R Han, Y Li, X Gao, M Li, M Yang. Cell Discovery, 2021.
  14. Lunar Features Detection for Energy Discovery via Deep Learning.
    S Chen*, Y Li*,#, T Zhang, X Zhu, S Sun#, X Gao#. Applied Energy, 2021.
  15. Disease Gene Prioritization with Privileged Information and Heteroscedastic Gaussian Dropout.
    J Shu, Y Li, S Wang, J Ma. Bioinformatics, 2021.
  16. HMD-ARG: hierarchical multi-task deep learning for annotating antibiotic resistance genes.
    Y Li*, Z Xu*, W Han*, H Cao, R Umarov, A Yan, M Fan, H Chen, L Li, P Ho, X Gao. Microbiome, 2021. [Full text]
  17. DeepSimulator1.5: a more powerful, quicker and lighter simulator for Nanopore sequencing.
    Y Li*, S Wang*, C Bi, Z Qiu, M Li, X Gao. Bioinformatics, 2020. [Code] [PDF]
  18. A Self-adaptive Deep Learning Algorithm for Accelerating Multi-component Flash Calculation.
    T Zhang, Y Li, Y Li, S Sun, X Gao. Computer Methods in Applied Mechanics and Engineering, 2020.
  19. Modern Deep Learning in Bioinformatics.
    H Li*, S Tian*, Y Li*, R Tan, Y Pan, C Huang, Y Xu, X Gao. Journal of Molecular Cell Biology, 2020.
  20. DeeReCT-APA: prediction of alternative polyadenylation site usage through deep learning.
    Z Li, Y Li, B Zhang, Y Li, Y Long, X Zou, M Zhang, Y Hu, W Chen, X Gao. Genomics, Proteomics & Bioinformatics (GPB), 2020.
  21. Long-read Individual-molecule Sequencing Reveals CRISPR-induced Genetic Heterogeneity in Human ESCs.
    C Bi, L Wang, B Yuan, X Zhou, Y Li, S Wang, Y Pang, X Gao, Y Huang, M Li. Genome Biology, 2020.
  22. A deep learning framework to predict binding preference of RNA constituents on protein surface.
    J Lam*, Y Li*, L Zhu, R Umarov, H Jiang, A Heliou, F Sheong, T Liu, Y Long, Y Li, L Fang, R Altman, W Chen, X Huang, X Gao. Nature Communications, 2019.
    [KAUST news] [Chinese introduction] [PDF] [Code] [Server]
  23. Estimating heritability and genetic correlations from large health datasets in the absence of genetic data.
    G Jia, Y Li, H Zhang, I Chattopadhyay, A Jensen, D Blair, L Davis, P Robinson, T Dahlén, S Brunak, M Benson, G Edgren, N Cox, X Gao, A Rzhetsky. Nature Communications, 2019. [PDF]
    [UChicago news] [Chinese introduction]
  24. Two symmetric Arginine residues play distinct roles in Thermus thermophilus Argonaute DNA guide strand-mediated DNA target cleavage.
    J Lei, G Sheng, P Cheung, S Wang, Y Li, X Gao, Y Zhang, Y Wang, X Huang. Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2019.
  25. Accelerating Flash Calculation through Deep Learning Methods.
    Y Li, T Zhang, S Sun, X Gao. Journal of Computational Physics, 2019. [PDF]
  26. Deep learning in bioinformatics: introduction, application, and perspective in big data era.
    Y Li, C Huang, L Ding, Z Li, Y Pan, X Gao. Methods, 2019. [PDF] [Code]
    Cover article of the Methods issue: Deep Learning in Bioinformatics
    Highly cited paper
  27. mlDEEPre: Multi-functional enzyme function prediction with hierarchical multi-label deep learning.
    Z Zou, S Tian, X Gao, Y Li#. Frontiers in Genetics, 2019. [PDF] [Server]
  28. Promoter analysis and prediction in the human genome using sequence-based deep learning models.
    R Umarov, H Kuwahara, Y Li, X Gao, V Solovyev. Bioinformatics, 2019. [PDF] [Code]
  29. H-NS uses an autoinhibitory conformational switch to achieve environment-controlled gene silencing.
    U Hameed, C Liao, A Radhakrishnan, F Huser, S Aljedani, X Zhao, A Momin, F Melo, X Guo, C Brooks, Y Li, X Cui, X Gao, J Ladury, L Jaremko, M Jaremko, J Li, S Arold. Nucleic Acids Research (NAR), 2018.
  30. DeeReCT-PolyA: a robust and generic deep learning method for PAS identification.
    Z Xia, Y Li, B Zhang, Z Li, Y Hu, W Chen, X Gao. Bioinformatics, 2018. [PDF] [Code]
  31. DeepSimulator: a deep simulator for nanopore sequencing.
    Y Li, R Han, C Bi, M Li, S Wang, X Gao. Bioinformatics, 2018. [PDF] [Code]
  32. DLBI: Deep learning guided Bayesian inference for structure reconstruction of super-resolution fluorescence microscopy.
    Y Li, F Xu, F Zhang, P Xu, M Fan, L Li, X Gao, R Han. Bioinformatics, 2018. [PDF] [Code]
  33. PredMP: a web server for de novo prediction and visualization of membrane proteins.
    S Wang, S Fei, Z Wang, Y Li, J Xu, F Zhao, X Gao. Bioinformatics, 2018. [PDF] [Server]
  34. An accurate and rapid continuous wavelet dynamic time warping algorithm for end-to-end mapping in ultra-long nanopore sequencing.
    R Han, Y Li, X Gao, S Wang. Bioinformatics, 2018. [PDF] [Code]
  35. DES-Mutation: System for Exploring Links of Mutations and Diseases.
    V Kordopati, A Salhi, R Razali, A Radovanovic, F Tifratene, M Uludag, Y Li, A Bokhari, A AlSaieedi, A Raies, C Neste, M Essack, V Bajic. Scientific Reports, 2018. [PDF] [Server]
  36. AuTom-dualx: a toolkit for fully automatic fiducial marker-based alignment of dual-axis tilt series with simultaneous reconstruction.
    R Han, X Wan, L Li, A Lawrence, P Yang, Y Li, S Wang, F Sun, Z Liu, X Gao, F Zhang. Bioinformatics, 2018.
  37. DEEPre: sequence-based enzyme EC number prediction by deep learning.
    Y Li, S Wang, R Umarov, B Xie, M Fan, L Li, X Gao. Bioinformatics, 2017. [PDF] [Server]
  38. Sequence2Vec: a novel embedding approach for modeling transcription factor binding affinity landscape.
    H Dai, R Umarov, H Kuwahara, Y Li, L Song, X Gao. Bioinformatics, 2017. [PDF] [Code]
  39. The dynamic multisite interactions between two intrinsically disordered proteins.
    S Wu, D Wang, J Liu, Y Feng, J Weng, Y Li, X Gao, J Liu, W Wang. Angewandte Chemie, 2017.
  40. Reward sensitivity predicts ice cream-related attentional bias assessed by inattentional blindness.
    X Li, Q Tao, Y Fang, C Cheng, Y Hao, J Qi, Y Li, W Zhang, Y Wang, X Zhang. Appetite, 2015.

Conference (*equal contribution, #corresponding)

  1. Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning.
    Z Hu, Q Yu, Y Guo, T Wang, I King, X Gao, L Song, Y Li. The 27th Annual International Conference on Research in Computational Molecular Biology (RECOMB-23)
  2. Understanding Dropout for Graph Neural Networks.
    J Shu, B Xi, Y Li, F Wu, C Kamhoua, J Ma. GraphLearning-2022.
  3. CLMB: deep contrastive learning for robust metagenomic binning.
    P Zhang, Z Jiang, Y Wang, Y Li. The 26th Annual International Conference on Research in Computational Molecular Biology (RECOMB-22). Preprint
  4. Contact-Distil: Boosting Low Homologous Protein Contact Map Prediction by Self-Supervised Distillation.
    Q Wang, J Chen, Y Zhou, Y Li, L Zheng, Z Li, S Cui. Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)
  5. DeepCURATER: Deep Learning for CoURse And Teaching Evaluation and Review.
    Z Hu, B Thumu, Y Qin, T Wong, Y Lu, Z Tao, O Kan, Y Li, I King. 2021 IEEE International Conference on Engineering, Technology & Education (TALE-21)
  6. Disease Gene Prioritization with Privileged Information and Heteroscedastic Gaussian Dropout.
    J Shu, Y Li, S Wang, J Ma. The Twenty-Ninth Conference on Intelligent Systems for Molecular Biology (ISMB-21)
  7. RNA Secondary Structure Prediction By Learning Unrolled Algorithms.
    X Chen*, Y Li*, R Umarov, X Gao, L Song. Eighth International Conference on Learning Representations (ICLR-20).
    Oral (Acceptance rate = 48/2599 = 1.85%)
    [GaTech news] [Chinese news] [Chinese introduction] [Plain explanation]
  8. Learning to Stop While Learning to Predict.
    X Chen, H Dai, Y Li, X Gao, L Song. Thirty-seventh International Conference on Machine Learning (ICML-20).
  9. Two Generator Game: Learning to Sample via Linear Goodness-of-Fit Test.
    L Ding, M Yu, L Liu, F Zhu, Y Liu, Y Li, L Shao. Thirty-third Conference on Neural Information Processing Systems (NeurIPS-19)
  10. Linear Kernel Tests via Empirical Likelihood for High Dimensional Data.
    L Ding, Z Liu, Y Li, S Liao, Y Liu, P Yang, G Yu, L Shao, X Gao. The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)
  11. Approximate Kernel Selection with Strong Approximate Consistency.
    L Ding, S Liao, Y Liu, Y Li, P Yang, Y Pan, C Huang, L Shao, X Gao. The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)
  12. DLBI: Deep learning guided Bayesian inference for structure reconstruction of super-resolution fluorescence microscopy.
    Y Li*, F Xu*, F Zhang, P Xu, M Fan, L Li, X Gao, R Han. The Twenty-Sixth Conference on Intelligent Systems for Molecular Biology (ISMB-18)
  13. An accurate and rapid continuous wavelet dynamic time warping algorithm for end-to-end mapping in ultra-long nanopore sequencing.
    R Han, Y Li, X Gao, S Wang. The Seventeenth European Conference on Computational Biology (ECCB-18)