🎓 About Me
I am an undergraduate researcher at Jiangnan University (Project 211 & Double First-Class). My work centres on AI for Science—especially biological sequence modelling, single-cell foundation models, protein/peptide design, and large-language-model reasoning.
I have authored or co-authored 7 papers (6 as first or co-first author; 6 accepted) submitted to or accepted at NeurIPS, ICML, PRCV, and ICIC, filed 1 invention patent, and won Gold at iGEM 2025 as the dry-lab lead.
I am actively seeking research opportunities (RA, summer internship, or graduate study) in AI for Life Sciences. Please feel free to reach out.
🌟 Research Vision
Biology speaks in sequences; AI is learning to listen. I am driven by two complementary pursuits:
- Decoding biological sequences. Designing information-theoretic, interpretable foundation models for proteins, peptides, single-cell transcriptomes and beyond.
- Reasoning machines for science. Building LLM-driven agents that fuse symbolic reasoning with deep representation learning to solve concrete scientific design problems.
"We must know — we will know."
📰 News
- 2026 RA-Det (5th author) accepted to ICML 2026 (CCF-A).
- 2026 Tokenization is Mechanism submitted to NeurIPS 2026 (CCF-A) as first author.
- 2026 Four first-author papers accepted as Orals at ICIC 2026: ProtoGene · Extract-Then-Compile · Alignment-Adaptive Fusion · FWMamba-UNet.
- 2025-11 iGEM 2025 Gold Medal awarded in Paris; presented as dry-lab lead.
- 2025 MambaGuard (co-first author) accepted to PRCV 2025 (CCF-C).
- 2025 Invention patent filed: "A Neuro-Symbolic Method & Apparatus for Language-Driven Travel Planning."
- 2025 Awarded Outstanding Student Cadre of Wuxi City.
🏫 Education
Jiangnan University (Project 211 · Double First-Class)
B.Eng. in Digital Media Technology, School of AI & Computer Science
Sep 2023 — Jun 2027 | GPA: 88 / 100
Honours: Jiangnan University Honor Student (至善生), 1st-class Comprehensive Scholarship, Outstanding Student Cadre (校级), Outstanding Student Cadre of Wuxi City (市级).
📄 Publications
1 denotes first author · * denotes co-first author
Tokenization is Mechanism (NeurIPS 2026, submitted): pMHC binding prediction · information-theoretic token merging · interpretable mechanism · cross-domain biological sequence modelling (pMHC / TCR / DNA / SMILES).
ProtoGene (ICIC 2026, Oral): Single-cell foundation model · prototypical contrastive fine-tuning · generalisation across sequencing protocols.
Extract-Then-Compile (ICIC 2026, Oral): LLM agent · neuro-symbolic reasoning · constraint-satisfaction planning & solving.
Alignment-Adaptive Fusion (ICIC 2026, Oral): Vision-language model · multimodal fusion · medical image segmentation.
FWMamba-UNet (ICIC 2026, Oral): Mamba / state-space model · wavelet transform · medical image segmentation.
MambaGuard (PRCV 2025): AI-generated image detection · vision-language model · Mamba.
RA-Det (ICML 2026): Universal detection of AI-generated images via robustness-asymmetry.
🧪 Featured Projects
FWMamba-UNet — Frequency-Wavelet Enhanced Mamba UNet
A medical image segmentation network that augments the Mamba state-space backbone with frequency-domain and wavelet-transform branches. The frequency-wavelet enhancement captures cross-scale boundary cues that pure spatial-domain UNets miss, yielding consistent Dice / HD95 gains across multi-organ and pathology datasets while keeping the linear-time complexity of Mamba.
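To illustrate why a wavelet branch can expose boundary cues, here is a minimal sketch of a one-level 2D Haar decomposition in plain Python. It is not the FWMamba-UNet implementation: the function name `haar_dwt2`, the sub-band names, and the toy image are assumptions made for this example only.

```python
def haar_dwt2(image):
    """One-level 2D Haar transform of a grid with even height and width."""
    approx, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, len(image), 2):
        bands = ([], [], [], [])
        for j in range(0, len(image[0]), 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            bands[0].append((a + b + c + d) / 2)  # scaled local sum (low frequency)
            bands[1].append((a - b + c - d) / 2)  # left minus right: vertical edges
            bands[2].append((a + b - c - d) / 2)  # top minus bottom: horizontal edges
            bands[3].append((a - b - c + d) / 2)  # diagonal detail
        approx.append(bands[0])
        col_diff.append(bands[1])
        row_diff.append(bands[2])
        diag.append(bands[3])
    return approx, col_diff, row_diff, diag


# Toy 4x4 image (hypothetical data) with a vertical edge after the first column.
image = [[0, 1, 1, 1] for _ in range(4)]
approx, col_diff, row_diff, diag = haar_dwt2(image)
```

On this toy input only `col_diff` is non-zero, and only in the blocks that contain the edge, which is the kind of localized boundary signal a spatial-only branch has to learn implicitly.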
AMP Forge — Antimicrobial Peptide De-Novo Design
A general AMP design platform tackling antibiotic resistance via the human peptide LL-37 expressed in S. cerevisiae. Built the largest curated AMP corpus and a three-stage pipeline: ESM-2 / ProtT5 / Ankh + BiGRU-VAE latent space → Latent Diffusion → Transformer decoder, trained MLE → RL adversarial → diffusion fine-tuning. Supports six generation modes and achieves SOTA on multiple metrics; generated variants outperformed the wild type in wet-lab assays.
ProtoGene — Single-Cell Foundation Model Fine-Tuning
A biology-aware prototypical fine-tuning framework that closes the sequencing-technology gap between scRNA-seq protocols, improving the generalisation of pretrained single-cell foundation models on unseen datasets.
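The general idea behind prototype-based methods can be sketched as follows: each class is summarized by the mean of its embeddings, and a new cell is assigned to the nearest prototype. This is a generic illustration, not ProtoGene itself; the function names, the 2-D embeddings, and the cell-type labels are hypothetical, and the framework's biology-aware components and training loss are not shown.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Return the mean embedding (prototype) of each class."""
    sums, counts = {}, defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def nearest_prototype(vec, prototypes):
    """Assign a vector to the class whose prototype is closest (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(vec, prototypes[label]))


# Hypothetical 2-D cell embeddings with cell-type labels.
embeddings = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["T cell", "T cell", "B cell", "B cell"]
prototypes = class_prototypes(embeddings, labels)
prediction = nearest_prototype([1.0, 1.0], prototypes)
```

Because prediction depends only on distances to class means, the classifier itself has no learned weights tied to a particular protocol; any protocol shift has to be handled in the embedding space.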
Tokenization-as-Mechanism
Information-asymmetric token merging that turns the tokenizer itself into an interpretable mechanism for biological sequence learning. Tested across pMHC, TCR, DNA, and SMILES with consistent gains and interpretable merging trees.
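As a rough illustration of information-guided token merging, the sketch below greedily picks the adjacent token pair with the highest pointwise mutual information (PMI) and merges it, in the style of BPE. PMI is a stand-in chosen for this example; the project's actual information-asymmetric criterion is not specified here, and the function names and toy sequences are assumptions.

```python
import math
from collections import Counter


def best_pair_by_pmi(sequences):
    """Return the adjacent token pair with the highest PMI in the corpus."""
    unigrams = Counter(tok for seq in sequences for tok in seq)
    pairs = Counter(pair for seq in sequences for pair in zip(seq, seq[1:]))
    n_tok, n_pair = sum(unigrams.values()), sum(pairs.values())

    def pmi(pair):
        a, b = pair
        p_ab = pairs[pair] / n_pair
        return math.log(p_ab / ((unigrams[a] / n_tok) * (unigrams[b] / n_tok)))

    return max(pairs, key=pmi)


def merge_pair(seq, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged token."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


# Toy corpus (hypothetical): two short sequences over the alphabet {A, B, C}.
corpus = [list("ABAB"), list("ABCC")]
pair = best_pair_by_pmi(corpus)
merged = [merge_pair(seq, pair) for seq in corpus]
```

Repeating the two steps and recording which tokens were joined at each round yields a binary merge tree per token, which is one way such merges can be inspected after training.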
SHINE — Neuro-Symbolic Travel Planner
Extract-Then-Compile: an LLM agent that lifts natural-language constraints into a symbolic program, then solves it reliably via classical search. Achieves a strong pass rate on the ChinaTravel benchmark; the method is covered by a filed invention patent.
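The "compile, then search" half of this idea can be sketched with a small backtracking solver. This is a minimal illustration, not SHINE's implementation: the LLM extraction step is replaced by hand-written constraints, and the variable names, venues, and prices are hypothetical.

```python
def consistent(assignment, constraints):
    """Check every constraint whose variables are all assigned."""
    for scope, predicate in constraints:
        if all(var in assignment for var in scope):
            if not predicate(*(assignment[var] for var in scope)):
                return False
    return True


def solve(variables, domains, constraints, assignment=None):
    """Depth-first backtracking search; returns one solution or None."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return dict(assignment)
    var = variables[len(assignment)]
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment, constraints):
            result = solve(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]  # undo and try the next candidate
    return None


# Hypothetical request: "a vegetarian dinner, total cost at most 500".
variables = ["hotel", "dinner"]
domains = {
    "hotel": [("Lakeview", 450), ("Garden Inn", 300)],
    "dinner": [("Steakhouse", 150), ("Veggie Bistro", 120)],
}
constraints = [
    (("dinner",), lambda d: d[0] == "Veggie Bistro"),
    (("hotel", "dinner"), lambda h, d: h[1] + d[1] <= 500),
]
plan = solve(variables, domains, constraints)
```

Once constraints are explicit predicates, feasibility is decided by the solver rather than by the language model, so a returned plan satisfies every stated constraint by construction and an unsatisfiable request returns `None`.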
📬 Contact
+86 176-1462-8870
Wuxi, Jiangsu, China
I am open to RA / internship / graduate-study collaborations in AI for Life Sciences.