AMP & MIC Predictor

Enter an amino acid sequence (10–100 residues) or upload a FASTA file. The tool will classify the peptide and — if antimicrobial — predict Minimum Inhibitory Concentrations against selected bacteria.

1. Input: Enter sequence
2. Analyse: Submit to model
3. Results: View prediction
Enter a valid sequence to begin.
Sequence Input
AMP Long peptide — 99 aa
MEKAALIFIGLLLFSTCTQILAQSCNNDSDCTNLKCATK…
AMP Short peptide — 41 aa
SLQGGAPNFPQPSQQNGGWQVSPDLGRDDKGNTRGQIEIQ
Non-AMP Long peptide — 61 aa
MKSLLPLAILAALAVAALCYESHESMESYEVSPFTTRR…
Non-AMP Short peptide — 41 aa
MKPLKQKVSITLDEDVIKNLKTLAEECDRSLSQYINLILK
Standard AA characters only: ACDEFGHIKLMNPQRSTVWY
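The input rules above (standard alphabet, 10 to 100 residues) can be sketched as a small check. This is a minimal illustration, not the tool's actual code: the function name `validate_sequence` is hypothetical, and stripping whitespace and upper-casing the input are assumptions about how pasted text might be cleaned.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
MIN_LEN, MAX_LEN = 10, 100


def validate_sequence(raw: str) -> str:
    """Return the cleaned sequence, or raise ValueError if it breaks a rule."""
    # Assumption: whitespace is ignored and lowercase letters are accepted.
    seq = "".join(raw.split()).upper()
    bad = sorted(set(seq) - VALID_RESIDUES)
    if bad:
        raise ValueError(f"Invalid characters: {''.join(bad)}")
    if not MIN_LEN <= len(seq) <= MAX_LEN:
        raise ValueError(f"Length {len(seq)} is outside {MIN_LEN}-{MAX_LEN} aa")
    return seq
```

A sequence containing X, or one shorter than 10 residues, raises `ValueError` with a message naming the problem.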
Amino Acid Property Composition
Hydrophobic (AVILMFW)
Cationic (KRH)
Anionic (DE)
Polar (STNQ)
Special (CGP)
Aromatic (YW)
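As a rough illustration of how the composition bar could be computed from the six groups listed above, the sketch below returns the fraction of residues in each group. The function name `property_composition` is hypothetical. Because W appears in both the Hydrophobic and Aromatic groups, the fractions can sum to more than 1.

```python
PROPERTY_GROUPS = {
    "Hydrophobic": "AVILMFW",
    "Cationic": "KRH",
    "Anionic": "DE",
    "Polar": "STNQ",
    "Special": "CGP",
    "Aromatic": "YW",
}


def property_composition(seq: str) -> dict:
    """Fraction of residues in each property group (groups may overlap)."""
    n = len(seq)
    if n == 0:
        return {name: 0.0 for name in PROPERTY_GROUPS}
    return {
        name: sum(seq.count(aa) for aa in members) / n
        for name, members in PROPERTY_GROUPS.items()
    }
```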
 Sequence Property Viewer — hover a residue for details
Select Bacteria for MIC Prediction
Only available when classified as AMP
MIC prediction is performed only for sequences classified as Antimicrobial Peptides (AMPs). Select one or more target organisms below.
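The gating described above, where MIC regression runs only after an AMP classification, might look like the following sketch. Everything here is hypothetical: `analyse`, the `classify` callable, and the per-organism `mic_models` mapping stand in for the real models, whose interfaces are not documented on this page.

```python
from typing import Callable, Dict, Iterable, Tuple


def analyse(
    seq: str,
    classify: Callable[[str], Tuple[str, float]],
    mic_models: Dict[str, Callable[[str], float]],
    selected: Iterable[str],
) -> dict:
    """Classify first; predict MIC only for sequences labelled 'AMP'."""
    label, confidence = classify(seq)
    result = {"label": label, "confidence": confidence, "mic": None}
    if label == "AMP":
        # One regressor per selected organism (hypothetical interface).
        result["mic"] = {name: mic_models[name](seq) for name in selected}
    return result
```

For a Non-AMP result the `mic` field stays `None`, mirroring the panel that is only available for AMPs.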
Submit Analysis
Est. processing time: ~30 seconds
Ctrl+Enter to submit
 Results Dashboard
Classification
Awaiting input…
Confidence Score
Model confidence
Predicted MIC Values (µM) — selected bacteria
MIC results will appear here after analysis. Select bacteria above before submitting.
Detailed Report
The downloadable PDF report will appear here.
Classifier Performance Metrics

Metrics obtained on a held-out test set, never seen during model training or hyperparameter tuning.

Metric               Value  Description
Accuracy             0.963  Proportion of correctly classified sequences (AMP & Non-AMP).
Precision            0.964  Of all predicted AMPs, the fraction that are truly AMPs (fewer false positives).
Recall               0.963  Of all true AMPs, the fraction correctly identified (fewer false negatives).
F1-Score             0.963  Harmonic mean of precision and recall; a balanced performance indicator.
Validation Accuracy  0.968  Accuracy on the validation split used during model development.
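For readers who want the definitions in executable form, the sketch below derives the four test-set metrics from confusion-matrix counts. The counts in the usage note are made up for illustration, and the page does not say whether its reported values are per-class or averaged, so this shows only the standard binary definitions with AMP as the positive class.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary metrics; AMP is treated as the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

With hypothetical counts `tp=8, fp=2, fn=2, tn=8`, all four metrics come out at 0.8.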
MIC Regressor Metrics (per bacterium)

Separate regression models predict MIC for each organism. Performance is evaluated via MSE (log-scale), R², Pearson correlation, and Kendall's tau.

Bacterium      MSE (log)  MSE     R²      MAE     Pearson  Kendall
E. coli        0.0481     0.4864  0.7023  0.1375  0.8394   0.6725
P. aeruginosa  0.0517     0.5227  0.6864  0.1233  0.8311   0.6922
S. aureus      0.0517     0.4988  0.6828  0.1472  0.8278   0.6536
K. pneumoniae  0.0538     0.4292  0.7416  0.1479  0.8693   0.7194
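The regression metrics named above can be computed with nothing beyond the standard library, as in the sketch below. Two assumptions are worth flagging: the page does not state which Kendall variant is reported, so the tie-ignoring tau-a is shown, and it does not state the log base used for the log-scale MSE, so the inputs here are simply whatever scale the caller supplies.

```python
import math


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kendall_tau_a(x, y):
    """Tau-a: (concordant - discordant) pairs over all pairs; ties add 0."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)
```

Pearson and Kendall measure how well the predicted ranking tracks the measured one, while MSE, MAE, and R² measure the size of the errors, which is why the table reports both kinds.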
Model Interpretability — SHAP & LIME

SHAP (SHapley Additive exPlanations) quantifies each feature's global contribution across all predictions. LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions and appears in the downloadable PDF report.
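One common way to turn per-prediction SHAP values into a global ranking is to average their absolute values per feature. The sketch below shows that summary on a made-up matrix; the function name `global_importance` is hypothetical, and whether the plot on this page uses exactly this aggregation is an assumption.

```python
def global_importance(shap_values, feature_names):
    """Rank features by mean absolute SHAP value across all samples.

    shap_values: list of rows, one row of per-feature values per sample.
    """
    n = len(shap_values)
    means = [
        sum(abs(row[j]) for row in shap_values) / n
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, means), key=lambda kv: kv[1], reverse=True)


# Illustrative numbers only; these are not values from the real model.
ranking = global_importance(
    [[0.2, -0.1, 0.5], [-0.4, 0.1, 0.3]],
    ["APAAC13", "ChargeD2001", "HydrophobicityD3001"],
)
```

Taking absolute values matters: a feature that pushes some predictions up and others down would otherwise average out to near zero despite being influential.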

Global SHAP Feature Importance
[Figure: Global SHAP Feature Importance Plot]
Interpretation

The model's AMP predictions are driven by a combination of sequence-based, structural, and biophysical descriptors:

A. Sequence-Based Features

APAAC13 & APAAC5 — Amphiphilic Pseudo-Amino Acid Composition
  • Encode hydrophobicity, charge, and side-chain properties. Higher values positively influence AMP classification, reflecting the amphiphilic nature essential for membrane disruption.
Amino Acid Composition (M, C)
  • Methionine (M): Associated with structural stability; positive SHAP impact.
  • Cysteine (C): Forms disulfide bonds stabilising defensin-like structures; high content positively predicts AMP activity.

B. Structural & Biophysical Features

  • HydrophobicityD3001: Critical feature — more hydrophobic peptides strongly favoured, consistent with membrane insertion mechanisms.
  • PolarityD1001: Balances hydrophobicity to maintain membrane solubility and interaction.
  • SolventAccessibilityD3001: Exposed residues positively contribute, facilitating membrane contact.
  • ChargeD2001: Net positive charge (cationic AMPs) strongly predicts activity against negatively-charged bacterial membranes.
  • PolarizabilityD3001 & NormalizedVDWVD3001: Influence membrane penetration and steric fit.

C. Geary Autocorrelation Descriptors

  • GearyAuto_Hydrophobicity30: Clustering of hydrophobic residues at lag 30 — reflects amphipathic helix formation.
  • GearyAuto_Steric30 & 29: Backbone flexibility at spatial lags 29–30; moderate flexibility aids interaction with diverse membrane compositions.
  • GearyAuto_ResidueASA30: Consistent pattern of residue surface exposure at lag 30 improves bacterial targeting.
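To make the lag-based descriptors above more concrete, the sketch below computes Geary's autocorrelation in its usual form: the mean squared difference between residues `lag` positions apart, divided by the sample variance of the property over the whole sequence. The Kyte-Doolittle hydropathy scale is used purely as an illustrative assumption; the property scales behind the tool's own features are not stated here.

```python
# Kyte-Doolittle hydropathy values (assumed scale, for illustration only).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def geary_autocorrelation(seq, lag, scale=KYTE_DOOLITTLE):
    """Geary's C at the given lag for one residue property."""
    n = len(seq)
    if not 0 < lag < n:
        raise ValueError("lag must be at least 1 and shorter than the sequence")
    values = [scale[aa] for aa in seq]
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    if variance == 0:
        raise ValueError("property is constant along the sequence")
    pair_term = sum(
        (values[i] - values[i + lag]) ** 2 for i in range(n - lag)
    ) / (2 * (n - lag))
    return pair_term / variance
```

Values below 1 indicate that residues `lag` positions apart tend to be similar, and values above 1 that they tend to differ. Note that a lag of 30 is only defined for sequences longer than 30 residues; how shorter inputs are handled is not described on this page.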
Sample Test Sequences
#  Description    Expected  Sequence (truncated)
1  Long (99 aa)   P-AMP     MEKAALIFIGLLLFSTCTQIL…
2  Long (99 aa)   Non-AMP   MKSLLPLAILAALAVAALCYE…
3  Short (51 aa)  P-AMP     SLQGGAPNFPQPSQQNGGRWQ…
4  Short (50 aa)  Non-AMP   MKPLKQKVSITLDEDVIKNL…
5  Invalid chars  Rejected  MEKAALIFIG(XX)…
About EPIC-AMP

This web application provides a streamlined interface for classifying amino acid sequences as Antimicrobial Peptides (AMPs) or Non-AMPs, and for predicting the Minimum Inhibitory Concentration (MIC) of potential AMPs against clinically relevant bacteria. AMPs are key components of the innate immune system and represent a promising avenue for combating drug-resistant pathogens.

Model Selection Criteria

Over 225 combinations of feature extraction and selection methods were evaluated across four machine learning architectures for each target organism. The final models were selected based on:

  • High accuracy and F1-score on a held-out test set, together with high accuracy on the validation split.
  • Robustness to sequence length variation within the 10–100 aa range.
  • Generalisation across diverse AMP families and taxonomic origins.
  • Regression capability assessed by MSE, R², Pearson correlation, and Kendall's tau.
Intended Use

This tool is intended for research and educational purposes. It provides computational predictions to guide experimental work but does not replace laboratory validation. Predictions should be interpreted in the context of the reported model metrics.

Our Team
Ali Abdalhalim
Computational Biologist
Ahmed Amr
Computational Biologist
Prof. Eman Badr
Full Professor & Director of BCBU
Acknowledgements
  • Bioinformatics and Computational Biology Unit (BCBU), Zewail City
  • The Centre for Genomics, Zewail City
Contact

For questions, collaboration inquiries, or feedback: epicamp.sup@gmail.com

How to Use EPIC-AMP
  1. Prepare your sequence

     Ensure your peptide sequence uses only standard amino acid single-letter codes (ACDEFGHIKLMNPQRSTVWY). Length must be between 10 and 100 residues. For FASTA format, ensure the file has a > header line followed by the sequence.

  2. Enter or upload

     Type or paste directly into the text area, or use the file picker to upload a .fasta, .fa, or .fna file. As you type, the Sequence Property Viewer colours each residue by its biophysical properties and the composition bar updates in real time.

  3. Select target bacteria (optional)

     If you want MIC predictions, tick one or more bacteria in the selection panel. These checkboxes activate automatically once a valid sequence is entered. MIC prediction only runs if the peptide is classified as an AMP.

  4. Submit the analysis

     Enter your email address and click Submit Analysis. The button displays the elapsed processing time. Analysis typically completes in about 30 seconds.

  5. Interpret the Results Dashboard

     The Classification panel shows AMP or Non-AMP. The Confidence Gauge displays model certainty. The MIC Chart shows bar-chart predictions (µM) for the selected organisms. Download the full PDF report, which includes the LIME explanation and the global SHAP plot.

  6. Clear and repeat

     Click Clear to reset all fields and results before analysing a new sequence.
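For steps 1 and 2, a FASTA upload can be reduced to a header and a sequence roughly as follows. This is a minimal sketch under stated assumptions: the function name `parse_fasta` is hypothetical, only a single record is accepted, and the sequence is upper-cased; the tool's own parser may behave differently.

```python
def parse_fasta(text: str):
    """Return (header, sequence) from single-record FASTA text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    header = lines[0][1:].strip()
    body = lines[1:]
    # Assumption: multi-record files are rejected rather than split.
    if any(ln.startswith(">") for ln in body):
        raise ValueError("only one record is supported in this sketch")
    seq = "".join(body).upper()
    if not seq:
        raise ValueError("no sequence found after the header line")
    return header, seq
```

The returned sequence would still need the alphabet and length checks from step 1 before submission.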

Troubleshooting
Issue                     Likely Cause                                              Solution
Invalid characters error  Non-standard AA characters (B, J, O, U, X, Z or symbols)  Remove or replace with valid residues
Length out of range       Sequence <10 or >100 characters                           Trim or extend to within 10–100 aa
FASTA parse error         Malformed FASTA file                                      Ensure file starts with a > header line and contains only the sequence on subsequent lines
Prediction timeout        HuggingFace Space may be cold-starting                    Wait ~60s and retry; Space auto-resumes
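For scripted use, the wait-and-retry advice for timeouts could be wrapped as below. This is only a sketch: `submit_with_retry` and the `submit` callable are hypothetical, the page documents no programmatic API, and treating a cold start as a `TimeoutError` is an assumption.

```python
import time


def submit_with_retry(submit, retries=3, wait_seconds=60.0, sleep=time.sleep):
    """Call submit(); on TimeoutError wait and try again, up to `retries` times."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return submit()
        except TimeoutError as exc:
            last_error = exc
            # No point waiting after the final failed attempt.
            if attempt < retries - 1:
                sleep(wait_seconds)
    raise last_error
```

The `sleep` parameter exists so the wait can be replaced in tests; in normal use the default `time.sleep` applies the roughly 60 second pause suggested in the table.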