AMP & MIC Predictor

Enter an amino acid sequence (10–100 residues) or upload a FASTA file. The tool classifies the peptide as AMP or Non-AMP and, if it is classified as an AMP, predicts Minimum Inhibitory Concentrations (MICs) against the selected bacteria.

Input

Enter sequence

Analyse

Submit to model

Results

View prediction

Enter a valid sequence to begin.

Sequence Input

Standard AA characters only: ACDEFGHIKLMNPQRSTVWY

0 / 100 aa

Amino Acid Property Composition

Hydrophobic (AVILMFW)

Cationic (KRH)

Anionic (DE)

Polar (STNQ)

Special (CGP)

Aromatic (YW)
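The composition bar could be computed along the following lines. This is a minimal sketch, not the application's actual code: the function name `property_composition` and the example peptide are hypothetical, and the groups are copied from the panel labels above. Because W is listed under both Hydrophobic and Aromatic, the percentages can sum to more than 100.

```python
from collections import Counter

# Property groups exactly as labelled in the panel above.
# W appears in two groups, so group percentages may overlap.
PROPERTY_GROUPS = {
    "Hydrophobic": "AVILMFW",
    "Cationic": "KRH",
    "Anionic": "DE",
    "Polar": "STNQ",
    "Special": "CGP",
    "Aromatic": "YW",
}


def property_composition(sequence: str) -> dict[str, float]:
    """Return the percentage of residues that fall in each property group."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return {
        name: 100.0 * sum(counts[aa] for aa in members) / len(seq)
        for name, members in PROPERTY_GROUPS.items()
    }


# Hypothetical 10-residue peptide used only for illustration.
composition = property_composition("KKWLLDEGSY")
```

Here `composition["Hydrophobic"]` counts W, L and L (3 of 10 residues), while `composition["Aromatic"]` counts W and Y, so W contributes to both bars.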

Sequence Property Viewer — hover a residue for details

Select Bacteria for MIC Prediction

Only available when classified as AMP

MIC prediction is performed only for sequences classified as Antimicrobial Peptides (AMPs). Select one or more target organisms below.

E. coli

P. aeruginosa

S. aureus

K. pneumoniae
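The gating rule stated above (MIC prediction runs only when the sequence is classified as an AMP) can be sketched as below. The `analyse` function and the two stub callables are hypothetical placeholders for the real classifier and per-organism regressors, whose interfaces are not given in the source.

```python
ORGANISMS = ("E. coli", "P. aeruginosa", "S. aureus", "K. pneumoniae")


def analyse(sequence, selected, classify, predict_mic):
    """Classify a peptide, then predict MIC only if it is labelled 'AMP'.

    classify(sequence) -> (label, confidence)   # hypothetical interface
    predict_mic(sequence, organism) -> float    # hypothetical interface, µM
    """
    unknown = [org for org in selected if org not in ORGANISMS]
    if unknown:
        raise ValueError(f"unsupported organisms: {unknown}")
    label, confidence = classify(sequence)
    result = {"label": label, "confidence": confidence, "mic_uM": None}
    if label == "AMP" and selected:
        result["mic_uM"] = {org: predict_mic(sequence, org) for org in selected}
    return result


# Stub models for illustration only; the values are made up.
amp_result = analyse("KKWLLKKWLLKK", ["E. coli"],
                     classify=lambda s: ("AMP", 0.9),
                     predict_mic=lambda s, org: 12.5)
non_amp_result = analyse("KKWLLKKWLLKK", ["E. coli"],
                         classify=lambda s: ("Non-AMP", 0.8),
                         predict_mic=lambda s, org: 12.5)
```

With the Non-AMP stub, `mic_uM` stays `None` even though an organism was selected, which mirrors the behaviour described in the panel.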

Submit Analysis

Email address *

Est. processing time: ~30 seconds

Ctrl+Enter to submit

Results Dashboard

Classification

Awaiting input…

Confidence Score

—

Model confidence

Predicted MIC Values (µM) — selected bacteria

MIC results will appear here after analysis. Select bacteria above before submitting.

Detailed Report

The downloadable PDF report will appear here.

MLP Classifier Performance Metrics

Architecture: Multi-Layer Perceptron (MLP)

Best-trial results from MLP hyperparameter search (trial #49) on AAC + CTD features with RFE selection. Metrics reflect training and validation performance of the final selected model.

Metric	Value	Description
Training Accuracy	0.9941	Proportion of correctly classified sequences on the training set.
Validation Accuracy	0.9682	Accuracy on the held-out validation split used during model development.
Training Loss	0.0412	Binary cross-entropy loss on the training set (lower is better).
Validation Loss	0.1261	Binary cross-entropy loss on the validation set.
Loss Gap	0.0849	Validation loss minus training loss (0.1261 − 0.0412); a small gap indicates limited overfitting.
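For readers who want to see how the loss rows relate, the sketch below shows a plain-Python binary cross-entropy and the loss-gap arithmetic from the table. The function name and the clipping constant `eps` are illustrative assumptions; the source does not show how the loss was actually computed.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy for labels in {0, 1} and predicted P(AMP)."""
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# An uninformative prediction of 0.5 gives a loss of ln(2) per sample.
coin_flip_loss = binary_cross_entropy([1, 0], [0.5, 0.5])

# Loss gap from the table: validation loss minus training loss.
training_loss, validation_loss = 0.0412, 0.1261
loss_gap = round(validation_loss - training_loss, 4)
```

The rounded `loss_gap` reproduces the 0.0849 reported in the table.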

Architecture: 1 hidden layer · 256 units · ReLU activation · L2 regularisation (1e-5) · Dropout 0.3 · Learning rate 1e-3

Regression Performance Metrics (MIC)

Per-organism MIC regression

Separate regression models predict MIC for each organism. Performance is evaluated via MSE (reported both on the log scale and as plain MSE), MAE, R², Pearson correlation, and Kendall's tau.

Bacterium	MSE (log)	MSE	R²	MAE	Pearson	Kendall
E. coli	0.0481	0.4864	0.7023	0.1375	0.8394	0.6725
P. aeruginosa	0.0517	0.5227	0.6864	0.1233	0.8311	0.6922
S. aureus	0.0517	0.4988	0.6828	0.1472	0.8278	0.6536
K. pneumoniae	0.0538	0.4292	0.7416	0.1479	0.8693	0.7194
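The metrics in the table can be reproduced from paired observed and predicted values with a few lines of standard-library Python. This is a sketch under assumptions: the function names are hypothetical, the toy data are invented, Kendall's tau is computed as the tau-b variant, and the source does not say which variant or which logarithm base the reported values use.

```python
import math


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kendall_tau_b(x, y):
    """Kendall's tau-b via pair counting (O(n^2), fine for small samples)."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: counted nowhere
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x)
                      * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom


# Invented toy values standing in for log-transformed MICs.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 4.0, 3.0]
metrics = {
    "mse": mse(observed, predicted),
    "mae": mae(observed, predicted),
    "r2": r_squared(observed, predicted),
    "pearson": pearson(observed, predicted),
    "kendall": kendall_tau_b(observed, predicted),
}
```

Pearson measures linear agreement and Kendall's tau measures rank agreement, so the two can differ when predictions preserve the ordering of peptides but not the spacing between them.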

Model Interpretability — SHAP & LIME

SHAP (SHapley Additive exPlanations) quantifies each feature's global contribution across all predictions. LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions and appears in the downloadable PDF report.
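To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a tiny toy model by averaging each feature's marginal contribution over all feature orderings, then aggregates them into a global importance score (mean absolute value per feature). It is an illustration only: the toy linear model, feature count, and function names are hypothetical, and practical SHAP tooling relies on approximations rather than this exhaustive enumeration, which is only feasible for a handful of features.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values for one input, relative to a baseline input."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]          # switch feature i from baseline to x
            value = model(current)
            phi[i] += value - previous  # marginal contribution of feature i
            previous = value
    return [total / len(orderings) for total in phi]


def global_importance(model, rows, baseline):
    """Mean absolute Shapley value per feature across several inputs."""
    per_row = [shapley_values(model, row, baseline) for row in rows]
    return [sum(abs(r[i]) for r in per_row) / len(per_row)
            for i in range(len(baseline))]


# Hypothetical toy model with three features.
def toy_model(features):
    return 2.0 * features[0] - 1.0 * features[1] + 0.5 * features[2]


baseline = [0.0, 0.0, 0.0]
local = shapley_values(toy_model, [1.0, 1.0, 1.0], baseline)
importance = global_importance(toy_model, [[1.0, 1.0, 1.0], [2.0, 0.0, -2.0]], baseline)
```

For a linear model each Shapley value reduces to the weight times the feature's offset from the baseline, and the values for one input sum to the difference between the model output and the baseline output.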

Global SHAP Feature Importance

Interpretation

The model's AMP predictions are driven by a combination of sequence-based, structural, and biophysical descriptors:

A. Sequence-Based Features

APAAC13 & APAAC5 — Amphiphilic Pseudo-Amino Acid Composition

Encode hydrophobicity, charge, and side-chain properties. Higher values positively influence AMP classification, reflecting the amphiphilic nature essential for membrane disruption.

Amino Acid Composition (M, C)

Methionine (M): Associated with structural stability; positive SHAP impact.
Cysteine (C): Forms disulfide bonds stabilising defensin-like structures; high content positively predicts AMP activity.

B. Structural & Biophysical Features

HydrophobicityD3001: Critical feature; more hydrophobic peptides are strongly favoured, consistent with membrane insertion mechanisms.
PolarityD1001: Balances hydrophobicity to maintain membrane solubility and interaction.
SolventAccessibilityD3001: Exposed residues positively contribute, facilitating membrane contact.
ChargeD2001: Net positive charge (cationic AMPs) strongly predicts activity against negatively-charged bacterial membranes.
PolarizabilityD3001 & NormalizedVDWVD3001: Influence membrane penetration and steric fit.
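The source does not define the feature-name suffixes. If they follow the naming used by common CTD (Composition, Transition, Distribution) implementations, which is an assumption, then "D3001" would denote the Distribution descriptor for class 3 of a property: the position of the first class-3 residue as a percentage of sequence length. The sketch below illustrates that reading for hydrophobicity. The three-class grouping is the one widely used for CTD hydrophobicity, the function name is hypothetical, and the example peptide is invented.

```python
import math

# Widely used CTD hydrophobicity classes (assumed, not taken from the source):
# 1 = polar, 2 = neutral, 3 = hydrophobic.
HYDROPHOBICITY_CLASSES = {1: "RKEDQN", 2: "GASTPHY", 3: "CLVIMFW"}


def ctd_distribution(sequence, class_members):
    """Positions (% of length) of the first, 25%, 50%, 75% and last class residue."""
    seq = sequence.upper()
    # 1-based positions of residues belonging to the class
    positions = [i for i, aa in enumerate(seq, start=1) if aa in class_members]
    labels = ("001", "025", "050", "075", "100")
    if not positions:
        return {label: 0.0 for label in labels}
    out = {"001": 100.0 * positions[0] / len(seq)}
    for label, fraction in (("025", 0.25), ("050", 0.50), ("075", 0.75), ("100", 1.0)):
        # index of the residue marking this fraction of class occurrences
        index = max(math.floor(len(positions) * fraction) - 1, 0)
        out[label] = 100.0 * positions[index] / len(seq)
    return out


# Invented 8-residue example: class-3 residues sit at positions 5, 6 and 8.
distribution = ctd_distribution("RKGAVLKW", HYDROPHOBICITY_CLASSES[3])
```

Under this reading, a low "001" value means the first residue of that class occurs near the N-terminus, so these descriptors encode where along the chain a property class appears rather than how much of it there is.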

C. Geary Autocorrelation Descriptors

GearyAuto_Hydrophobicity30: Clustering of hydrophobic residues at lag 30 — reflects amphipathic helix formation.
GearyAuto_Steric30 & 29: Backbone flexibility at spatial lags 29–30; moderate flexibility aids interaction with diverse membrane compositions.
GearyAuto_ResidueASA30: Consistent pattern of residue surface exposure at lag 30 improves bacterial targeting.
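A Geary autocorrelation descriptor at lag d compares a residue property at positions i and i + d, normalised by the property's variance along the sequence. The sketch below uses the standard Geary formula with the Kyte-Doolittle hydropathy scale as a stand-in; the property scales actually used by the application are not stated in the source, and the function name is hypothetical. A lag of 30 needs at least 31 residues, so the sketch returns `None` for shorter peptides; how the application treats such cases is not documented here.

```python
# Kyte-Doolittle hydropathy values, used here only as an illustrative scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def geary_autocorrelation(sequence, lag, scale=KYTE_DOOLITTLE):
    """Geary autocorrelation C(lag) of a residue property along a sequence.

    C(d) = [sum_i (P_i - P_{i+d})^2 / (2 (N - d))] / [sum_i (P_i - mean)^2 / (N - 1)]
    Returns None when the lag does not fit or the property is constant.
    """
    seq = sequence.upper()
    n = len(seq)
    if lag < 1 or lag >= n:
        return None
    values = [scale[aa] for aa in seq]
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    if variance == 0:
        return None
    paired = sum((values[i] - values[i + lag]) ** 2 for i in range(n - lag))
    return paired / (2 * (n - lag)) / variance


# Invented alternating peptide: neighbours differ, residues two apart match.
geary_lag1 = geary_autocorrelation("AKAKAKAK", 1)
geary_lag2 = geary_autocorrelation("AKAKAKAK", 2)
geary_lag30 = geary_autocorrelation("AKAKAKAK", 30)  # too short for lag 30
```

Values below 1 indicate that residues separated by the lag tend to have similar property values, and values above 1 indicate dissimilarity. The ratio is unchanged by shifting or rescaling the property scale, so standardising the scale first would not alter the result.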

Sample Test Sequences

#	Description	Expected	Sequence (truncated)
1	Magainin-2 (23 aa)	P-AMP	GIGKFLHSAKKFGKAFVGEIM…
2	Cecropin A (37 aa)	P-AMP	KWKLFKKIEKVGQNIRDGII…
3	Albumin fragment (59 aa)	Non-AMP	MKWVTFISLLFLFSSAYSRG…
4	His-tag construct (50 aa)	Non-AMP	MGSSHHHHHHSSGLVPRGSH…
5	Invalid chars	Rejected	MEKAALIFIG(XX)…

About EPIC-AMP

This web application provides a streamlined interface for classifying amino acid sequences as Antimicrobial Peptides (AMPs) or Non-AMPs, and for predicting the Minimum Inhibitory Concentration (MIC) of potential AMPs against clinically relevant bacteria. AMPs are key components of the innate immune system and represent a promising avenue for combating drug-resistant pathogens.

Model Selection Criteria

Over 225 combinations of feature extraction and selection methods were evaluated across four machine learning architectures for each target organism. The final classifier is a Multi-Layer Perceptron (MLP) trained on AAC + CTD features with RFE-based feature selection. The final models were selected based on:

High accuracy, F1-score, and validation accuracy on held-out data.
Robustness to sequence length variation within the 10–100 aa range.
Generalisation across diverse AMP families and taxonomic origins.
Regression capability assessed by MSE, R², Pearson correlation, and Kendall's tau.

Intended Use

This tool is intended for research and educational purposes. It provides computational predictions to guide experimental work but does not replace laboratory validation. Predictions should be interpreted in the context of the reported model metrics.

Our Team

Ali Abdalhalim

Computational Biologist

Ahmed Amr

Computational Biologist

Prof. Eman Badr

Full Professor & Director of BCBU

Acknowledgements

Bioinformatics and Computational Biology Unit (BCBU), Zewail City
The Centre for Genomics, Zewail City

Contact

For questions, collaboration inquiries, or feedback: epicamp.sup@gmail.com

How to Use EPIC-AMP

1. Prepare your sequence

Ensure your peptide sequence uses only standard amino acid single-letter codes (ACDEFGHIKLMNPQRSTVWY). Length must be between 10 and 100 residues. For FASTA format, ensure the file has a > header line followed by the sequence.

2. Enter or upload

Type/paste directly into the text area, or use the file picker to upload a .fasta, .fa, or .fna file. As you type, the Sequence Property Viewer will colour each residue by its biophysical properties and the composition bar will update in real-time.

3. Select target bacteria (optional)

If you want MIC predictions, tick one or more bacteria in the selection panel. These checkboxes activate automatically once a valid sequence is entered. MIC prediction only runs if the peptide is classified as an AMP.

4. Submit the analysis

Enter your email address and click Submit Analysis. The button will display elapsed processing time. Analysis typically completes within ~30 seconds.

5. Interpret the Results Dashboard

The Classification panel shows AMP or Non-AMP. The Confidence Gauge displays model certainty. The MIC Chart shows bar-chart predictions (µM) for selected organisms. Download the full PDF report including LIME explanation and global SHAP plot.

6. Clear and repeat

Click Clear to reset all fields and results before analysing a new sequence.
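Before submitting, the input rules from the first two steps can be checked locally. The sketch below is an assumption-laden illustration rather than the application's own validator: the function names are hypothetical, and reading only the first record of a multi-record FASTA file is a choice made here, since the source does not say how such files are handled.

```python
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")
MIN_LEN, MAX_LEN = 10, 100


def read_fasta_sequence(text):
    """Return the first record's sequence from FASTA text (or raw text if no header)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty input")
    if not lines[0].startswith(">"):
        return "".join(lines).upper()
    sequence_lines = []
    for line in lines[1:]:
        if line.startswith(">"):
            break  # assumption: only the first record is analysed
        sequence_lines.append(line)
    return "".join(sequence_lines).upper()


def validate_sequence(sequence):
    """Return a list of problems; an empty list means the sequence is acceptable."""
    problems = []
    invalid = sorted(set(sequence) - VALID_AA)
    if invalid:
        problems.append(f"invalid characters: {''.join(invalid)}")
    if not MIN_LEN <= len(sequence) <= MAX_LEN:
        problems.append(f"length {len(sequence)} outside {MIN_LEN}-{MAX_LEN}")
    return problems


# Illustrative inputs: a made-up two-line FASTA record and the visible prefix
# of the "Invalid chars" sample row.
parsed = read_fasta_sequence(">demo\nGIGKFLHSAK\nKFGKAFVGEIM\n")
ok_problems = validate_sequence(parsed)
bad_problems = validate_sequence("MEKAALIFIG(XX)")
too_short = validate_sequence("KKLL")
```

Returning a list of problems rather than raising on the first one lets a caller report invalid characters and a length violation together.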

Troubleshooting

Issue	Likely Cause	Solution
Invalid characters error	Non-standard AA characters (B, J, O, U, X, Z or symbols)	Remove or replace with valid residues
Length out of range	Sequence <10 or >100 characters	Trim or extend to within 10–100 aa
FASTA parse error	Malformed FASTA file	Ensure file starts with >header line and contains only the sequence on subsequent lines
Prediction timeout	Hugging Face Space may be cold-starting	Wait ~60 s and retry; the Space resumes automatically
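For scripted use, the "wait and retry" advice for timeouts could look like the sketch below. Everything here is hypothetical: `call_with_retry` and the `request` callable are placeholders, no real client API of the service is implied, and the 60-second wait simply mirrors the suggestion in the table.

```python
import time


def call_with_retry(request, retries=2, wait_seconds=60.0, sleep=time.sleep):
    """Call request(); on TimeoutError, wait and try again up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return request()
        except TimeoutError:
            if attempt == retries:
                raise  # give up after the final attempt
            sleep(wait_seconds)


# Illustration with a fake request that times out once, then succeeds.
# A no-op sleep is injected so the example runs instantly.
attempts = []
waits = []


def fake_request():
    attempts.append(1)
    if len(attempts) == 1:
        raise TimeoutError("service is starting")
    return {"label": "AMP"}


response = call_with_retry(fake_request, retries=2, wait_seconds=60.0,
                           sleep=waits.append)
```

Passing the sleep function as a parameter keeps the helper testable without real delays.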

EPIC-AMP — Live Demo