A command line Python interface implementing simplified BLAST and Smith-Waterman algorithms, developed for a university course.

cemileblks/blast101

🔬 Blast101 – A Simplified BLAST & Smith-Waterman

Note: This repository is based on teaching materials from the Bioinformatics Algorithms course at the University of Edinburgh (Spring 2025). See full credits below.

Description

Blast101 is a Python command-line tool that mimics the core functionality of the BLAST algorithm alongside Smith-Waterman local alignment. It was initially developed for teaching purposes and uses fundamental bioinformatics algorithms, simple Python logic, and custom scoring heuristics.

This repository represents my extension of the original teaching code through:

  • Building a command-line interface (CLI) with multiple modes
  • Writing unit tests for key modules
  • Improving structure, usability, and documentation
  • Adding input validation (e.g., protein/DNA detection)
  • Making the tool runnable from the terminal (outside an IDE)
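
As an illustration of the protein/DNA detection mentioned above, here is a minimal sketch of one common heuristic: treat a sequence as DNA when nearly all of its letters are nucleotide codes. The function name `guess_sequence_type` and the 0.9 threshold are assumptions made for this example and are not taken from the repository's code.

```python
def guess_sequence_type(sequence: str, threshold: float = 0.9) -> str:
    """Guess whether a sequence is DNA or protein (hypothetical helper).

    Returns "dna" when at least `threshold` of the letters are A, C, G, T
    or N, and "protein" otherwise.
    """
    letters = [ch for ch in sequence.upper() if ch.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    # Fraction of letters that are plausible nucleotide codes.
    nucleotide_fraction = sum(ch in "ACGTN" for ch in letters) / len(letters)
    return "dna" if nucleotide_fraction >= threshold else "protein"


print(guess_sequence_type("ACGTACGTNN"))
print(guess_sequence_type("MSVDPACPQSLPCFEASDCK"))
```

A frequency threshold is used rather than a strict alphabet check because A, C, G and T are also valid amino-acid codes, so a short protein fragment can still be ambiguous.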

Features

  • 🧬 BLAST-like word-based search with configurable word size
  • 🧪 Smith-Waterman local alignment scoring
  • ⚙️ Customisable via settings.ini
  • 🔎 Validates FASTA inputs, including DNA-vs-protein detection
  • 🧪 Unit tests using the unittest framework
  • 🖥️ Easy-to-use CLI with usage guidance
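
To make the Smith-Waterman feature concrete, the sketch below computes a local alignment score with the standard dynamic-programming recurrence. It assumes a simple match/mismatch scheme and a linear gap penalty. The implementation in smith_waterman_p.py may use a substitution matrix and different parameters, so the function name and default values here are illustrative only.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between a and b.

    Illustrative scoring: fixed match/mismatch values and a linear gap
    penalty (assumed for this example).
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("XXACGTXX", "ACGT"))
```

Only two rows of the matrix are kept because this sketch returns the score alone. Recovering the aligned residues would require the full matrix and a traceback step.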

Usage

To run a basic BLAST101 alignment:

python run_blast101.py --query nanog.fasta --database uniprot_bit2.fasta --mode blast --verbose

Available Modes

| Mode  | Description                        |
|-------|------------------------------------|
| blast | Run BLAST101 alignment             |
| sw    | Run Smith-Waterman alignment       |
| stats | Run statistical scoring evaluation |
| test  | Run all unit tests                 |
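
For readers curious how a multi-mode CLI like this can be wired up, here is a minimal `argparse` sketch using the flags and mode names shown in this README. The help strings, the default mode, and the `build_parser` function name are assumptions for this example. The actual run_blast101.py may be organised differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the documented flags (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="run_blast101.py",
        description="Simplified BLAST and Smith-Waterman search",
    )
    parser.add_argument("--query", help="FASTA file with the query sequence")
    parser.add_argument("--database", help="FASTA file to search against")
    parser.add_argument(
        "--mode",
        choices=["blast", "sw", "stats", "test"],
        default="blast",  # assumed default, not confirmed by the source
        help="which analysis to run",
    )
    parser.add_argument("--verbose", action="store_true",
                        help="print extra progress information")
    return parser


args = build_parser().parse_args(
    ["--query", "nanog.fasta", "--database", "uniprot_bit2.fasta",
     "--mode", "sw", "--verbose"]
)
print(args.mode, args.verbose)
```

Restricting `--mode` with `choices` means an unknown mode is rejected with a usage message before any sequence file is read.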

To view CLI help and examples:

python run_blast101.py --help

Tests

Run all tests with:

python run_blast101.py --mode test

Test suite includes:

  • Smith-Waterman scoring edge cases
  • FASTA parsing with malformed/partial inputs
  • Dictionary creation from sequences
  • Core BLAST101 functionality
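
As a rough picture of what a "dictionary creation" test might look like, the sketch below builds a word index (each fixed-length word mapped to its start positions) and checks it with `unittest`. The helper `build_word_index` and the test class are hypothetical and written for this example. They are not copied from the repository's test modules.

```python
import unittest
from collections import defaultdict


def build_word_index(sequence: str, word_size: int = 3) -> dict:
    """Map every word of length word_size to its start positions."""
    if word_size < 1:
        raise ValueError("word_size must be at least 1")
    index = defaultdict(list)
    for start in range(len(sequence) - word_size + 1):
        index[sequence[start:start + word_size]].append(start)
    return dict(index)


class TestWordIndex(unittest.TestCase):
    def test_repeated_word_keeps_all_positions(self):
        index = build_word_index("ABCABC", word_size=3)
        self.assertEqual(index["ABC"], [0, 3])

    def test_sequence_shorter_than_word_is_empty(self):
        self.assertEqual(build_word_index("AB", word_size=3), {})
```

Saved as a module, tests like these can be run with `python -m unittest`. In a BLAST-style search, looking up each query word in such an index gives the seed positions from which candidate alignments are extended.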

Repository Structure

| File/Folder                     | Purpose                                |
|---------------------------------|----------------------------------------|
| run_blast101.py                 | CLI script to control the entire app   |
| blast_101_search.py             | Heuristic BLAST-like alignment         |
| smith_waterman_p.py             | Smith-Waterman scoring implementation  |
| process_fasta_file.py           | FASTA parser and validator             |
| test_*.py                       | Test modules for individual components |
| programme_settings.py           | Default configuration                  |
| settings.ini                    | Editable scoring parameters            |
| logs/                           | Logs of alignments and results         |
| nanog.fasta, uniprot_bit2.fasta | Example input files                    |
| requirements.txt                | List of dependencies (minimal)         |
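
The split between built-in defaults and an editable settings.ini can be sketched with the standard-library `configparser`. The section name `blast` and the keys `word_size` and `gap_penalty` are hypothetical placeholders. Check the repository's settings.ini for the real names.

```python
import configparser

# Hypothetical defaults, standing in for values a defaults module might hold.
DEFAULTS = {"word_size": "3", "gap_penalty": "-2"}


def load_settings(ini_text: str) -> dict:
    """Parse INI text, falling back to DEFAULTS for missing keys."""
    parser = configparser.ConfigParser(defaults=DEFAULTS)
    parser.read_string(ini_text)
    # Use the [blast] section if present, otherwise only the defaults.
    section = parser["blast"] if parser.has_section("blast") else parser["DEFAULT"]
    return {
        "word_size": section.getint("word_size"),
        "gap_penalty": section.getint("gap_penalty"),
    }


print(load_settings("[blast]\nword_size = 4\n"))
```

Here the text is parsed with `read_string` to keep the example self-contained. Reading a file on disk would use `parser.read("settings.ini")` instead.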

License

This project is licensed under the MIT License.

© 2025 Simon Tomlinson, University of Edinburgh

You are free to reuse, modify, and distribute this code under the terms of the license. Please include appropriate attribution in any reuse or derivative work.

Credits

🧑‍🏫 Simon Tomlinson – Original author of the code for the Bioinformatics Algorithms MSc course (2025)

👩‍💻 cemileblks – Code extensions, CLI implementation, tests, and GitHub curation

📘 Based on ICA coursework submitted for grading, later modified for public release

Disclaimer

This repository contains beta teaching code and is not intended for production use in real-world bioinformatics pipelines. Accuracy, speed, and feature-completeness may be limited.
