PLIA

Protein-ligand interaction analyzer (PLIA) is a structural bioinformatics tool for validating AI-generated biomolecular complex structures.

Overview

This tool extracts the interaction interface of protein, RNA, and DNA complexes and can compare it to known sequences of binding motifs and interaction domains or to interfaces of other complexes.

Suggested Citation
Liang F, Srinivasan S, Chang HY. PLI Analyzer for Data-driven Validation of AI Predicted Biomolecular Interfaces. bioRxiv. 2025:2025-09.

bioRxiv
https://doi.org/10.1101/2025.09.29.678845

Installation

  1. Install miniconda3.
  2. Clone the PLIA repository.
git clone https://github.com/SuhasSrinivasan/plia.git
  3. Navigate to the directory.
cd plia
  4. Install dependencies.
conda env create -f environment.yml
  5. Activate the environment.
conda activate plia

Installing Voronota

Voronota calculates atom contacts through the Voronoi diagram of atomic balls.

The Voronota GitHub repository is available at https://github.com/kliment-olechnovic/voronota.

Quick Install Guide

  1. Download the latest package.
wget https://github.com/kliment-olechnovic/voronota/releases/download/v1.29.4198/voronota_1.29.4198.tar.gz

Or, if wget is not installed, use curl instead:

curl -LO https://github.com/kliment-olechnovic/voronota/releases/download/v1.29.4198/voronota_1.29.4198.tar.gz
  2. Unpack the package.
tar -xf voronota_1.29.4198.tar.gz
rm voronota_1.29.4198.tar.gz
  3. Change to the package directory.
cd ./voronota_1.29.4198
  4. Run CMake.
cmake . -DEXPANSION_JS=ON -DEXPANSION_LT=ON
  5. Compile everything.
make
  6. Install everything.
sudo make install
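
After installation it can help to confirm that the executable is reachable before pointing PLIA at it. The following is a minimal sketch, assuming the binary is named voronota and was installed somewhere on PATH; the helper name find_executable is hypothetical and not part of PLIA or Voronota.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_executable(name: str = "voronota") -> Optional[str]:
    """Return the resolved path of `name` if it is on PATH, else None."""
    found = shutil.which(name)
    return str(Path(found).resolve()) if found else None


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        # Assumption: a custom install prefix may keep the binary off PATH.
        print("voronota not found on PATH; pass its full path explicitly")
    else:
        print(f"voronota found at {path}")
```

If the binary is not on PATH (for example after installing to a custom prefix), the full path can still be supplied to PLIA directly.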

PLIA Usage

Overview

The wrapper script orchestrator.py automates the workflow for processing protein complex structures and analyzing their interaction interfaces with PLI-analyzer.
It runs several helper scripts in sequence and ensures that input and output files are handled correctly across subdirectories.

The pipeline expects a base directory with subdirectories for each protein complex to process, named in the format:

uniprotid1-uniprotid2

Each subdirectory is processed independently, and results are consolidated at the base directory level.
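
The expected layout can be checked with a few lines of Python before starting a run. This is a minimal sketch, not the wrapper's own logic: the function name list_complex_dirs is hypothetical, and it assumes exactly one hyphen separates the two identifiers (so isoform-style IDs containing a hyphen would be skipped).

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def list_complex_dirs(base_dir: Path) -> List[Tuple[Path, str, str]]:
    """Return (subdir, id1, id2) for every subdirectory named id1-id2."""
    complexes = []
    for sub in sorted(base_dir.iterdir()):
        if not sub.is_dir():
            continue
        parts = sub.name.split("-")
        # Assumption: exactly one hyphen separates the two identifiers.
        if len(parts) == 2 and all(parts):
            complexes.append((sub, parts[0], parts[1]))
    return complexes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "P04637-Q00987").mkdir()  # example identifiers, illustration only
        (base / "notes").mkdir()          # skipped: not an id1-id2 name
        for sub, id1, id2 in list_complex_dirs(base):
            print(sub.name, id1, id2)
```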

Workflow

  1. cif_to_pdb.py

Converts .cif files to .pdb.

  2. new_generate_ref.py

Generates a reference file (ref_file_updated.csv) inside each subdirectory, representing known interaction sites for the given interactor.
These interaction sites are determined by cross-referencing the UniProt ID with UniProt and InterPro annotations, which are stored in tables/human_interpro_ppi_domains_consolidated_min3.csv and tables/human_uniprot_ppi_sites_min3.csv.

  3. extract_interface.py

For each subdirectory:

  • Runs interface extraction.
  • Filters sequences by interaction length (default: 3).
python extract_interface.py \
  --input_dir <subdir> \
  --output_dir <subdir> \
  --path_to_voronota <voronota_path> \
  --ref_file <subdir>/ref_file_updated.csv \
  [--inter_output]
  4. summarize_sequences.py

Aggregates results into a summary file at the base directory.

python summarize_sequences.py <base_dir> --output_csv summary.csv
  5. extra_summary.py

Performs an additional summarization step that reports IDR percentage and length.

orchestrator.py Usage

python orchestrator.py \
  --base_dir <path to base dir with subdirs> \
  --voronota_path <path to voronota executable> \
  [--min_interaction_length 3] \
  [--keep_extract_interface] \
  [--inter_output]
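
For batch runs it may be convenient to launch the wrapper from Python. The sketch below only assembles the flags documented here; the function names build_orchestrator_cmd and run_orchestrator and the example paths are hypothetical, and it assumes the current working directory contains orchestrator.py.

```python
import subprocess
import sys
from typing import List


def build_orchestrator_cmd(base_dir: str, voronota_path: str,
                           min_interaction_length: int = 3,
                           keep_extract_interface: bool = False,
                           inter_output: bool = False) -> List[str]:
    """Assemble the orchestrator.py command line from the documented flags."""
    cmd = [sys.executable, "orchestrator.py",
           "--base_dir", str(base_dir),
           "--voronota_path", str(voronota_path),
           "--min_interaction_length", str(min_interaction_length)]
    if keep_extract_interface:
        cmd.append("--keep_extract_interface")
    if inter_output:
        cmd.append("--inter_output")
    return cmd


def run_orchestrator(**kwargs) -> None:
    """Run the wrapper and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_orchestrator_cmd(**kwargs), check=True)


if __name__ == "__main__":
    # Hypothetical paths, shown only to illustrate the resulting command.
    print(" ".join(build_orchestrator_cmd("runs/batch1", "/usr/local/bin/voronota",
                                          inter_output=True)))
```

Passing the arguments as a list avoids shell quoting problems when directory names contain spaces.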

Required Arguments

  • --base_dir
    Path to the base directory containing subdirectories for each complex.

  • --voronota_path
    Path to the Voronota executable.

Optional Arguments

  • --min_interaction_length (default: 3)
    Minimum interaction length. The wrapper adds 2 to this value before passing it to filter_sequences.py.

  • --keep_extract_interface
    If set, extract_interface.py will remain in each subdirectory after execution.

  • --inter_output
    If set, adds --inter_output to each extract_interface.py run. This generates intermediate CSV files with residue-level distances.
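
The length offset described for --min_interaction_length can be illustrated with a small sketch. It mirrors only the stated arithmetic (user value plus 2); the reason for the offset is not given here, and the inclusive comparison and the names effective_min_length and filter_by_length are assumptions rather than the behavior of filter_sequences.py.

```python
from typing import Iterable, List

WRAPPER_OFFSET = 2  # per the description above: the wrapper adds 2 before filtering


def effective_min_length(min_interaction_length: int = 3) -> int:
    """Length threshold that would be handed to the filtering step."""
    return min_interaction_length + WRAPPER_OFFSET


def filter_by_length(sequences: Iterable[str], min_length: int) -> List[str]:
    """Keep sequences with at least `min_length` residues (assumed inclusive)."""
    return [s for s in sequences if len(s) >= min_length]


if __name__ == "__main__":
    threshold = effective_min_length(3)
    print(threshold, filter_by_length(["ACDE", "ACDEF", "ACDEFGH"], threshold))
```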

Outputs

Per Subdirectory

  • ref_file_updated.csv
  • Extracted interface results
  • (Optional) Intermediate residue-level CSVs (--inter_output)

Base Directory

  • summary.csv (from summarize_sequences.py)
  • Additional outputs from extra_summary.py

Notes

  • Subdirectories must follow the naming convention uniprotid1-uniprotid2.
  • All helper scripts must be available in the same directory as the wrapper (orchestrator.py).
  • The wrapper stops execution if any required reference file (ref_file_updated.csv) is missing in a subdirectory.
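
Because the wrapper stops when a reference file is missing, a pre-flight check can save a partial run. This is a minimal sketch under the assumption that every subdirectory of the base directory is a complex to process; the function name missing_reference_files is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List

REF_NAME = "ref_file_updated.csv"  # file name taken from the notes above


def missing_reference_files(base_dir: Path) -> List[Path]:
    """Return the subdirectories of `base_dir` that lack REF_NAME."""
    return [sub for sub in sorted(base_dir.iterdir())
            if sub.is_dir() and not (sub / REF_NAME).is_file()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "P04637-Q00987").mkdir()  # example identifiers, illustration only
        (base / "P38398-Q7Z569").mkdir()
        (base / "P04637-Q00987" / REF_NAME).write_text("A,B\n")
        for sub in missing_reference_files(base):
            print(f"missing {REF_NAME} in {sub.name}")
```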

Using extract_interface.py as a Standalone Script

Input Parameters

Parameter          Description
input_dir          Path to the input folder with the PDB/CIF files.
output_dir         Path to where the output files should be created.
path_to_voronota   Path to the directory where Voronota is installed.
ref_file           Name of the CSV file containing known/putative interacting sequences (optional, see example file).
area               Minimum interaction area threshold for the Voronoi diagram of atomic balls (optional, default = 0.01).
distance           Distance threshold (in Angstroms) for interacting residues (optional, default = None).
padding            Number of residues added as padding on either side of each interacting residue (optional, default = 1).
inter_output       Create intermediate CSV files with residue-level distances (optional, default = False).
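
To make the padding parameter concrete, the sketch below expands each interacting residue index by the padding on both sides and merges overlapping or adjacent ranges. It illustrates the idea only and is not taken from extract_interface.py; the function name pad_and_merge, the 1-based indexing, and the merging of adjacent ranges are assumptions.

```python
from typing import Iterable, List, Optional, Tuple


def pad_and_merge(residues: Iterable[int], padding: int = 1,
                  chain_length: Optional[int] = None) -> List[Tuple[int, int]]:
    """Expand 1-based residue indices by `padding` and merge the ranges."""
    segments: List[Tuple[int, int]] = []
    for r in sorted(set(residues)):
        start = max(1, r - padding)  # do not run past the chain start
        end = r + padding
        if chain_length is not None:
            end = min(chain_length, end)  # do not run past the chain end
        if segments and start <= segments[-1][1] + 1:
            # Overlaps or touches the previous segment: extend it.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments


if __name__ == "__main__":
    print(pad_and_merge([5, 6, 12], padding=1))
    print(pad_and_merge([5, 6, 12], padding=2))
```

With padding = 1, interacting residues 5, 6, and 12 give the ranges 4-7 and 11-13; increasing the padding to 2 widens them to 3-8 and 10-14.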

Example Usage

python extract_interface.py --input_dir <path_to_dir> --output_dir <path_to_dir> --path_to_voronota <path_to_voronota> --ref_file <path_to_ref_file> --padding 2 

Reference File Format

Each column of the CSV corresponds to a protein chain in the PDB/CIF files.
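
A column-per-chain file can be read with the standard csv module. The sketch below is illustrative only: the header names chain_A and chain_B and the sample sequences are hypothetical, and the actual header and cell layout should be taken from the example file shipped with the repository.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_reference_columns(handle: TextIO) -> Dict[str, List[str]]:
    """Collect the non-empty cells of each CSV column, keyed by header."""
    reader = csv.DictReader(handle)
    columns: Dict[str, List[str]] = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            # Skip overflow cells (key None) and empty or missing values.
            if name in columns and value and value.strip():
                columns[name].append(value.strip())
    return columns


if __name__ == "__main__":
    # Hypothetical content: one column per chain, one sequence per cell.
    sample = io.StringIO("chain_A,chain_B\nMKTAYIAK,GSHMLE\nLLQQ,\n")
    print(read_reference_columns(sample))
```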

Output

Intermediate CSV: one file is produced for each input PDB/CIF file when the inter_output argument is set.

Final CSV with interacting sequences.
