Protein-ligand interaction analyzer (PLIA) is a structural bioinformatics tool for validating AI-generated biomolecular complex structures.
It extracts the interaction interfaces of protein, RNA, and DNA complexes and can compare them against known binding-motif and interaction-domain sequences, or against the interfaces of other complexes.
Suggested Citation
Liang F, Srinivasan S, Chang HY. PLI Analyzer for Data-driven Validation of AI Predicted Biomolecular Interfaces. bioRxiv. 2025:2025-09.
bioRxiv: https://doi.org/10.1101/2025.09.29.678845
- Install miniconda3.
- Clone the PLIA repository.
git clone https://github.com/SuhasSrinivasan/plia.git
- Navigate to the directory
cd plia
- Install dependencies
conda env create -f environment.yml
- Activate the environment:
conda activate plia
Voronota calculates atom contacts through the Voronoi diagram of atomic balls.
The source code is available in the Voronota GitHub repository: https://github.com/kliment-olechnovic/voronota
- Download the latest package
wget https://github.com/kliment-olechnovic/voronota/releases/download/v1.29.4198/voronota_1.29.4198.tar.gz
Or, if wget is not installed, use curl instead:
curl -LO https://github.com/kliment-olechnovic/voronota/releases/download/v1.29.4198/voronota_1.29.4198.tar.gz
- Unpack the package
tar -xf voronota_1.29.4198.tar.gz
rm voronota_1.29.4198.tar.gz
- Change to the package directory
cd ./voronota_1.29.4198
- Run CMake
cmake . -DEXPANSION_JS=ON -DEXPANSION_LT=ON
- Compile everything
make
- Install everything
sudo make install
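After installation, it can be useful to confirm that Voronota is discoverable before running the pipeline. The following is a minimal sketch, assuming the installed executable is named voronota and is on PATH; the helper name locate_voronota is hypothetical and not part of PLIA. It returns both the executable path and its parent directory, since the scripts below ask for one or the other.

```python
import shutil
from pathlib import Path


def locate_voronota(name="voronota"):
    """Return (executable_path, install_dir), or None if not found on PATH.

    Assumption: the executable is called `voronota`; adjust `name` otherwise.
    """
    exe = shutil.which(name)
    if exe is None:
        return None
    exe_path = Path(exe).resolve()
    return exe_path, exe_path.parent


if __name__ == "__main__":
    found = locate_voronota()
    if found is None:
        print("voronota was not found on PATH")
    else:
        print(f"executable: {found[0]}")
        print(f"directory:  {found[1]}")
```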
The wrapper script orchestrator.py automates the workflow for processing protein complex structures and analyzing their interaction interfaces with PLI-analyzer.
It coordinates several helper scripts in sequence, ensuring that input and output files are handled correctly across subdirectories.
The pipeline expects a base directory with subdirectories for each protein complex to process, named in the format:
uniprotid1-uniprotid2
Each subdirectory is processed independently, and results are consolidated at the base directory level.
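The expected layout can be checked before running the pipeline. The sketch below is illustrative only and is not taken from orchestrator.py: the function name find_complex_dirs is hypothetical, and it assumes each subdirectory name consists of exactly two non-empty identifiers separated by a single hyphen.

```python
from pathlib import Path


def find_complex_dirs(base_dir):
    """Return (path, id1, id2) for each subdirectory named id1-id2.

    Assumption: names hold exactly two non-empty identifiers joined by one
    hyphen. Plain files and other directories are skipped.
    """
    complexes = []
    for entry in sorted(Path(base_dir).iterdir()):
        if not entry.is_dir():
            continue
        parts = entry.name.split("-")
        if len(parts) == 2 and all(parts):
            complexes.append((entry, parts[0], parts[1]))
    return complexes
```

Calling find_complex_dirs on the base directory before a run gives a quick view of which subdirectories would be recognized under this naming assumption.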
cif_to_pdb.py
Converts .cif files to .pdb.
new_generate_ref.py
Generates reference files (ref_file_updated.csv) inside each subdirectory, representing known interaction sites for the given interactor.
These interaction sites are determined by cross-referencing the UniProt ID with UniProt and InterPro annotations, which are provided in the CSV files tables/human_interpro_ppi_domains_consolidated_min3.csv and tables/human_uniprot_ppi_sites_min3.csv.
extract_interface.py
For each subdirectory:
- Runs interface extraction.
- Filters sequences by interaction length, default = 3.
python extract_interface.py
--input_dir <subdir>
--output_dir <subdir>
--path_to_voronota <voronota_path>
--ref_file <subdir>/ref_file_updated.csv
[--inter_output]
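For scripted runs, the command above can be assembled per subdirectory in Python. This is a minimal sketch using only the flags listed above; the helper name build_extract_command is hypothetical, and the actual wrapper may construct its calls differently.

```python
import sys
from pathlib import Path


def build_extract_command(subdir, voronota_path, inter_output=False):
    """Build the argument list for one extract_interface.py run.

    The same subdirectory is used for input and output, and the reference
    file is expected at <subdir>/ref_file_updated.csv, as documented above.
    """
    subdir = Path(subdir)
    cmd = [
        sys.executable, "extract_interface.py",
        "--input_dir", str(subdir),
        "--output_dir", str(subdir),
        "--path_to_voronota", str(voronota_path),
        "--ref_file", str(subdir / "ref_file_updated.csv"),
    ]
    if inter_output:
        cmd.append("--inter_output")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting problems with paths that contain spaces.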
summarize_sequences.py
Aggregates results into a summary file at the base directory.
python summarize_sequences.py <base_dir> --output_csv summary.csv
extra_summary.py
Performs an additional summarization step that reports related information on IDR percentage and length.
python orchestrator.py
--base_dir <path to base dir with subdirs>
--voronota_path <path to voronota executable>
[--min_interaction_length 3]
[--keep_extract_interface]
[--inter_output]
- --base_dir
  Path to the base directory containing subdirectories for each complex.
- --voronota_path
  Path to the Voronota executable.
- --min_interaction_length (default: 3)
  Minimum interaction length. The wrapper internally adds +2 before passing the value to filter_sequences.py.
- --keep_extract_interface
  If set, extract_interface.py will remain in each subdirectory after execution.
- --inter_output
  If set, adds --inter_output to each extract_interface.py run. This generates intermediate CSV files with residue-level distances.
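The options above map naturally onto an argparse declaration. The following sketch is an assumption about how such a command-line interface could be declared, not the actual orchestrator.py source; treating --base_dir and --voronota_path as required is inferred from the usage line, and the helper filter_length is hypothetical.

```python
import argparse


def build_parser():
    """Declare the documented orchestrator options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="orchestrator.py")
    parser.add_argument("--base_dir", required=True,
                        help="base directory with one subdirectory per complex")
    parser.add_argument("--voronota_path", required=True,
                        help="path to the Voronota executable")
    parser.add_argument("--min_interaction_length", type=int, default=3,
                        help="minimum interaction length")
    parser.add_argument("--keep_extract_interface", action="store_true",
                        help="keep extract_interface.py in each subdirectory")
    parser.add_argument("--inter_output", action="store_true",
                        help="write intermediate residue-level CSV files")
    return parser


def filter_length(min_interaction_length):
    """Value handed to filter_sequences.py: the documented +2 adjustment."""
    return min_interaction_length + 2


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(filter_length(args.min_interaction_length))
```

With the default of 3, the value passed to the filtering step under this reading is 5.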
- ref_file_updated.csv
- Extracted interface results
- (Optional) Intermediate residue-level CSVs (--inter_output)
- summary.csv (from summarize_sequences.py)
- Additional outputs from extra_summary.py
- Subdirectories must follow the naming convention: uniprotid1-uniprotid2
- All helper scripts must be available in the same directory as the wrapper (orchestrator.py).
- The wrapper stops execution if any required reference file (ref_file_updated.csv) is missing in a subdirectory.
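Because a missing reference file halts the run, a pre-flight check can save time on large batches. The sketch below is a hypothetical helper, not part of PLIA; it only assumes the file name ref_file_updated.csv given above.

```python
from pathlib import Path


def missing_reference_files(complex_dirs, ref_name="ref_file_updated.csv"):
    """Return the subdirectories that do not contain the reference file."""
    return [d for d in map(Path, complex_dirs) if not (d / ref_name).is_file()]


def require_reference_files(complex_dirs):
    """Stop with an error message listing every subdirectory lacking the file."""
    missing = missing_reference_files(complex_dirs)
    if missing:
        names = ", ".join(d.name for d in missing)
        raise SystemExit(f"Missing ref_file_updated.csv in: {names}")
```

Reporting all missing files at once, rather than stopping at the first, makes it easier to repair a batch in one pass.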
| Parameter | Description |
|---|---|
| input_dir | Path to the input folder with the PDB/CIF files. |
| output_dir | Path to where the output files should be created. |
| path_to_voronota | Path to the directory where Voronota is installed. |
| ref_file | Name of CSV file containing known/putative interacting sequences (optional, see example file). |
| area | Minimum interaction area threshold for the Voronoi diagram of atomic balls (optional, default = 0.01). |
| distance | Distance threshold (in Angstroms) for interacting residues (optional, default = None). |
| padding | Number of residues added as padding on either side of the actual interacting residue (optional, default = 1). |
| inter_output | Create intermediate CSV files with residue-level distances (optional, default = False). |
python extract_interface.py --input_dir <path_to_dir> --output_dir <path_to_dir> --path_to_voronota <path_to_voronota> --ref_file <path_to_ref_file> --padding 2
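To illustrate what the padding parameter does, the sketch below expands each interacting residue position by a fixed number of residues on both sides and merges the resulting ranges. This is an assumed interpretation for illustration, not the code in extract_interface.py; the function name pad_and_merge, the 1-based residue numbering, and the merging of overlapping or adjacent ranges are all assumptions.

```python
def pad_and_merge(residues, padding=1, chain_length=None):
    """Expand residue positions by `padding` on each side and merge ranges.

    Assumptions: positions are 1-based integers; ranges that overlap or
    touch are merged; ranges are clipped to [1, chain_length] when a chain
    length is supplied.
    """
    segments = []
    for pos in sorted(set(residues)):
        start = max(1, pos - padding)
        end = pos + padding
        if chain_length is not None:
            end = min(chain_length, end)
        if segments and start <= segments[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            segments[-1][1] = max(segments[-1][1], end)
        else:
            segments.append([start, end])
    return [tuple(seg) for seg in segments]


if __name__ == "__main__":
    print(pad_and_merge([10, 11, 20], padding=1))  # [(9, 12), (19, 21)]
```

Under this reading, a single interacting residue with the default padding of 1 yields a three-residue segment.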
Each column of the CSV corresponds to a protein chain in the PDB/CIF files.
Intermediate CSV: One file is produced for each input PDB/CIF file when the inter_output argument is set.