HOWTO: Parse 3D structure

Tomasz Żok edited this page Oct 27, 2020 · 2 revisions

The snippet below shows how to parse 3D structural data in PDB or mmCIF format, then print the header information and the list of modified residues.

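    // imports for the JDK and Apache Commons IO classes used below;
    // PdbParser and PdbModel come from the library this wiki documents
    import java.io.File;
    import java.nio.charset.StandardCharsets;
    import java.text.SimpleDateFormat;
    import java.util.List;
    import java.util.Locale;
    import org.apache.commons.io.FileUtils;

    // will be used later to format deposition date
    final SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd", Locale.US);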

    // read PDB or mmCIF data from file
    final File file = new File("1EHZ.pdb");
    final String structureContent = FileUtils.readFileToString(file, StandardCharsets.UTF_8);

    // parse the data
    final PdbParser parser = new PdbParser();
    final List<PdbModel> models = parser.parse(structureContent);

    // focus on the first model only
    final PdbModel firstModel = models.get(0);
    
    // print header information
    System.out.println("PDB id: " + firstModel.idCode());
    System.out.println("Title: " + firstModel.title());
    System.out.println("Classification: " + firstModel.header().classification());
    System.out.println(
        "Deposition date: " + dateFormat.format(firstModel.header().depositionDate()));
    System.out.println(
        "Solved with: "
            + firstModel.experimentalData().experimentalTechniques().get(0).getPdbName());
    System.out.println("Resolution: " + firstModel.resolution().resolution());
    
    // print information about modified residues
    System.out.println("Modified residues:");
    firstModel
        .modifiedResidues()
        .forEach(
            modification ->
                System.out.printf(
                    "  %s.%d %s ~ %s -> %s%n",
                    modification.chainIdentifier(),
                    modification.residueNumber(),
                    modification.residueName(),
                    modification.standardResidueName(),
                    modification.comment()));

The result:

    PDB id: 1EHZ
    Title: THE CRYSTAL STRUCTURE OF YEAST PHENYLALANINE TRNA AT 1.93 A RESOLUTION
    Classification: RNA
    Deposition date: 2000-02-23
    Solved with: X-RAY DIFFRACTION
    Resolution: 1.93
    Modified residues:
      A.10 2MG ~ G -> 2N-METHYLGUANOSINE-5'-MONOPHOSPHATE
      A.16 H2U ~ U -> 5,6-DIHYDROURIDINE-5'-MONOPHOSPHATE
      A.17 H2U ~ U -> 5,6-DIHYDROURIDINE-5'-MONOPHOSPHATE
      A.26 M2G ~ G -> N2-DIMETHYLGUANOSINE-5'-MONOPHOSPHATE
      A.32 OMC ~ C -> O2'-METHYLYCYTIDINE-5'-MONOPHOSPHATE
      A.34 OMG ~ G -> O2'-METHYLGUANOSINE-5'-MONOPHOSPHATE
      A.37 YYG ~ G ->
      A.39 PSU ~ U -> PSEUDOURIDINE-5'-MONOPHOSPHATE
      A.40 5MC ~ C -> 5-METHYLCYTIDINE-5'-MONOPHOSPHATE
      A.46 7MG ~ G ->
      A.49 5MC ~ C -> 5-METHYLCYTIDINE-5'-MONOPHOSPHATE
      A.54 5MU ~ U -> 5-METHYLURIDINE 5'-MONOPHOSPHATE
      A.55 PSU ~ U -> PSEUDOURIDINE-5'-MONOPHOSPHATE
      A.58 1MA ~ A ->
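
To show where the modified-residue fields come from, here is a minimal sketch that reads `MODRES` records straight from PDB-format text using only the JDK. It is not how `PdbParser` is implemented; the class `ModresSketch` and its methods are hypothetical names introduced for this illustration. The column positions follow the fixed-width `MODRES` layout of the PDB file format (residue name in columns 13-15, chain in 17, residue number in 19-22, standard residue name in 25-27, comment in 30-70). Unlike the output above, the sketch trims the comment field, so an absent comment becomes an empty string.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: reads MODRES records directly from PDB-format text. */
final class ModresSketch {

  /** Returns the trimmed text in 1-based columns [from, to] of a fixed-width line. */
  static String columns(final String line, final int from, final int to) {
    if (line.length() < from) {
      return ""; // the field is absent because the line was not padded
    }
    return line.substring(from - 1, Math.min(to, line.length())).trim();
  }

  /** Formats one MODRES record in the same shape as the listing above. */
  static String describe(final String line) {
    final String residueName = columns(line, 13, 15);
    final String chain = columns(line, 17, 17);
    final int residueNumber = Integer.parseInt(columns(line, 19, 22));
    final String standardName = columns(line, 25, 27);
    final String comment = columns(line, 30, 70);
    return String.format(
        "%s.%d %s ~ %s -> %s", chain, residueNumber, residueName, standardName, comment);
  }

  /** Collects a description of every MODRES record found in the given content. */
  static List<String> describeAll(final String pdbContent) {
    final List<String> result = new ArrayList<>();
    for (final String line : pdbContent.split("\\R")) {
      if (line.startsWith("MODRES")) {
        result.add(describe(line));
      }
    }
    return result;
  }

  public static void main(final String[] args) {
    // two hand-written MODRES lines in the fixed-width layout, plus a line to skip
    final String content =
        String.join(
            "\n",
            "MODRES 1EHZ 2MG A   10    G  2N-METHYLGUANOSINE-5'-MONOPHOSPHATE",
            "MODRES 1EHZ YYG A   37    G",
            "END");
    describeAll(content).forEach(System.out::println);
  }
}
```

For real work the parser shown at the top of this page is the better choice, since it also handles mmCIF input and the other header records.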

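The `yyyy-MM-dd` formatter in the snippet turns the deposition date into an ISO-style string. As a small stand-alone illustration of the same idea, the sketch below converts a date written in the `DD-MMM-YY` style used by the `HEADER` record of PDB-format files. The class `DepositionDateSketch` and the method `toIsoDate` are hypothetical names, and the sketch says nothing about how the library parses dates internally.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Illustrative only: converts a DD-MMM-YY date into yyyy-MM-dd. */
final class DepositionDateSketch {

  static String toIsoDate(final String pdbDate) {
    // SimpleDateFormat is not thread-safe, so new instances are created per call
    final SimpleDateFormat pdbFormat = new SimpleDateFormat("dd-MMM-yy", Locale.US);
    final SimpleDateFormat isoFormat = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
    try {
      // a two-digit year is resolved to the window from 80 years before
      // to 20 years after the moment the formatter was created
      final Date date = pdbFormat.parse(pdbDate);
      return isoFormat.format(date);
    } catch (final ParseException e) {
      throw new IllegalArgumentException("Not a DD-MMM-YY date: " + pdbDate, e);
    }
  }

  public static void main(final String[] args) {
    System.out.println(toIsoDate("23-FEB-00"));
  }
}
```

Both formatters use the default time zone, so the calendar day is preserved. Because of the two-digit year window, very old or far-future years can be misread, which is one reason to rely on the parsed `depositionDate()` value rather than on raw header text.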