TOP Phenotypic Query is a Java Maven package that implements the phenotyping business logic of the TOP Framework (see top-deployment for documentation of the whole framework). The package covers the following functionality:
- provide Java interfaces for developers to implement:
  - import and export of phenotype models
  - adapters to generate queries in a specific query language (such as SQL or FHIR Search)
- generate and execute queries to retrieve individual data from data sources
- classify individual data into phenotype classes
- export query results to CSV
Download one of our JAR releases and use it as follows:
# show help message
java -jar top-phenotypic-query-x.x.x.jar query --help
# execute phenotypic queries based on a phenotype model, results are written to ZIP
java -jar top-phenotypic-query-x.x.x.jar query <phenotype model> \
<adapter config> <query config> -o <ZIP output path>

Input parameters:
- JSON containing a phenotype model
- YAML adapter configuration
- JSON containing a TOP query (i.e., care.smith.top.model.Query). Alternatively, the --phenotype option can be used to build a query from the provided phenotype ID.
- optional: output destination of the ZIP file that contains the result set (if not provided, output is written to STDOUT)
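If the JAR is launched from another Java program rather than from a shell, the same arguments can be assembled as a list and handed to ProcessBuilder. The following is only a minimal sketch: the class name QueryCommand, the method buildCommand and all file names are hypothetical; only the argument order and the -o option are taken from the usage shown above.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryCommand {

  /**
   * Assembles the command line for the "query" subcommand.
   * Pass null as zipOutput to omit -o, so that output goes to STDOUT.
   */
  public static List<String> buildCommand(
      String jar, String model, String adapterConfig, String queryConfig, String zipOutput) {
    List<String> command = new ArrayList<>(List.of("java", "-jar", jar, "query"));
    command.add(model);          // JSON containing a phenotype model
    command.add(adapterConfig);  // YAML adapter configuration
    command.add(queryConfig);    // JSON containing a TOP query
    if (zipOutput != null) {
      command.add("-o");
      command.add(zipOutput);
    }
    return command;
  }

  public static void main(String[] args) {
    // All file names below are placeholders.
    List<String> command =
        buildCommand(
            "top-phenotypic-query-x.x.x.jar", "model.json", "adapter.yml", "query.json", "result.zip");
    System.out.println(String.join(" ", command));
    // To actually run it (requires the JAR and the input files to exist):
    // new ProcessBuilder(command).inheritIO().start().waitFor();
  }
}
```

Keeping the arguments in a list avoids shell quoting problems when paths contain spaces.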
The next sections describe how to add top-phenotypic-query as a Maven dependency to your Java project and
how to call it programmatically.
Add the following Maven dependency to your project's pom.xml file and see section Authentication to GitHub Packages
for authentication with GitHub Packages.
<dependency>
<groupId>care.smith.top</groupId>
<artifactId>top-phenotypic-query</artifactId>
<version><!-- the version number --></version>
</dependency>

Because the Maven package is hosted at GitHub Packages, you need to make some modifications to your Maven installation in order to download and install the package. Please follow the Authenticating to GitHub Packages instructions.
Authentication is required for both care.smith.top:top-phenotypic-query and its dependency care.smith.top:top-api.
Data adapters are used to generate queries in a specific query language to retrieve individual data from a source system. Both default adapters, for SQL and FHIR Search, need configurations that are provided as YAML files.
/src/main/resources/default_adapter_configuration contains default configuration files. By providing custom files (see for example src/test/resources/config), you can override these files.
Query query = new Query(); // some query
Entity[] phenotypes = new Entity[] {}; // some phenotype definitions
DataAdapterConfig config = DataAdapterConfig.getInstance("path/to/config.yml");
DataAdapter adapter = DataAdapter.getInstance(config);
PhenotypeFinder finder = new PhenotypeFinder(query, phenotypes, adapter);
ResultSet rs = finder.execute();
adapter.close();

The code below creates a file export.zip that contains metadata.csv, data_phenotypes.csv and data_subjects.csv.
Query query = new Query(); // some phenotypic query
Entity[] phenotypes = new Entity[] {}; // some phenotype definitions
ResultSet rs; // some result set
// Files.createFile expects a Path, not a String
File zipFile = Files.createFile(Paths.get("export.zip")).toFile();
ZipOutputStream zipStream = new ZipOutputStream(new FileOutputStream(zipFile));
CSV csvConverter = new CSV();
zipStream.putNextEntry(new ZipEntry("metadata.csv"));
csvConverter.writeMetadata(phenotypes, zipStream);
zipStream.putNextEntry(new ZipEntry("data_phenotypes.csv"));
csvConverter.writePhenotypes(rs, phenotypes, zipStream);
zipStream.putNextEntry(new ZipEntry("data_subjects.csv"));
csvConverter.writeSubjects(rs, phenotypes, query, zipStream);
zipStream.close();

- metadata.csv: Describes the metadata of all phenotype classes contained in the result set.
- data_phenotypes.csv: Each phenotype is represented as a single row with the corresponding value in one of the columns 'number_value', 'text_value', 'date_time_value' or 'boolean_value'.
- data_subjects.csv: Subjects (individuals) in the result set are represented as single rows with one column per phenotype. A phenotype may occur more than once for a subject, so its values are given as comma-separated values.
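To check an export, the archive can be read back with java.util.zip from the standard library. The sketch below is illustrative and not part of this package: the class ExportZipReader and its methods are hypothetical, and it builds a small in-memory archive with placeholder content instead of opening a real export.zip. Only the three entry names are taken from the description above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ExportZipReader {

  /** Returns the entry names of a ZIP archive in the order they were written. */
  public static List<String> listEntries(InputStream in) throws IOException {
    List<String> names = new ArrayList<>();
    try (ZipInputStream zip = new ZipInputStream(in)) {
      ZipEntry entry;
      while ((entry = zip.getNextEntry()) != null) {
        names.add(entry.getName());
      }
    }
    return names;
  }

  /** Builds an in-memory stand-in archive; the entry content is a placeholder. */
  public static byte[] sampleArchive() throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
      for (String name : List.of("metadata.csv", "data_phenotypes.csv", "data_subjects.csv")) {
        zip.putNextEntry(new ZipEntry(name));
        zip.write("placeholder\n".getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
      }
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // For a real export, pass new FileInputStream("export.zip") instead.
    System.out.println(listEntries(new ByteArrayInputStream(sampleArchive())));
  }
}
```

The try-with-resources blocks close the streams even if reading or writing fails, which the explicit zipStream.close() call in the export example does not guarantee.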
/* Import */
InputStream inputStream;
PhenotypeImporter importer;
Entity[] phenotypes = importer.read(inputStream);
/* Export */
OutputStream outputStream;
Repository repository; // some repository metadata
String uri = "https://example.com";
PhenotypeExporter exporter;
exporter.write(phenotypes, repository, uri, outputStream);

Please see our Contributing Guide.
The code in this repository and the package care.smith.top:top-phenotypic-query are licensed under the MIT License.
Uciteli A, Beger C, Kirsten T, Meineke FA, Herre H. Ontological representation, classification and data-driven computing of phenotypes. J Biomed Semant. 2020 Dec;11(1):15. https://doi.org/10.1186/s13326-020-00230-0.
Uciteli A, Beger C, Wagner J, Kirsten T, Meineke FA, Stäubert S, et al. Ontological modelling and FHIR Search based representation of basic eligibility criteria. GMS Medizinische Informatik, Biometrie und Epidemiologie. 2021 Apr 26;17(2):Doc05. https://doi.org/10.3205/MIBE000219.
Beger C, Matthies F, Schäfermeier R, Kirsten T, Herre H, Uciteli A. Towards an Ontology-Based Phenotypic Query Model. Applied Sciences. 2022 May 21;12(10):5214. https://doi.org/10.3390/app12105214.