Merged
2 changes: 1 addition & 1 deletion README.md
@@ -352,7 +352,7 @@ To run a classifier with an additional category or remove an existing one, a cor

## Metadata Support

-SOMEF supports the extraction and analysis of metadata in package files of several programming languages. Current support includes: `setup.py` and `pyproject.toml` for Python, `pom.xml` for Java, `.gemspec` for Ruby, `DESCRIPTION` for R, `bower.json` for JavaScript, HTML or CSS, `.cabal` for Haskell, `cargo.toml` for RUST, `composer` for PHP, `.juliaProject.toml` for Julia , `AUTHORS`, `codemeta.json`, and `citation.cff`
+SOMEF supports the extraction and analysis of metadata in package files of several programming languages. Current support includes: `setup.py` and `pyproject.toml` for Python, `pom.xml` for Java, `.gemspec` for Ruby, `DESCRIPTION` for R, `bower.json` for JavaScript, HTML or CSS, `.cabal` for Haskell, `cargo.toml` for RUST, `composer` for PHP, `.juliaProject.toml` for Julia , `AUTHORS`, `codemeta.json`, `publiccode.yml`, `dockerfile` and `citation.cff`
This includes identifying dependencies, runtime requirements, and development tools specified in project configuration files.

## Limitations
1 change: 1 addition & 0 deletions docs/output.md
@@ -341,6 +341,7 @@ The following formats for a result value are currently recognized:
- `readthedocs`: documentation format used by many repositories in order to describe their projects.
- `wiki`: documentation format used in GitHub repositories.
- `setup.py`: package file format used in python projects.
+- `publiccode.yml`: metadata file used to describe public sector software projects.
- `pyproject.toml`: package file format used in python projects.
- `pom.xml`: package file used in Java projects.
- `package.json`: package file used in Javascript projects.
133 changes: 133 additions & 0 deletions docs/publiccode.md
@@ -0,0 +1,133 @@
The following metadata fields can be extracted from a publiccode.yml file.
These fields are defined in the [PublicCode specification](https://yml.publiccode.tools/), currently at version **0.5.0**, and are mapped according to the [CodeMeta crosswalk for publiccode.yml](https://codemeta.github.io/crosswalk/publiccode/) and the [CSV version of that crosswalk](https://github.com/codemeta/codemeta/blob/master/crosswalks/publiccode.csv).

| Software metadata category | SOMEF metadata JSON path | PUBLICCODE.YML metadata file field |
|-----------------------------|---------------------------------------|----------------------------------------|
| application_domain | application_domain[i].result.value | categories or description.[lang].genericName *(1)* |
| code_repository | code_repository[i].result.value | url |
| date_published | date_published[i].result.value | releaseDate |
| date_updated | date_updated[i].result.value | releaseDate |
| description | description[i].result.value | description.[lang].shortDescription or description.[lang].longDescription *(2)* |
| development_status | development_status[i].result.value | developmentStatus |
| has_package_file | has_package_file[i].result.value | URL of the publiccode.yml file |
| keywords | keywords[i].result.value | description.[lang].features *(3)* |
| license - value | license[i].result.value | legal.license *(4)* |
| license - spdx id           | license[i].result.spdx_id              | SPDX id extracted from legal.license *(4)* |
| license - name              | license[i].result.name                 | license name extracted from legal.license *(4)* |
| name | name[i].result.value | name or description.[lang].localisedName *(5)* |
| requirements - value | requirements[i].result.value | dependsOn.open / dependsOn.proprietary / dependsOn.hardware name + version *(6)* |
| requirements - name | requirements[i].result.name | dependsOn.open / dependsOn.proprietary / dependsOn.hardware name *(6)* |
| requirements - version      | requirements[i].result.version         | dependsOn.open / dependsOn.proprietary / dependsOn.hardware version, versionMin or versionMax *(6)* |
| runtime_platform | runtime_platform[i].result.value | platforms |
| version | version[i].result.value | softwareVersion |

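As a rough illustration of the table above, the following Python sketch maps an already parsed `publiccode.yml` (represented as a plain dictionary) to SOMEF-style categories. It is not SOMEF's actual parser: the function name `map_publiccode`, the `DIRECT_FIELDS` table and the example values (including the repository URL) are hypothetical. It covers only a subset of the rows and does not handle the `genericName`, `longDescription` or `localisedName` alternatives.

```python
# Hypothetical sketch, not SOMEF's implementation: map a parsed
# publiccode.yml (a plain dict) to SOMEF-style categories.

# One-to-one mappings taken from the table above.
DIRECT_FIELDS = {
    "url": ["code_repository"],
    "releaseDate": ["date_published", "date_updated"],
    "developmentStatus": ["development_status"],
    "softwareVersion": ["version"],
}


def map_publiccode(data):
    """Return {category: [{"result": {"value": ...}}, ...]}."""
    output = {}

    def add(category, value):
        output.setdefault(category, []).append({"result": {"value": value}})

    # Scalar fields: one entry per target category.
    for field, categories in DIRECT_FIELDS.items():
        if field in data:
            for category in categories:
                add(category, data[field])

    # List-valued fields: one entry per item.
    for platform in data.get("platforms", []):
        add("runtime_platform", platform)
    for category in data.get("categories", []):
        add("application_domain", category)

    # Language-specific fields live under description.<lang>.
    for lang_block in data.get("description", {}).values():
        if "shortDescription" in lang_block:
            add("description", lang_block["shortDescription"])
        for feature in lang_block.get("features", []):
            add("keywords", feature)

    return output


# Made-up input that mirrors the structure of a publiccode.yml file.
example = {
    "url": "https://example.org/repo.git",
    "releaseDate": "2024-01-01",
    "softwareVersion": "1.0.0",
    "platforms": ["web"],
    "categories": ["data-collection", "it-development"],
    "description": {
        "en": {
            "shortDescription": "API to manage objects",
            "features": ["Objects API"],
        }
    },
}
mapped = map_publiccode(example)
print(sorted(mapped))
```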
---

*(1)*
- Example:
```
categories:
  - data-collection
  - it-development
```
or
```
description:
  nl:
    genericName: API component
```

*(2)*
- Example:
```
description:
  nl:
    shortDescription: API voor het beheren van objecten
    longDescription: >
      De **Objecten API** heeft als doel om uiteenlopende objecten eenvoudig te kunnen
      registreren en ontsluiten in een gestandaardiseerd formaat. De Objecten API kan
      ....`_.

  en:
    shortDescription: API to manage objects
    longDescription: >
      The **Objects API** aims to easily store various objects and make them available in
      standardized format. The Objects API can be used by any organization to manage
      relevant objects. An organization can also choose to use the Objects API to
      ....`_.
```

*(3)*
- Example:
```
description:
  nl:
    features:
      - Objecten API
      - Minimalistische objecten beheerinterface
  en:
    features:
      - Objects API
      - Minimalistic object management interface
```


*(4)*
- The `legal.license` value is looked up in a local dictionary that maps license references to their SPDX id and name.
- Example:
```
legal:
  license: AGPL-3.0-or-later
  mainCopyrightOwner: City of Chicago
  repoOwner: City of Chicago
```
- Result:
```
'result':
    {'value': 'AGPL-3.0-or-later',
     'spdx_id': 'AGPL-3.0',
     'name': 'GNU Affero General Public License v3.0',
     'type': 'License'},
```
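The license lookup above could be sketched as follows. This is only an illustration under stated assumptions: the `LICENSES` dictionary is a tiny hypothetical stand-in for the local dictionary, and stripping the `-or-later` / `-only` suffix before the lookup is an assumed matching strategy, chosen because it reproduces the `AGPL-3.0` result shown above. The name `resolve_license` is also hypothetical.

```python
# Hypothetical stand-in for the local license dictionary (SPDX id -> name).
LICENSES = {
    "AGPL-3.0": "GNU Affero General Public License v3.0",
    "MIT": "MIT License",
}

# Assumed strategy: drop these SPDX suffixes before looking up the id.
SUFFIXES = ("-or-later", "-only")


def resolve_license(expression):
    """Build a SOMEF-style license result from a legal.license value."""
    result = {"value": expression, "type": "License"}
    candidate = expression.strip()
    for suffix in SUFFIXES:
        if candidate.endswith(suffix):
            candidate = candidate[: -len(suffix)]
            break
    # Only add spdx_id and name when the dictionary knows the license.
    if candidate in LICENSES:
        result["spdx_id"] = candidate
        result["name"] = LICENSES[candidate]
    return result


agpl = resolve_license("AGPL-3.0-or-later")
print(agpl)
```

When the value is not found in the dictionary, the sketch keeps only `value` and `type`, so the original expression is never lost.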

*(5)*
- Example:
`name: Medusa`
or
```
description:
  en:
    localisedName: Medusa
```


*(6)*
- Examples:
```
dependsOn:
  open:
    - name: Objecttypes API
      optional: false
      version: '1.0'
    - name: MySQL
      versionMin: "1.1"
      versionMax: "1.3"
      optional: true
    - name: PostgreSQL
      versionMin: "14.0"
      optional: true
```

- Result for PostgreSQL:
```
"result": {
    "value": "PostgreSQL>=14.0",
    "name": "PostgreSQL",
    "version": ">=14.0",
    "type": "Software_application",
    "dependency_type": "runtime"
},
```
```
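A minimal sketch of how a `dependsOn` entry could be turned into the result above is shown below. Only the `>=` form for `versionMin` is confirmed by the PostgreSQL example. The `==` prefix for an exact `version`, the `<=` prefix for `versionMax`, and the comma used to join a minimum and a maximum are assumptions made for illustration, as is the function name `build_requirement`.

```python
def build_requirement(dep):
    """Build a SOMEF-style requirement result from one dependsOn entry."""
    name = dep["name"]
    if "version" in dep:
        # Assumption: an exact version is rendered with "==".
        version = "==" + str(dep["version"])
    else:
        parts = []
        if "versionMin" in dep:
            parts.append(">=" + str(dep["versionMin"]))
        if "versionMax" in dep:
            # Assumption: the upper bound uses "<=" and is comma-joined.
            parts.append("<=" + str(dep["versionMax"]))
        version = ",".join(parts)

    result = {"value": name + version, "name": name}
    if version:
        result["version"] = version
    result["type"] = "Software_application"
    result["dependency_type"] = "runtime"
    return result


postgres = build_requirement(
    {"name": "PostgreSQL", "versionMin": "14.0", "optional": True}
)
mysql = build_requirement(
    {"name": "MySQL", "versionMin": "1.1", "versionMax": "1.3", "optional": True}
)
print(postgres["value"], mysql["value"])
```

An entry without any version label yields a result whose `value` is just the name and which has no `version` key.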


2 changes: 1 addition & 1 deletion docs/supported_languages.md
@@ -28,6 +28,6 @@ SoMEF also detects the following files to recognize build instructions, workflow
| Jupyter Notebook | `*.ipynb` | executable_example |
| Ontologies | `*.ttl`, `*.owl`, `*.nt`, `*.xml`, `*.jsonld` | ontologies |
| Shell | `*.sh` | has_script_file |
-| YAML | `*.yml`, `*.yaml` | continuous_integration, workflows
+| YAML | `*.yml`, `*.yaml` | continuous_integration, workflows, publiccode |


2 changes: 1 addition & 1 deletion docs/supported_metadata_files.md
@@ -25,7 +25,7 @@ SOMEF can extract metadata from a wide range of files commonly found in software
| `cargo.toml` | Rust | Manifest file serves as the package descriptor used in Rust projects | <div align="center">[🔍](./cargo.md)</div> | [📄](https://doc.rust-lang.org/cargo/reference/manifest.html)| |[Example](https://github.com/rust-lang/cargo/blob/master/Cargo.toml) |
| `*.cabal` | Haskell | Manifest file serving as the package descriptor for Haskell projects.| <div align="center">[🔍](./cabal.md)</div> | [📄](https://cabal.readthedocs.io/en/3.10/cabal-package.html)| |[Example](https://github.com/haskell/cabal/blob/master/Cabal/Cabal.cabal) |
| `dockerfile` | Dockerfile | Build specification file for container images that can include software metadata via LABEL instructions (OCI specification).| <div align="center">[🔍](./dockerfiledoc.md)</div> | [📄](https://docs.docker.com/reference/dockerfile/)| |[Example](https://github.com/FairwindsOps/nova/blob/master/Dockerfile) |

+| `publiccode.yml` | YAML | YAML metadata file for public sector software projects| <div align="center">[🔍](./publiccode.md)</div> | [📄](https://yml.publiccode.tools//)| |[Example](https://github.com/maykinmedia/objects-api/blob/master/publiccode.yaml) |

> **Note:** The general principles behind metadata mapping in SOMEF are based on the [CodeMeta crosswalk](https://github.com/codemeta/codemeta/blob/master/crosswalk.csv) and the [CodeMeta JSON-LD context](https://github.com/codemeta/codemeta/blob/master/codemeta.jsonld).
> However, each supported file type may have specific characteristics and field interpretations.