5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
__pycache__
*.csv
test.csv
output.csv
data/*
30 changes: 30 additions & 0 deletions RequirementsDesignNotes.md
@@ -0,0 +1,30 @@
# Design notes

## Requirements

### Extraction from XML
All `<P>` tags contained in `<body><sec>` are extracted, with the following exceptions:
- [x] `<P>` contained in a section with type `scope` or `terms`
- [ ] `<P>` contained in a section with id `sec_forword`
- [ ] ... ?
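
The extraction rule above could be sketched roughly as follows. This is only an illustration, not the script's actual code: it assumes the section type is stored in a `sec-type` attribute, that tag names may be written as `p` or `P`, and that only paragraphs that are direct children of a section (or of a nested section) are wanted. The function name `extract_paragraphs` is hypothetical, and `sec_forword` is spelled as in the list above.

```python
import xml.etree.ElementTree as ET

# Assumption: the section type lives in a "sec-type" attribute.
EXCLUDED_SEC_TYPES = {"scope", "terms"}
# Still an open item in the list above; id spelled as in the notes.
EXCLUDED_SEC_IDS = {"sec_forword"}


def extract_paragraphs(xml_text):
    """Return (section_id, paragraph_text) pairs for paragraphs in <body><sec>."""
    root = ET.fromstring(xml_text)
    body = root if root.tag == "body" else root.find(".//body")
    results = []
    if body is None:
        return results

    def walk(sec):
        # Skip an excluded section together with everything nested inside it.
        if sec.get("sec-type") in EXCLUDED_SEC_TYPES or sec.get("id") in EXCLUDED_SEC_IDS:
            return
        for child in sec:
            if child.tag.lower() == "p":
                # itertext() flattens inline markup such as <b> or <i>.
                text = "".join(child.itertext()).strip()
                results.append((sec.get("id"), text))
            elif child.tag == "sec":
                walk(child)

    for sec in body.findall("sec"):
        walk(sec)
    return results
```

Paragraphs nested in lists or tables are deliberately ignored in this sketch; whether they count as requirements is one of the open points above.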
### CSV format
The columns of the CSV format are the following:

`Req_UUID,Text,Standard,Section`

The CSV is UTF-8 encoded. The Req_UUID is unique for every requirement and is regenerated on every iteration (run) of the script. Within one iteration, however, it is used to link requirements to additional information sets (see the next section).
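
A minimal sketch of writing this format, assuming Python's standard `csv` and `uuid` modules and random (version 4) UUIDs; the notes do not say which UUID variant the script uses. The function name `write_requirements` and the sample values in the usage note are hypothetical.

```python
import csv
import uuid

REQ_COLUMNS = ["Req_UUID", "Text", "Standard", "Section"]


def write_requirements(rows, out):
    """Write (text, standard, section) tuples as CSV to the text stream `out`.

    Returns the generated Req_UUIDs in row order so that the same run can
    reuse them when writing the additional-information CSV.
    """
    writer = csv.writer(out)
    writer.writerow(REQ_COLUMNS)
    req_uuids = []
    for text, standard, section in rows:
        req_uuid = str(uuid.uuid4())  # fresh on every run, as described above
        writer.writerow([req_uuid, text, standard, section])
        req_uuids.append(req_uuid)
    return req_uuids
```

To get a UTF-8 file, the stream would be opened with `open("output.csv", "w", encoding="utf-8", newline="")`; `newline=""` is what the `csv` module documentation recommends for file objects.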

## Link Requirements with Additional Information such as Tables, etc.

### Extraction from XML
The following information elements are extracted:
- [ ] References to other norms.
- [ ] References to tables/numbered lists/unnumbered lists with an ID link.
- [ ] Tables/numbered lists/unnumbered lists in the same section as a paragraph, without any ID link pointing to them.
- [ ] ... ?

### CSV format
The format of the CSV output file is the following:

`Req_UUID,AdditionalInfo_Type,AdditionalInfo_UUID,StandardID,SectionID,ID,AdditionalInfo_Body`

References to norms only have the Req_UUID, AdditionalInfo_Type and StandardID filled in. The CSV is encoded in UTF-8. The Req_UUID matches the Req_UUID in the requirements CSV if and only if both CSV files are created in the same iteration.
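
The row layout described here might look like the following sketch. The `AdditionalInfo_Type` values (`norm_reference`, `table`) and the helper names are assumptions made for illustration; the notes do not define them. `csv.DictWriter` with `restval=""` leaves every column empty that a row does not set, which matches the rule for references to norms.

```python
import csv
import uuid

INFO_COLUMNS = [
    "Req_UUID", "AdditionalInfo_Type", "AdditionalInfo_UUID",
    "StandardID", "SectionID", "ID", "AdditionalInfo_Body",
]


def norm_reference_row(req_uuid, standard_id):
    """Reference to another norm: only three columns are filled in."""
    return {
        "Req_UUID": req_uuid,
        "AdditionalInfo_Type": "norm_reference",  # hypothetical type label
        "StandardID": standard_id,
    }


def info_row(req_uuid, info_type, standard_id, section_id, element_id, body):
    """Table or list linked to a requirement: all columns are filled in."""
    return {
        "Req_UUID": req_uuid,
        "AdditionalInfo_Type": info_type,
        "AdditionalInfo_UUID": str(uuid.uuid4()),
        "StandardID": standard_id,
        "SectionID": section_id,
        "ID": element_id,
        "AdditionalInfo_Body": body,
    }


def write_additional_info(rows, out):
    """Write the row dicts to the text stream `out`, header first."""
    writer = csv.DictWriter(out, fieldnames=INFO_COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
```

The `req_uuid` passed in has to come from the requirements CSV written in the same iteration; a UUID from an earlier run will not match anything.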