nan values for 'study_duration' #2

@JohnYangSam

Description

I'm using the data from ClinicalTrials.gov's resources (e.g. https://classic.clinicaltrials.gov/AllPublicXML.zip).

It looks like the system is getting NaN for all the study_duration values.

I tried short-circuiting line 297 in panels/utils/processor.py if `row['StudyDuration'] is None or isnan(row['StudyDuration'])`, but the NaN is popping up for almost all the studies, so there may be an issue with parsing?
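For reference, the short-circuit described above could look roughly like the sketch below. The helper name `nan_to_none` and the sample `row` dict are hypothetical and are not taken from the Tri-AL code base. It also assumes the `study_duration` field accepts NULL, which I have not verified.

```python
import math


def nan_to_none(value):
    """Return None for missing values (None or float NaN), else the value unchanged."""
    if value is None:
        return None
    # numpy.float64 subclasses float, so NaN values coming out of pandas are covered too.
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


# Hypothetical row, shaped like the dict passed to data_mapper.
row = {"StudyDuration": float("nan")}
study_duration = nan_to_none(row.get("StudyDuration"))
```

This only hides the symptom. If nearly every study ends up with NaN, the parsing step that fills `StudyDuration` is the more likely place to look.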

Reproduction

  • Follow readme setup instructions
  • Run `python3 data_manager.py import -i data/AllPublicXML/`
  • Observe the following error trace:
0%| | 0/459487 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/fields/__init__.py", line 1823, in get_prep_value
return int(value)
ValueError: cannot convert float NaN to integer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/.../Tri-AL/data_manager.py", line 168, in <module>
_import(args.input)
File "/.../Tri-AL/data_manager.py", line 122, in _import
t = processor.data_mapper(row.to_dict(orient='index')[0])
File "/.../Tri-AL/panels/utils/processor.py", line 320, in data_mapper
t.save()
File "/.../Tri-AL/panels/models.py", line 249, in save
super(Trial, self).save(*args, **kwargs)
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/base.py", line 739, in save
self.save_base(using=using, force_insert=force_insert,
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/base.py", line 776, in save_base
updated = self._save_table(
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/base.py", line 881, in _save_table
results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/base.py", line 919, in _do_insert
return manager._insert(
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/query.py", line 1270, in _insert
return query.get_compiler(using=using).execute_sql(returning_fields)
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1415, in execute_sql
for sql, params in self.as_sql():
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1358, in as_sql
value_rows = [
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1359, in <listcomp>
[self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1359, in <listcomp>
[self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1300, in prepare_value
value = field.get_db_prep_save(value, connection=self.connection)
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/fields/__init__.py", line 842, in get_db_prep_save
return self.get_db_prep_value(value, connection=connection, prepared=False)
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/fields/__init__.py", line 837, in get_db_prep_value
value = self.get_prep_value(value)
File "/.../Tri-AL/venv/lib/python3.10/site-packages/django/db/models/fields/__init__.py", line 1825, in get_prep_value
raise e.__class__(
ValueError: Field 'study_duration' expected a number but got nan.
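
The first exception in the chain can be reproduced without Django or the data set: calling `int()` on a float NaN raises the same `ValueError`. The snippet below is a minimal sketch of that. The function name `int_or_error` is made up for illustration.

```python
def int_or_error(value):
    """Try the same int() conversion as the traceback; return the error text on failure."""
    try:
        return int(value)
    except ValueError as exc:
        return str(exc)


message = int_or_error(float("nan"))
print(message)
```

So any row whose duration reaches the model as NaN will fail at save time, whatever caused the NaN upstream.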

Separately, love the idea behind this project!
