Issue with doing multiple time series on Databricks #22

@choward456

I am trying to compute features for a large number of time series, but I keep running into the error below. For example, when I run over 12,000 time series, the first 9,000 or so work fine, but then the loop breaks and the kernel dies. I tried running only the last 3,000, but the error still appears. Every time series in the last 3,000 works if I run the groupby and apply my method one series at a time. The issue only appears when I put it in a for loop, which gets through a few of the time series before the error appears. I have also tried different cluster setups with varying sizes and numbers of workers, and the error still occurs. Any help would be greatly appreciated. Thanks!

features = []  # collects one features DataFrame per time series
unique_ids = issues_df['time_series_idx'].unique()  # computed once, not on every iteration
for ts_id in unique_ids:
  reduced_df = issues_df[issues_df['time_series_idx'] == ts_id]
  features_df = reduced_df.groupby(['run_id']).apply(catch_24) #works by itself when I do one time series at a time
  features.append(features_df)


Fatal error: The Python kernel is unresponsive.
---------------------------------------------------------------------------
The Python process exited with exit code 139 (SIGSEGV: Segmentation fault).



The last 10 KB of the process's stderr and stdout can be found below. See driver logs for full logs.
---------------------------------------------------------------------------
Last messages on stderr:
y", line 1016 in _bootstrap_inner
  File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x00007fb746ffe640 (most recent call first):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 114 in worker
  File "/usr/lib/python3.10/threading.py", line 953 in run
  File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
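
Exit code 139 follows the shell convention of 128 plus the signal number, and signal 11 is SIGSEGV. A segfault in native code takes the whole Python process down with it, so one way to find out which series triggers it might be to run each series in its own child interpreter and record which ones die. The following is only a minimal sketch under stated assumptions, not something tried in this report: it assumes a POSIX system, `CHILD_CODE` and `run_isolated` are hypothetical names, `sum()` stands in for the real feature extraction, and the `crash` flag deliberately raises SIGSEGV to imitate the failure.

```python
import json
import signal
import subprocess
import sys

# Hypothetical child program: reads one series as JSON on stdin and prints a
# JSON result. sum() is only a stand-in for the real feature extraction, and
# the "crash" flag deliberately raises SIGSEGV to imitate the failure.
CHILD_CODE = """
import json, os, signal, sys
payload = json.load(sys.stdin)
if payload.get("crash"):
    os.kill(os.getpid(), signal.SIGSEGV)
print(json.dumps({"total": sum(payload["values"])}))
"""


def run_isolated(payload):
    """Run one series in a fresh interpreter and return (result, error)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by
        # that signal; the parent process keeps running.
        return None, signal.Signals(-proc.returncode).name
    if proc.returncode != 0:
        return None, f"exit code {proc.returncode}"
    return json.loads(proc.stdout), None


# Made-up data: series "b" is the one that crashes.
series = {
    "a": {"values": [1, 2, 3]},
    "b": {"values": [4, 5], "crash": True},
    "c": {"values": [6]},
}

results, failed = {}, {}
for key, payload in series.items():
    result, error = run_isolated(payload)
    if error is None:
        results[key] = result
    else:
        failed[key] = error

print("ok:", sorted(results), "failed:", failed)
```

Starting one interpreter per series is slow, so this is better suited to narrowing down a failing subset (for example the last 3,000 series) than to the full run.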
