htcondor 24.2.2 broken #38

@nsmith-

As of htcondor 24.2.2 we get a failure to spool some files:

ERROR:lpcjobqueue.cluster:DCSchedd::spoolJobFiles:7002:File transfer failed for target job 73893203.0: TOOL at 131.225.190.225 failed to send file(s) to <131.225.189.168:9618>: |Error: sending file /uscms/home/ncsmith/x509up_u49040; SCHEDD at 131.225.189.168 - |Error: receiving file /storage/local/data1/condor/spool/3203/0/cluster73893203.proc0.subproc0.tmp/x509up_u49040
2024-12-11 21:15:17,379 - tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOMainLoop object at 0x7efe1aecf550>>, <Task finished name='Task-106' coro=<SpecCluster._correct_state_internal() done, defined at /usr/local/lib/python3.10/site-packages/distributed/deploy/spec.py:346> exception=AssertionError()>)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/tornado/ioloop.py", line 750, in _run_callback
    ret = callback()
  File "/usr/local/lib/python3.10/site-packages/tornado/ioloop.py", line 774, in _discard_future_result
    future.result()
  File "/usr/local/lib/python3.10/site-packages/distributed/deploy/spec.py", line 390, in _correct_state_internal
    await asyncio.gather(*worker_futs)
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 650, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/usr/local/lib/python3.10/site-packages/distributed/deploy/spec.py", line 75, in _
    assert self.status == Status.running
AssertionError
