thread_chunks.chunk

thread_chunks.chunk(func: Callable | None, parameters: List[List[Any]], actors: List[ObjectRef] | None = None, chunk_size: int = 2, progress_bar: bool = True, path: str | None = None) List[Any][source]

Calls func in parallel, a chunk at a time, for each set of *args in parameters and returns the outputs in an ordered list. As only chunk_size threads are running func at any given time memory intensive operations will not exhaust the RAM capacity.

Alternatively, actors can be passed a list of RemoteLabelledActor if func is passed None and the functions stored in each RemoteLabelledActor will be used.

Parameters:
  • func (Optional[Callable]) – The function to parallelize. If None then the function saved in the actors is used.

  • parameters (List[List[Any]]) – A list of *args to be passed to func.

  • actors (Optional[List[ObjectRef]]) – A list of ray actors of type RemoteLabelledActor to use for the computation. By default None.

  • chunk_size (int, optional) – The number of threads to launch simultaneously (a chunk). By default config.default_chunk_size.

  • progress_bar (bool, optional) – Whether to display a progress bar in the terminal. By default config.progress_bar.

  • path (str, optional) – If specified then after each new data point a checkpoint is saved to the path. If the checkpoint file exists when chunk() is called then chunk() will pick up from the checkpoint.

Returns:

A List of the outputs of func. The ith element corresponds to the ith element of parameters [func(*parameters[i])].

Return type:

List[Any]

Raises:
  • ValueError – “func and actors cannot both be None.”

  • ValueError – “chunk_size does not agree with the number of actors. Note that len(actors) must equal chunk_size.”

Warns:
  • CheckpointFailedWarning – The checkpointing file has no saved argument values. Proceeding and continuing to save data, but you will not be able to pick up from this checkpoint.

  • CheckpointFailedWarning – There was an error while saving the data.