thread_chunks.Chunker¶
- class thread_chunks.Chunker(func: Callable | None = None, actors: List[ObjectRef] | None = None, chunk_size: int = 2, progress_bar: bool = True, persistent: bool = False, path: str | None = None)[source]¶
Bases:
objectA collection of
RemoteLabelledActors pre-loaded with a function to execute that can be reused. Unlikechunk()this means that the function to parallelise only needs to be copied when the function is first executed in parallel and does not need to be copied to theRemoteLabelledActors every time.Notes
If
rayis not initialised whenChunkeris initialised thenChunkerwill initialiseray. In this case whenChunkeris deleted it willshutdownrayunless the attributepersistentis set toTrue.Attributes
The number of threads to launch simultaneously (a chunk).
The
RemoteLabelledActors theChunkerwill use to execute tasks.Whether to display a progress bar in the terminal.
Whether to allow the
rayinstance (if created by thisChunker) to persist after thisChunkeris deleted.The path of the checkpointing file associated with this
Chunker.Methods
The functions stored in each
RemoteLabelledActorinactorsis called in parallel, a chunk at a time, for each set of*argsin parameters and returns the outputs in an orderedlist.Initialises a
Chunker.- __call__(parameters: List[List[Any]], func: Callable | None = None, progress_bar: bool | None = None, path: str | None = None) List[Any][source]¶
The functions stored in each
RemoteLabelledActorinactorsis called in parallel, a chunk at a time, for each set of*argsin parameters and returns the outputs in an orderedlist. As only chunk_size threads are running func at any given time memory intensive operations will not exhaust the RAM capacity.Alternatively, func can be passed to override the functions stored in each
RemoteLabelledActorofactors.- Parameters:
parameters (List[List[Any]]) – A list of
*argsto pass to the function to be executed in parallel.func (Callable, optional) – The function to be executed in parallel. If
Nonethen the function stored in eachRemoteLabelledActorinactorsis used. Else the functions stored in theactorsare replaced with func. By defaultNone.progress_bar (bool, optional) – Whether to display a progress bar in the terminal. If
Noneis passed thenprogress_baris used. By defaultNone.path (str, optional) – The path to save the checkpoints to. By default
None.
- Returns:
A List of the outputs of func. The
ith element corresponds to theith element of parameters [func(*parameters[i])].- Return type:
List[Any]
- Warns:
CheckpointFailedWarning – The checkpointing file has no saved argument values. Proceeding and continuing to save data, but you will not be able to pick up from this checkpoint.
CheckpointFailedWarning – There was an error while saving the data.
- __init__(func: Callable | None = None, actors: List[ObjectRef] | None = None, chunk_size: int = 2, progress_bar: bool = True, persistent: bool = False, path: str | None = None)[source]¶
Initialises a
Chunker.- Parameters:
func (Callable, optional) – The function to be executed in parallel. Note func and actors cannot be passed together. By default
None.actors (List[ObjectRef], optional) – A list of remote
RemoteLabelledActors. If actors isNonethen func will be used to generate a set of chunk_sizeRemoteLabelledActors. Note func and actors cannot be passed together. By defaultNone.chunk_size (int) – The number of threads to launch simultaneously (a chunk). By default
config.default_chunk_size.progress_bar (bool) – Whether to display a progress bar in the terminal. By default
config.progress_bar.persistent (bool) – Whether to allow the ray instance (if created by this
Chunker) to persist after thisChunkeris deleted. By defaultFalse.path (str, optional) – The path of the checkpointing file associated with this
Chunker. By defaultNone.
- Raises:
ValueError – “Either func or actors must be specified but not both.”
ValueError – “chunk_size does not agree with the number of actors. Note that
len(actors)must equal chunk_size.”
- actors: List[ObjectRef]¶
The
RemoteLabelledActors theChunkerwill use to execute tasks.
- property chunk_size: int¶
The number of threads to launch simultaneously (a chunk).
- persistent: bool¶
Whether to allow the
rayinstance (if created by thisChunker) to persist after thisChunkeris deleted.
- progress_bar: bool¶
Whether to display a progress bar in the terminal.