Mace fixes
- Bugfix in batching logic for infer_only_scheduler
- A quick Python script for summarizing the action log
- Fixed a memory leak in the controller networking code
- Tweaked worker mempool
Controllers need to be careful about freeing input and output memory of `InferRequest` (client), `InferResponse` (client), `Infer` (worker), and `InferResult` (worker) in the following ways:
- `InferRequest` from client - the controller network code will allocate memory for `input`. The controller needs to delete `input` when done.
- `InferResponse` to client - the controller network code assumes that your controller allocated memory for `output`, and will delete `output` after sending has completed.
- `Infer` to worker - the controller network code does not allocate or free any memory. You can safely set `Infer::input = InferRequest::input`.
- `InferResult` from worker - the controller network code will allocate memory for `output`. You can safely set `InferResponse::output = InferResult::output`.
- Batching - if you batch `InferRequest` then your controller will have to allocate memory for the batch. This means you need to pay attention to the following:
  - Batching `Infer` - you will have to allocate memory for `input`. You should wait until you receive an `InferResult` before freeing the `input` of the original `Infer`.
  - Batching `InferResponse` - you will have to allocate memory for each request's `output`. As described above, this will be automatically deleted by the controller network code once the response has been sent to the client.
  - Batching `InferResult` - after copying the batched output to each individual `InferResponse`, you will need to free the `InferResult::output`.
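
The ownership rules above can be sketched roughly as follows. This is only an illustration, not the actual controller code. The struct layouts, the `*_size` fields, and the helper names (`make_infer`, `make_response`, `make_batched_infer`, `split_batched_result`) are hypothetical. It also assumes buffers are allocated with `new char[]` and released with `delete[]`, and that a batched output splits into equal-sized slices per request.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical, simplified stand-ins for the real message types.
struct InferRequest  { size_t input_size;  char* input;  };
struct InferResponse { size_t output_size; char* output; };
struct Infer         { size_t input_size;  char* input;  };
struct InferResult   { size_t output_size; char* output; };

// Unbatched, to worker: no copy, Infer::input = InferRequest::input.
Infer make_infer(const InferRequest& request) {
    return Infer{request.input_size, request.input};
}

// Unbatched, to client: InferResponse::output = InferResult::output.
// The network code is expected to delete the response output after sending,
// so the controller only has to delete the request input.
InferResponse make_response(InferRequest& request, InferResult& result) {
    InferResponse response{result.output_size, result.output};
    result.output = nullptr;   // ownership moved to the response
    delete[] request.input;    // controller is done with the input
    request.input = nullptr;
    return response;
}

// Batched, to worker: the controller allocates the batched input itself.
// Each request's input is deleted once it has been copied into the batch.
Infer make_batched_infer(std::vector<InferRequest>& requests) {
    size_t total = 0;
    for (const InferRequest& r : requests) total += r.input_size;

    Infer infer{total, new char[total]};
    size_t offset = 0;
    for (InferRequest& r : requests) {
        std::memcpy(infer.input + offset, r.input, r.input_size);
        offset += r.input_size;
        delete[] r.input;
        r.input = nullptr;
    }
    return infer;
}

// Batched, from worker: allocate one output per response, copy its slice,
// then free both the batched InferResult::output and the batched Infer::input
// (the latter only now, because the InferResult has been received).
std::vector<InferResponse> split_batched_result(Infer& infer,
                                                InferResult& result,
                                                size_t batch_size) {
    assert(batch_size > 0 && result.output_size % batch_size == 0);
    const size_t each = result.output_size / batch_size;

    std::vector<InferResponse> responses;
    for (size_t i = 0; i < batch_size; i++) {
        InferResponse response{each, new char[each]};
        std::memcpy(response.output, result.output + i * each, each);
        responses.push_back(response);  // network code deletes output after send
    }

    delete[] result.output;
    result.output = nullptr;
    delete[] infer.input;
    infer.input = nullptr;
    return responses;
}
```

In the unbatched path no buffer is ever copied, so each pointer has exactly one owner at a time. In the batched path the controller owns two extra allocations (the batched input and the per-response outputs) and is responsible for the first, while the network code releases the second after sending.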
Edited by Jonathan Mace