Inference Time

#287
by XxLOLxX - opened

Can anyone help make the inference time faster? I have a VM with 4 T4 GPUs.
My script passes to the model:
1. Original query
2. User's question
3. 10 instructions to follow

The inference time varies a lot: some requests take 7-10 seconds,
others may take 50-60 seconds.

Any ideas on how I can use the 4 GPUs to get a faster response?
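
For reference, here is a minimal sketch of one common way to spread a model across all visible GPUs with Transformers + Accelerate's `device_map="auto"`. It assumes a causal LM checkpoint; the model name is a placeholder, not the model from this thread:

```python
# Minimal sketch: shard one model across all visible GPUs using
# Transformers + Accelerate's device_map="auto".
# NOTE: "your-model-checkpoint" is a placeholder, not a specific model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "your-model-checkpoint"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",          # splits layers across the 4 T4s
    torch_dtype=torch.float16,  # fp16 halves memory use and is faster on T4s
)

# Prompt = original query + user's question + the 10 instructions.
prompt = "..."  # assembled prompt goes here
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Worth noting: `device_map="auto"` mainly helps when the model is too big for a single T4; if it fits on one GPU, running an independent replica per GPU and spreading requests across them usually does more for latency. The large variance between requests is typically driven by how many tokens get generated, so capping `max_new_tokens` is another lever.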
