tmux
I found that if someone runs parallel computing on AWS and calls CUDA multiple times, the system can crash and the Python session terminates.
One possible solution is to use tmux. It lets the user keep a computing session running in the background, so that after logging out of AWS, he or she can log back in and reattach to the same session.
tmux new -s session-name                 # create a new session (note: -s names the new session, not -t)
tmux attach -t session-name              # reattach to an existing session
tmux kill-session -t session-name        # terminate a session
tmux new -d -s session-name "source web/bin/activate ; python3 main.py"   # start detached and run a command immediately
multiprocessing
import multiprocessing as mp

def f(x):
    return x * x

if __name__ == "__main__":      # guard is required when spawning worker processes
    print(mp.cpu_count())       # number of available CPU cores
    with mp.Pool(processes=4) as pool:
        print(pool.map(f, range(10)))
def chunks(lst, size):
    # split lst into consecutive chunks of at most `size` elements
    # (stepping by `size` avoids the empty trailing chunk the old
    # len(lst)//cores+1 version produced when len(lst) divides evenly)
    return [lst[i:i + size] for i in range(0, len(lst), size)]
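A chunking helper like this can feed a worker pool one chunk per task. A minimal sketch, where `process_chunk` and the chunk size of 3 are just illustrative assumptions:

```python
import multiprocessing as mp

def chunks(lst, size):
    # split lst into consecutive chunks of at most `size` elements
    return [lst[i:i + size] for i in range(0, len(lst), size)]

def process_chunk(chunk):
    # placeholder work: sum the elements of one chunk
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(10))
    with mp.Pool(processes=4) as pool:
        results = pool.map(process_chunk, chunks(data, 3))
    print(results)  # [3, 12, 21, 9]
```

Chunking cuts down the inter-process communication overhead compared with sending one element per task.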
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"   # expose only GPU 1 to this process; set before CUDA is initialized
model.to(device)   # move a model (or a tensor) to the chosen device
It is possible to load the data on the CPU, forward it through model 1 on GPU 1, and forward it through model 2 on GPU 2, by specifying a device for each step: device = 'cpu', 'cuda:0', 'cuda:1', and so on.
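A minimal PyTorch sketch of this pattern, falling back to the CPU when a GPU is absent (the Linear model and tensor shapes are arbitrary placeholders):

```python
import torch

# choose a device per model; fall back to CPU if that GPU does not exist
dev0 = torch.device("cuda:0" if torch.cuda.device_count() > 0 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")

model1 = torch.nn.Linear(4, 2).to(dev0)   # model 1 lives on device 0
model2 = torch.nn.Linear(4, 2).to(dev1)   # model 2 lives on device 1

x = torch.randn(8, 4)          # data loaded on the CPU
out1 = model1(x.to(dev0))      # forward pass on device 0
out2 = model2(x.to(dev1))      # forward pass on device 1
```

Each tensor must be moved to the same device as the model it is fed into, otherwise PyTorch raises a device-mismatch error.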