check_workload_cpu_gpu
=== Check CPUs workload: ===

  - Run Linux '**top**' before submitting a new job, and look for a free or lightly loaded machine. For a 6-core machine, the load average should normally not exceed 6.
  - If your program consumes a lot of memory (over 10 GB), DON'T submit it more than once to a single machine.
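The check above can be scripted instead of reading '**top**' by eye. A minimal sketch, assuming a Linux machine where ''/proc/loadavg'' and ''nproc'' are available; the "busy" threshold of one load unit per core is the rule of thumb stated above, not a hard limit:

```shell
# Compare the 1-minute load average against the number of CPU cores.
load=$(awk '{print $1}' /proc/loadavg)
cores=$(nproc)
echo "load=${load} cores=${cores}"

# Treat the machine as free only while the load is below the core count
# (e.g. below 6 on a 6-core machine, as recommended above).
if awk -v l="$load" -v c="$cores" 'BEGIN { exit !(l < c) }'; then
    echo "machine looks free enough"
else
    echo "machine is loaded; pick another one"
fi
```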
=== Check GPUs workload: ===

  - To check a server equipped with GPUs, retrieve the GPU summary with "**nvidia-smi**". As long as the remaining memory meets your memory requirement, the GPU can be used.

In most machine learning frameworks, the first GPU is picked by default. TensorFlow, for example, will pre-allocate a chunk of memory on EVERY SINGLE GPU if you don't explicitly mask the unneeded ones. Masking can be done with, for example, "**setenv CUDA_VISIBLE_DEVICES 1**" if you only want to expose the second GPU (GPU indices are 0-based).
check_workload_cpu_gpu.1465228253.txt.gz · Last modified: 2016/06/06 15:50 by hj