IAC: Difference between revisions
Revision as of 13:43, 1 February 2008
IAC stands for "Instituto de Astrofísica de Canarias", which hosts the LaPalma supercomputer. LaPalma comprises 256 JS20 compute nodes (blades) and 4 p510 servers. Every blade runs Linux and has two 2.2 GHz processors, 4 GB of RAM, and 40 GB of local disk storage. Together, the servers provide 14 TB of disk storage, accessible from every blade through GPFS (General Parallel File System).
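The per-blade figures above imply the aggregate capacity of the compute partition. The following is a minimal sketch of that arithmetic in bash, assuming all 256 blades are identical and counting only the blades, not the p510 servers. The variable names are illustrative.

```shell
#!/bin/bash
# Figures taken from the description above; variable names are illustrative.
blades=256
procs_per_blade=2
mem_per_blade_gb=4
disk_per_blade_gb=40

# Aggregate capacity over all blades (p510 servers not included).
total_procs=$(( blades * procs_per_blade ))
total_mem_gb=$(( blades * mem_per_blade_gb ))
total_local_disk_gb=$(( blades * disk_per_blade_gb ))

echo "processors: ${total_procs}"
echo "memory: ${total_mem_gb} GB"
echo "local disk: ${total_local_disk_gb} GB"
```

Under these assumptions the blades provide 512 processors, 1024 GB of RAM, and 10240 GB of local disk in total.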
The networks that interconnect LaPalma are:
- Myrinet Network: high-bandwidth network used by parallel applications for communication.
- Gigabit Network: Ethernet network over which the blades remotely mount their root file system from the servers, and over which GPFS works.
Access
There are 3 login blades:
ssh username@lapalma1.iac.es
ssh username@lapalma2.iac.es
ssh username@lapalma3.iac.es
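Since any of the three login blades can be used, a small helper can pick the first one that responds. This is only a sketch: the function name `first_reachable_login` is hypothetical, and the probe assumes key-based authentication is set up, because `BatchMode=yes` makes `ssh` fail rather than prompt for a password.

```shell
#!/bin/bash
# Sketch: return the first login blade that answers an SSH probe.
# Host names come from the list above; the function name is hypothetical.

first_reachable_login() {
    local user="$1" n host
    for n in 1 2 3; do
        host="lapalma${n}.iac.es"
        # BatchMode avoids password prompts; ConnectTimeout bounds the wait.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "${user}@${host}" true 2>/dev/null; then
            echo "${host}"
            return 0
        fi
    done
    # No login blade answered the probe.
    return 1
}

# Usage (assumes key-based authentication for the probe):
#   host=$(first_reachable_login username) && ssh "username@${host}"
```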
User details
MOAB Commands
- checkjob: provide detailed status report for specified job
- checknode: provide detailed status report for specified node
- mcredctl: control various aspects of the credential objects within Moab
- mdiag: provide diagnostic reports for resources, workload, and scheduling
- mjobctl: control and modify jobs
- mnodectl: control and modify nodes
- mrmctl: query and control resource managers
- mrsvctl: create, control, and modify reservations
- mschedctl: modify scheduler state and behavior
- mshow: display various diagnostic messages about the system and job queues
- mshow -a: query and show available system resources
- msub: scheduler job submission
- resetstats: reset scheduler statistics
- setspri: adjust job/system priority of job
- showbf: show current resource availability
- showq: show queued jobs
- showres: show existing reservations
- showstart: show estimates of when job can/will start
- showstate: show current state of resources
- showstats: show usage statistics
- showstats -f: show various tables of scheduling/system performance
Commands Providing Maui Compatibility
- canceljob: cancel job
- changeparam: change in-memory parameter settings
- diagnose: provide diagnostic report for various aspects of resources, workload, and scheduling
- releasehold: release job defers and holds
- releaseres: release reservations
- runjob: force a job to run immediately
- sethold: set job holds
- setqos: modify job QOS settings
- setres: set an admin/user reservation
- showconfig: show current scheduler configuration
Queues
"Slurm + MOAB" (essential information collected from the User's Guide)
% mnsubmit <job_script>
submits a “job script” to the queue system (see the example script below).
% mnq
shows all the submitted jobs.
% mncancel <job_id>
removes one of the user's own jobs from the queue system, cancelling its processes if they were already running.
% checkjob <job_id>
shows detailed information about a specific job, including the assigned nodes and the possible reasons preventing the job from running.
% mnstart <job_id>
shows the estimated time at which the specified job will start running.
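The commands above can be combined into a small wrapper. The following is a minimal sketch, not an official tool: `submit_and_list` is a hypothetical helper that checks the job script exists, submits it with `mnsubmit`, and then lists the queue with `mnq`, calling both exactly as described above.

```shell
#!/bin/bash
# Sketch: submit a job script and then list the queue.
# submit_and_list is a hypothetical helper; mnsubmit and mnq are the
# commands described above, called with the arguments shown there.

submit_and_list() {
    local script="$1"
    if [ ! -f "${script}" ]; then
        echo "job script not found: ${script}" >&2
        return 1
    fi
    # Stop if the submission itself fails.
    mnsubmit "${script}" || return 1
    mnq
}

# Usage:
#   submit_and_list job.cmd
```

The job id needed by `mncancel`, `checkjob`, and `mnstart` can then be read from the queue listing.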
Available Programs
Submission script
#!/bin/bash
# @ job_name = job
# @ initialdir = .
# @ output = OUTPUT/mpi_%j.out
# @ error = OUTPUT/mpi_%j.err
# @ total_tasks = 64
# @ wall_clock_limit = 12:00:00
srun /gpfs/apps/NWCHEM/4.7/bin/LINUX64_POWERPC/nwchem job.nw >& job.out
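To adapt the script above to other programs, the directives can be generated from a few parameters. The following is a sketch under stated assumptions: `make_job_script` is a hypothetical helper, `./my_program` and `test.cmd` are placeholder names, and only the directives that appear in the example above are used.

```shell
#!/bin/bash
# Sketch: print a job script with the same directives as the example above.
# make_job_script is a hypothetical helper; its arguments are
#   $1 job name, $2 total tasks, $3 wall clock limit (HH:MM:SS), $4 command line.

make_job_script() {
    local name="$1" tasks="$2" limit="$3" cmd="$4"
    cat <<EOF
#!/bin/bash
# @ job_name = ${name}
# @ initialdir = .
# @ output = OUTPUT/mpi_%j.out
# @ error = OUTPUT/mpi_%j.err
# @ total_tasks = ${tasks}
# @ wall_clock_limit = ${limit}
srun ${cmd}
EOF
}

# Usage: generate a 32-task, 2-hour job for a placeholder program.
# Creating OUTPUT first is a precaution, assuming the queue system
# does not create the directory itself.
#   mkdir -p OUTPUT
#   make_job_script test 32 02:00:00 "./my_program" > test.cmd
#   mnsubmit test.cmd
```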