IAC
Revision as of 13:41, 1 February 2008
IAC stands for "Instituto de Astrofísica de Canarias", which hosts the LaPalma supercomputer. LaPalma comprises 256 JS20 compute nodes (blades) and 4 p510 servers. Each blade runs Linux and has two 2.2 GHz processors, 4 GB of RAM, and 40 GB of local disk storage. Together, the servers provide 14 TB of disk storage, accessible from every blade through GPFS (General Parallel File System).
The networks that interconnect LaPalma are:
- Myrinet network: high-bandwidth network used for communication between parallel applications.
- Gigabit network: Ethernet network over which the blades remotely mount their root file system from the servers, and over which GPFS works.
Access
There are 3 login blades:
ssh username@lapalma1.iac.es
ssh username@lapalma2.iac.es
ssh username@lapalma3.iac.es
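If one login blade does not respond, the others can be tried in turn. The following is a minimal sketch of that idea, not a site-provided tool: the function name first_reachable_blade and the SSH_CMD override variable are hypothetical, and the sketch assumes key-based authentication is set up, because BatchMode=yes makes ssh fail instead of prompting for a password.

```bash
#!/bin/bash
# Sketch: try each login blade in turn and print the first one that accepts
# a connection. SSH_CMD is a hypothetical hook (default: ssh) that lets the
# loop be exercised without network access.
first_reachable_blade() {
  local user="$1" host
  for host in lapalma1.iac.es lapalma2.iac.es lapalma3.iac.es; do
    # BatchMode=yes disables password prompts; ConnectTimeout=5 bounds the
    # wait per host to five seconds. The remote command "true" does nothing.
    if "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true 2>/dev/null; then
      printf '%s\n' "$host"
      return 0
    fi
  done
  return 1   # no blade accepted the connection
}
```

A possible use is `ssh "username@$(first_reachable_blade username)"`.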
User details
Appendix G: Commands Overview

Command        Description
checkjob       provide detailed status report for specified job
checknode      provide detailed status report for specified node
mcredctl       controls various aspects about the credential objects within Moab
mdiag          provide diagnostic reports for resources, workload, and scheduling
mjobctl        control and modify job
mnodectl       control and modify nodes
mrmctl         query and control resource managers
mrsvctl        create, control and modify reservations
mschedctl      modify scheduler state and behavior
mshow          displays various diagnostic messages about the system and job queues
mshow -a       query and show available system resources
msub           scheduler job submission
resetstats     reset scheduler statistics
setspri        adjust job/system priority of job
showbf         show current resource availability
showq          show queued jobs
showres        show existing reservations
showstart      show estimates of when job can/will start
showstate      show current state of resources
showstats      show usage statistics
showstats -f   show various tables of scheduling/system performance

Commands Providing Maui Compatibility

Command        Description
canceljob      cancel job
changeparam    change in memory parameter settings
diagnose       provide diagnostic report for various aspects of resources, workload, and scheduling
releasehold    release job defers and holds
releaseres     release reservations
runjob         force a job to run immediately
sethold        set job holds
setqos         modify job QOS settings
setres         set an admin/user reservation
showconfig     show current scheduler configuration
Queues
Slurm + MOAB (essential information collected from the User's Guide)
% mnsubmit <job_script>
Submits a job script to the queue system (see the script example below).
% mnq
Shows all submitted jobs.
% mncancel <job_id>
Removes the user's job from the queue system, cancelling its processes if they are already running.
% checkjob <job_id>
Shows detailed information about a specific job, including the assigned nodes and any reasons preventing the job from running.
% mnstart <job_id>
Shows the estimated time at which the specified job will be executed.
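When following up on a single job, checkjob and mnstart are often run together. The sketch below wraps them in one helper. The name job_report is hypothetical and not part of the queue system, and the job id must be supplied by hand because the output format of mnsubmit is not described here. It also assumes that both commands are on the PATH of a login blade and that checkjob exits with a non-zero status when it fails.

```bash
#!/bin/bash
# Sketch: print the status and the estimated start time of one job.
# job_report is a hypothetical helper name, not a LaPalma command.
job_report() {
  local job_id="$1"
  if [ -z "$job_id" ]; then
    echo "usage: job_report <job_id>" >&2
    return 2
  fi
  echo "== status of job $job_id =="
  # Assigned nodes and reasons preventing the job from running.
  checkjob "$job_id" || return 1
  echo "== estimated start of job $job_id =="
  mnstart "$job_id"
}
```

It would be called as `job_report <job_id>`, with the id taken from the mnq listing.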
Available Programs
Submission script
#!/bin/bash
# @ job_name = job
# @ initialdir = .
# @ output = OUTPUT/mpi_%j.out
# @ error = OUTPUT/mpi_%j.err
# @ total_tasks = 64
# @ wall_clock_limit = 12:00:00
srun /gpfs/apps/NWCHEM/4.7/bin/LINUX64_POWERPC/nwchem job.nw >& job.out
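Jobs that differ only in name, task count, or time limit can be generated from a template rather than edited by hand. The following is a sketch under stated assumptions: make_job_script, its argument order, and the file name job.cmd are hypothetical, and only the directives that appear in the script above are used.

```bash
#!/bin/bash
# Sketch: print a job script in the "# @" directive format shown above.
# make_job_script and its argument order are hypothetical.
# Arguments: job name, number of tasks, wall-clock limit, then the command
# line to launch with srun.
make_job_script() {
  local name="$1" tasks="$2" walltime="$3"
  shift 3
  # Unquoted heredoc: shell variables are expanded, %j is left as-is.
  cat <<EOF
#!/bin/bash
# @ job_name = $name
# @ initialdir = .
# @ output = OUTPUT/mpi_%j.out
# @ error = OUTPUT/mpi_%j.err
# @ total_tasks = $tasks
# @ wall_clock_limit = $walltime
srun $*
EOF
}
```

For example, `make_job_script job 64 12:00:00 /gpfs/apps/NWCHEM/4.7/bin/LINUX64_POWERPC/nwchem job.nw > job.cmd` writes a script equivalent to the one above (without the output redirection), which can then be submitted with `mnsubmit job.cmd`.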