IAC


Revision as of 13:41, 1 February 2008

IAC stands for "Instituto de Astrofísica de Canarias", which hosts the LaPalma supercomputer. LaPalma comprises 256 JS20 compute nodes (blades) and 4 p510 servers. Every blade has two 2.2 GHz processors, 4 GB of RAM and 40 GB of local disk storage, and runs the Linux operating system. Together, the servers provide 14 TB of disk storage, accessible from every blade through GPFS (General Parallel File System).

The networks that interconnect LaPalma are:

- Myrinet Network: high-bandwidth network used by parallel applications for their communications.

- Gigabit Network: Ethernet network over which the blades remotely mount their root file system from the servers, and over which GPFS works.

Access

There are 3 login blades:

ssh username@lapalma1.iac.es

ssh username@lapalma2.iac.es

ssh username@lapalma3.iac.es

User details

Appendix G: Commands Overview

Command        Description
checkjob       provide detailed status report for specified job
checknode      provide detailed status report for specified node
mcredctl       controls various aspects about the credential objects within Moab
mdiag          provide diagnostic reports for resources, workload, and scheduling
mjobctl        control and modify job
mnodectl       control and modify nodes
mrmctl         query and control resource managers
mrsvctl        create, control and modify reservations
mschedctl      modify scheduler state and behavior
mshow          displays various diagnostic messages about the system and job queues
mshow -a       query and show available system resources
msub           scheduler job submission
resetstats     reset scheduler statistics
setspri        adjust job/system priority of job
showbf         show current resource availability
showq          show queued jobs
showres        show existing reservations
showstart      show estimates of when job can/will start
showstate      show current state of resources
showstats      show usage statistics
showstats -f   show various tables of scheduling/system performance

Commands Providing Maui Compatibility

Command        Description
canceljob      cancel job
changeparam    change in memory parameter settings
diagnose       provide diagnostic report for various aspects of resources, workload, and scheduling
releasehold    release job defers and holds
releaseres     release reservations
runjob         force a job to run immediately
sethold        set job holds
setqos         modify job QOS settings
setres         set an admin/user reservation
showconfig     show current scheduler configuration

Queues

"Slurm + MOAB" (essential information collected from the User's Guide)

% mnsubmit <job_script>

submits a “job script” to the queue system (see the example submission script below).

% mnq

shows all the jobs submitted.

% mncancel <job_id>

removes your job from the queue system, cancelling its processes if they were already running.

% checkjob <job_id>

obtains detailed information about a specific job, including the assigned nodes and the possible reasons preventing the job from running.

% mnstart <job_id>

shows the estimated time at which the specified job will start running.
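
The two query commands above are often run together for the same job. The following is a minimal sketch of a wrapper that does this. The function name `job_report` is hypothetical and not part of the queue system. It only calls `checkjob` and `mnstart` as described above and makes no assumption about their output format.

```shell
# job_report: print the detailed status and the estimated start of one job.
# Hypothetical helper; it only wraps the checkjob and mnstart commands.
job_report() {
    job_id="$1"
    if [ -z "$job_id" ]; then
        echo "usage: job_report <job_id>" >&2
        return 2
    fi
    # Stop early on machines where the queue commands are not installed.
    for cmd in checkjob mnstart; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "job_report: $cmd not found" >&2
            return 127
        fi
    done
    echo "== status of job $job_id =="
    checkjob "$job_id" || return 1
    echo "== estimated start of job $job_id =="
    mnstart "$job_id"
}
```

After defining the function in a login shell, `job_report <job_id>` prints both reports one after the other.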

Available Programs

NWChem

DL_POLY 2


Submission script

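#!/bin/bash
# Job directives read by the queue system.
# %j in the output and error file names is replaced by the job ID.
# @ job_name      = job
# @ initialdir    = .
# @ output        = OUTPUT/mpi_%j.out
# @ error         = OUTPUT/mpi_%j.err
# @ total_tasks   = 64
# @ wall_clock_limit = 12:00:00

# Run NWChem on the 64 requested tasks; stdout and stderr both go to job.out.
srun /gpfs/apps/NWCHEM/4.7/bin/LINUX64_POWERPC/nwchem job.nw > job.out 2>&1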
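
When several jobs differ only in their name, number of tasks or time limit, the script above can be generated instead of edited by hand. The following is a minimal sketch. The function `write_job_script` and its argument order are hypothetical, and it emits only the directives already shown in the example script.

```shell
# write_job_script: print a submission script to standard output.
# Hypothetical helper. Arguments: job name, number of tasks,
# wall clock limit (hh:mm:ss), then the command that srun should launch.
write_job_script() {
    if [ "$#" -lt 4 ]; then
        echo "usage: write_job_script <name> <tasks> <limit> <command...>" >&2
        return 2
    fi
    name="$1"
    tasks="$2"
    limit="$3"
    shift 3
    # %j is left untouched here; the queue system fills it in at run time.
    cat <<EOF
#!/bin/bash
# @ job_name      = $name
# @ initialdir    = .
# @ output        = OUTPUT/${name}_%j.out
# @ error         = OUTPUT/${name}_%j.err
# @ total_tasks   = $tasks
# @ wall_clock_limit = $limit

srun $*
EOF
}
```

For example, `write_job_script test32 32 02:00:00 /gpfs/apps/NWCHEM/4.7/bin/LINUX64_POWERPC/nwchem job.nw > test32.cmd` writes a script that can then be passed to `mnsubmit`. The file name `test32.cmd` is arbitrary. Assuming the queue system does not create the OUTPUT directory itself, run `mkdir -p OUTPUT` in the submission directory first.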


Links

IAC Website