IAC
Revision as of 12:45, 1 February 2008
IAC stands for "Instituto de Astrofísica de Canarias" and hosts the LaPalma supercomputer. LaPalma comprises 256 JS20 compute nodes (blades) and 4 p510 servers. Each blade has two 2.2 GHz processors running the Linux operating system, 4 GB of RAM, and 40 GB of local disk storage. Together the servers provide 14 TB of disk storage, accessible from every blade through GPFS (General Parallel File System).
The networks that interconnect LaPalma are:
* Myrinet network: a high-bandwidth network used for communication by parallel applications.
* Gigabit network: an Ethernet network over which the blades remotely mount their root file systems from the servers, and over which GPFS runs.
== Access ==
There are 3 login blades:
 ssh username@lapalma1.iac.es
 ssh username@lapalma2.iac.es
 ssh username@lapalma3.iac.es
== User details ==
== Queues ==
"[https://computing.llnl.gov/linux/slurm/ Slurm] + [http://www.clusterresources.com/pages/products/moab-cluster-suite.php MOAB]" (essential information collected from the User's Guide)
% '''mnsubmit <job_script>'''
submits a job script to the queue system (see the example script below).
% '''mnq'''
shows all the submitted jobs.
% '''mncancel <job_id>'''
removes a job from the queue system, canceling the execution of its processes if they were already running.
% '''checkjob <job_id>'''
obtains detailed information about a specific job, including the assigned nodes and the possible reasons preventing it from running.
% '''mnstart <job_id>'''
shows the estimated start time of the specified job.
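The commands above can be chained into a typical submit-inspect-cancel session. The sketch below is illustrative, not from the User's Guide: the queue commands exist only on LaPalma (so everything is guarded to be a no-op elsewhere), the job script name `job.cmd` is an assumption, and so is extracting a numeric job id from `mnsubmit`'s output, whose exact format is not documented here.

```shell
#!/bin/bash
# Hypothetical workflow sketch: submit a job, inspect it, then cancel it.
# The LaPalma queue commands exist only on that machine, so the whole
# sequence is guarded and degrades to a message on other hosts.
lapalma_workflow() {
    if command -v mnsubmit >/dev/null 2>&1; then
        # Assumption: mnsubmit prints the numeric job id in its output.
        JOBID=$(mnsubmit job.cmd | grep -o '[0-9][0-9]*' | head -n 1)
        mnq                  # list all submitted jobs
        checkjob "$JOBID"    # assigned nodes, reasons blocking execution
        mnstart "$JOBID"     # estimated start time for this job
        mncancel "$JOBID"    # remove the job, killing any running processes
    else
        echo "queue commands not available on this host"
    fi
}

lapalma_workflow
```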
== Available Programs ==
== Submission script ==
 #!/bin/bash
 # @ job_name = job
 # @ initialdir = .
 # @ output = OUTPUT/mpi_%j.out
 # @ error = OUTPUT/mpi_%j.err
 # @ total_tasks = 64
 # @ wall_clock_limit = 12:00:00
 
 srun /gpfs/apps/NWCHEM/4.7/bin/LINUX64_POWERPC/nwchem job.nw >& job.out
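Job scripts like the one above can also be generated programmatically before submission. This is a minimal sketch, not from the User's Guide: the directive syntax is copied from the example above, while the helper's defaults, the output file name `job.cmd`, and the `./my_mpi_program` binary are all hypothetical; `mnsubmit` is only invoked if it exists on the host.

```shell
#!/bin/bash
# Hypothetical helper (not from the User's Guide): write a LaPalma job
# script with a configurable task count and wall-clock limit, then submit
# it with mnsubmit if that command is available on this host.
NTASKS=${1:-64}          # number of parallel tasks (default is illustrative)
WCLIMIT=${2:-12:00:00}   # wall-clock limit in hh:mm:ss

mkdir -p OUTPUT          # directory named in the output/error directives

# Directive syntax taken from the example script in this section.
cat > job.cmd <<EOF
#!/bin/bash
# @ job_name = job
# @ initialdir = .
# @ output = OUTPUT/mpi_%j.out
# @ error = OUTPUT/mpi_%j.err
# @ total_tasks = ${NTASKS}
# @ wall_clock_limit = ${WCLIMIT}

srun ./my_mpi_program
EOF

if command -v mnsubmit >/dev/null 2>&1; then
    mnsubmit job.cmd
fi
```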