diff --git a/slurm/20190411.Slurm-accounting.md b/slurm/20190411.Slurm-accounting.md
index 5db10f2..47d6e0c 100644
--- a/slurm/20190411.Slurm-accounting.md
+++ b/slurm/20190411.Slurm-accounting.md
@@ -1,7 +1,10 @@
 SLURM ACCOUNTING (sacct)
 ========================

-CAVEAT:
+Created: April 2019
+Updated: November 2019
+
+**CAVEAT:**
 This document was originally developed by referencing SLURM 18.08.1
 used on Turing.
 I also tried to consult the newer version (master branch
@@ -11,6 +14,12 @@ incompatible with this version.
 Please use a grain of salt when reading, and always consult with
 manual pages, source code, etc in case of doubt.

+*Update 2019-11-06*:
+SLURM man page now contains the description of the accounting fields.
+Please look at
+ .
+
+
 UNDERSTANDING SLURM ACCOUNTING FIELDS
 -------------------------------------

@@ -21,7 +30,10 @@ SLURM accounting can produce very many fields.
 The "cooked" job ID. Please see the discussion below.

 `JobIDRaw`:
-The "raw" job ID. Please see the discussion below.
+The "raw" job ID.
+In a vast majority of cases, the `JobIDRaw` field is identical to `JobID`
+except in the case of array jobs.
+Please see the discussion below.

 `TimelimitRaw`:
 The raw value of time limit, in minutes.
@@ -76,7 +88,7 @@ A `JOBSTEP` can have several subtypes:
 * `SLURM_BATCH_SCRIPT`, in which case JobIDRaw will obtain the `.batch` suffix.
 * `SLURM_EXTERN_CONT`, in which case JobIDRaw will obtain the `.extern` suffix.
   Apparently, this is meant to indicate "external" type of job steps,
-  including.
+  described further below.
 * many others; but in this case, it will print JobIDRaw in
   `[0-9]+\.[0-9]+` pattern
 * Other types (usually it will have index numbers like 0, 1, 2, ...)
@@ -87,7 +99,7 @@ A `JOBSTEP` can have several subtypes:
 A "vanilla" job entry corresponds to a single job submitted by a user to SLURM.
 This will not be a job array.

-* Characteristics : `JobID ~ /^[0-9]+$/`.
+* Regexp match : `JobID ~ /^[0-9]+$/`.

 #### Array Job

@@ -95,7 +107,7 @@ This will not be a job array.
 An "array" job entry corresponds to a single job as part of a job array
 submitted by a user to SLURM.

-* Characteristics : `JobID ~ /^[0-9]+_[0-9]+$/`.
+* Regexp match : `JobID ~ /^[0-9]+_[0-9]+$/`.

 The Job ID contains two numbers separated by an underscore.
 The number before the underscore refers to the job ID as reported by
@@ -114,7 +126,7 @@ square brackets around the job suffix:
 A heterogenous job entry corresponds to a part of a heterogenous job
 submitted by a user to SLURM.

-* Characteristics : `JobID ~ /^[0-9]+\+[0-9]+$/`.
+* Regexp match: `JobID ~ /^[0-9]+\+[0-9]+$/`.

 The Job ID contains two numbers separated by a plus sign.
 The number before the underscore refers to the job ID as reported by
@@ -130,7 +142,7 @@ sbatch) when more than one CPU cores were requested by the job.

 Characteristics of SLURM_BATCH_SCRIPT accounting records:

-* JobIDRaw =~ /^[0-9]+\.batch$/
+* Regexp match: `JobIDRaw ~ /^[0-9]+\.batch$/`

 * The record does NOT have user ID (field `User`)

@@ -177,7 +189,7 @@ that may be only when a specific "job completion" task is specified.
 #### Questions & (Possible) Answers

 * Why there is a separate "NNNNN.batch" record?
-  It is perhaps when the job is multi-node.
+  Perhaps, this record was made when the job is multi-node.
   It appears to me that the ".batch" record is for accounting the batch
   script itself (which will run only on node #0 of the allocated resources).

@@ -192,7 +204,7 @@ This is what I found after this exploration:

 > We only need to include accounting records where the `JobIDRaw` field
 > contains only whole integers (i.e. matching regex `^[0-9]+$`).
-
+> Further,

 ## References