CESM2 on FRAM

Latest revision as of 13:51, 15 January 2019

1 Prerequisites

If you have not obtained access to FRAM, apply at https://www.metacenter.no/user/application/form/notur (project nn9625k).

2 Installing the model

Install the model in your home directory with

 cd  
 tar xvf /cluster/projects/nn9625k/cesm/cesm2.1.0_latest.tgz

3 Aqua-planet with slab-ocean and thermodynamic sea ice

3.1 Create a new case

Change to CESM2's scripts directory with

 cd $HOME/cesm2.1.0/cime/scripts 

Create an aqua-planet case with

 ./create_newcase --case $HOME/cesm2.1.0/cases/QSIC5_f09_f09_test_01 --compset QSIC5 --res f09_f09_mg17 --machine fram --pecount S --project nn9625k --run-unsupported

This will create a case directory in $HOME/cesm2.1.0/cases/QSIC5_f09_f09_test_01 (feel free to choose a different name and location). The paths here assume that the tarball unpacks into $HOME/cesm2.1.0; adjust them if the directory is named differently.

The rest of this section further explains the above choice of options. Type "./create_newcase --help" for detailed information of all possible options.

3.1.1 Choosing component set with --compset option

The case uses the predefined aqua-planet (Q) component set QSIC5, which combines a slab ocean (S) and thermodynamic sea ice (I) with a 30-layer configuration of CAM5 (C5).

3.1.2 Choosing resolution with --res option

In combination with the component set QSIC5, the horizontal resolution configuration f09_f09_mg17 specifies NCAR's 0.9x1.25 finite-volume lonlat grid for all active components.

The grid converges towards the poles and is therefore not suitable for use with dynamic sea ice. Sea ice dynamics are hence deactivated in QSIC5 by setting kdyn=0 in the sea ice component's namelist. Furthermore, sea ice is tuned towards less ice by setting r_snw=-2.0.

3.1.3 Choosing a predefined cpu-configuration with --pecount option

For QSIC5 on 0.9x1.25, possible arguments of --pecount are S, M (default if --pecount is omitted), L, X1 and X2.

The corresponding core counts are 192 (S), 320 (M), 384 (L), 640 (X1) and 960 (X2).
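
As a rough illustration of what these core counts mean on FRAM, the sketch below converts each --pecount setting into a number of compute nodes. It assumes 32 cores per node, a figure that is not stated on this page and should be checked against the current FRAM documentation.

```shell
# Assumption: 32 cores per FRAM compute node (not stated on this page).
cores_per_node=32

# Core counts per --pecount setting, as listed above.
nodes_summary=""
for entry in S:192 M:320 L:384 X1:640 X2:960; do
    name=${entry%%:*}
    cores=${entry##*:}
    # Round up so that a partially used node still counts as a full node.
    nodes=$(( (cores + cores_per_node - 1) / cores_per_node ))
    echo "$name: $cores cores -> $nodes nodes"
    nodes_summary="$nodes_summary$name=$nodes "
done
```

Under that assumption every setting fills whole nodes, so no cores are left idle on a partially used node.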

3.2 Set up case

Change to the case directory with

 cd $HOME/cesm2.1.0/cases/QSIC5_f09_f09_test_01 

Execute the case-setup script with

 ./case.setup 

This will create build and job scripts under the case directory and also prepare the run-directory in /cluster/work/users/$USER/cesm/QSIC5_f09_f09_test_01/run.

3.3 Customize component namelist files

The case directory contains the four user namelist files user_nl_cam, user_nl_cice, user_nl_cpl and user_nl_docn, which can be customized e.g. to specify additional diagnostic output.

After changing the user namelist files, you can optionally execute

 ./preview_namelists  

This will update the _in-namelist files in the run-directory so one can review the updates. However, use of preview_namelists is not necessary as the namelists are updated on job submission.
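
As a hedged example of such a customization, the sketch below appends a request for a second CAM history file with daily output to user_nl_cam. The chosen fields (TS, PRECT) and output frequencies are illustrative assumptions, not settings recommended by this page; nhtfrq, mfilt and fincl2 are standard CAM history namelist variables.

```shell
# Run from the case directory. The settings below are illustrative only.
nl_file=user_nl_cam

cat >> "$nl_file" <<'EOF'
! Keep monthly output in h0 and add a second history file (h1) written every 24 hours
nhtfrq = 0, -24
mfilt  = 1, 30
fincl2 = 'TS', 'PRECT'
EOF

# Regenerate the *_in namelists if the preview script is available here.
if [ -x ./preview_namelists ]; then
    ./preview_namelists
fi
```

Because the lines are appended, running the snippet twice duplicates the entries; edit the file by hand if it already contains history settings.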

3.4 Build model

In the case directory, execute

 ./case.build 

This will build the model and perform other tasks in the run directory.

3.5 Run time options (run length, resubmission etc)

The length of the integration and similar run-time settings can be customized in the file env_run.xml in the case directory.

One can either edit env_run.xml using an editor (e.g. vi) or from the command line using the xmlchange script available in the case directory (type "./xmlchange --help" for usage).

3.5.1 Initial versus continuation

At the beginning of the simulation, the value of CONTINUE_RUN must be set to FALSE. If it is set to TRUE, the model attempts to restart from restart files produced by the SAME simulation, which must be present in the run directory.

If you want to continue a simulation then make sure that CONTINUE_RUN is set to TRUE.

3.5.2 Run length

The run length is specified by STOP_OPTION and STOP_N. The default is 5 days. Set STOP_OPTION to nyears and STOP_N to 1 to specify a run length of 1 year.
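
The one-year setting described above can be applied from the command line roughly as follows. This is a minimal sketch that assumes it is run from the case directory, where xmlchange and xmlquery are located.

```shell
# Run from the case directory, where CIME places xmlchange and xmlquery.
run_length="STOP_OPTION=nyears,STOP_N=1"

if [ -x ./xmlchange ]; then
    ./xmlchange "$run_length"
    ./xmlquery STOP_OPTION,STOP_N   # confirm the new values
else
    echo "xmlchange not found; run this from the case directory" >&2
fi
```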


3.5.3 Automatic resubmission and short term archiving

If RESUBMIT is set to n>0 then the job is automatically resubmitted n times. By default, CONTINUE_RUN will automatically be set to TRUE during resubmission.
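
Since the first submission already runs one segment, a simulation of N segments needs RESUBMIT set to N-1. The sketch below works this out for a hypothetical 5-year simulation run in 1-year segments; the target length is an example value, and the arithmetic assumes it is a whole multiple of the segment length.

```shell
stop_n=1        # years per job segment (STOP_OPTION=nyears, STOP_N=1)
total_years=5   # hypothetical target length of the whole simulation

# The first submission runs one segment; each resubmission adds one more.
resubmit=$(( total_years / stop_n - 1 ))
echo "RESUBMIT=$resubmit"

# Apply the value when run from the case directory.
if [ -x ./xmlchange ]; then
    ./xmlchange RESUBMIT="$resubmit"
fi
```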

After each integration, diagnostic output and restart information are moved from the run-directory to the short-term archiving location /cluster/work/users/$USER/archive/QSIC5_f09_f09_test_01.

3.6 Integration time and job submission

3.6.1 Setting integration time

The value of JOB_WALLCLOCK_TIME in env_batch.xml specifies the maximum integration time.

For testing purposes, the default is set to 1 hour on FRAM. If the machine load is high, specifying a shorter wall-clock time will typically result in a shorter queuing time.

The specified time corresponds to the limit for a single resubmission. For example, if the model resubmits itself after each simulation year then choose a wall-clock time sufficient to run one simulation year.
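
One way to pick a value is to derive it from the measured model throughput. The sketch below does this for a hypothetical throughput of 3 simulated years per wall-clock day and a 25% safety margin; both numbers are placeholders, so measure the throughput of your own case first. The --subgroup case.run argument is assumed to be supported by the installed CIME version.

```shell
sypd=3.0     # hypothetical throughput in simulated years per wall-clock day
years=1      # simulated years per job segment
margin=1.25  # safety factor on top of the estimated run time

# Minutes needed for one segment, rounded up to a whole minute.
wall_minutes=$(awk -v s="$sypd" -v y="$years" -v m="$margin" \
    'BEGIN { v = y * 24 * 60 * m / s; c = int(v); if (c < v) c++; print c }')
wallclock=$(printf '%02d:%02d:00' $(( wall_minutes / 60 )) $(( wall_minutes % 60 )))
echo "JOB_WALLCLOCK_TIME=$wallclock"

# Apply the limit to the run job when executed from the case directory.
if [ -x ./xmlchange ]; then
    ./xmlchange JOB_WALLCLOCK_TIME="$wallclock" --subgroup case.run
fi
```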

3.6.2 Job submission

To submit the job execute

 ./case.submit 

Once the job starts to run, the model will write log files and output in the run directory

 /cluster/work/users/$USER/cesm/QSIC5_f09_f09_test_01/run 
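
To see whether the model is making progress, it can help to list the most recently modified files in the run directory. The following is a minimal sketch that assumes the case name used on this page.

```shell
case_name=QSIC5_f09_f09_test_01
run_dir=/cluster/work/users/$USER/cesm/$case_name/run

if [ -d "$run_dir" ]; then
    # List the most recently modified files; active logs appear at the top.
    ls -lt "$run_dir" | head -n 10
else
    echo "Run directory not found: $run_dir" >&2
fi
```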

To check the queuing status type

 squeue -u $USER 

To cancel a job use

 scancel <job id> 

where <job id> is obtained from squeue.

More information on the queuing system is found at https://documentation.sigma2.no/jobs/jobscripts.html