NorCPM User Manual
Overview
The Norwegian Climate Prediction Model (NorCPM) aims to provide predictions on seasonal-to-decadal time scales. It is based on the Norwegian Earth System Model (NorESM, [1]) and the Ensemble Kalman Filter (EnKF, [2]) data assimilation method. NorESM is a state-of-the-art Earth system model that is based on CESM ([3]) but uses a different aerosol/chemistry scheme and a different ocean model (evolved from MICOM). The EnKF is a sequential data assimilation method that allows for fully multivariate and flow-dependent corrections using a covariance matrix produced by a Monte Carlo ensemble integration. Currently the system only updates the ocean component, as this is where most of the predictability is expected, but additional atmospheric nudging and assimilation of land variables are also being considered.
Norwegian Earth System Model
The Norwegian Earth System Model (NorESM) is one of about 20 climate models that have produced output for CMIP5 (http://cmip-pcmdi.llnl.gov/cmip5). The NorESM family of models is based on the Community Climate System Model version 4 (CCSM4) of the University Corporation for Atmospheric Research, but differs from it in particular by its isopycnic coordinate ocean model and advanced chemistry-aerosol-cloud-radiation interaction schemes. The main version, NorESM1-M, has a horizontal resolution of approximately 2deg for the atmosphere and land components and 1deg for the ocean and ice components. NorESM is also available in a lower resolution version (NorESM1-L), a medium-low resolution version (NorESM1-ML), a high-top version with specified and full chemistry (NorESM1-MLHT and NorESM1-MLHTC) and a version that includes prognostic biogeochemical cycling (NorESM1-ME).
Model acronym | Ocean | Atmosphere | References |
---|---|---|---|
NorESM1-L | MICOM (3.6deg) | CAM4 (T31) | Zhang et al. 2012, Counillon et al. 2014 |
NorESM1-ML | MICOM (2deg) | CAM4 (2deg) | |
NorESM1-MLHT | MICOM (2deg) | CAM4-WACCMSC (2deg) | |
NorESM1-MLHTC | MICOM (2deg) | CAM4-WACCM (2deg) | |
NorESM1-ME | MICOM (1deg) | CAM4-OSLO (2deg) | Tjiputra et al. 2013 |
Ensemble Kalman Filter
The EnKF is a sequential, ensemble-based data assimilation method that consists of two steps: a propagation and a correction. The propagation step is a Monte Carlo method. The ensemble spread (i.e. ensemble variability) is used to estimate the forecast error, because the two are expected to be related in locations (and at times) where (and when) the system is more chaotic. Assuming that the error distribution is Gaussian and that the model is unbiased, one can proceed with the Bayesian update and find a new estimate of the ensemble mean and of the model covariance. The method is often called flow dependent because the covariance matrix evolves with the system and thus provides corrections that are consistent with the state of the system. This property is valuable for our application because such a framework is well suited to predicting extreme events or systems that drift with time. The method also allows for fully multivariate updates, meaning that observations (of SST, for example) can be used to correct all other model variables. This is also a valuable property because observations are very sparse in space and time. However, one should bear in mind that the update assumes linearity, which is not suitable for all variables, and that correlations are subject to sampling error. Currently NorCPM uses the Deterministic Ensemble Kalman Filter (DEnKF, Sakov et al. 2008), which is a square-root filter version of the EnKF.
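For reference, the correction step described above can be written compactly. The notation below follows the usual presentation of the DEnKF (Sakov et al. 2008) and is not taken from the NorCPM code: x^f is the forecast ensemble mean, A^f the matrix of forecast ensemble anomalies (members minus mean) for m members, d the vector of observations, H the (linear) observation operator and R the observation error covariance. The superscript a denotes the analysis.

```latex
\begin{aligned}
\mathbf{P}^f &= \frac{1}{m-1}\,\mathbf{A}^f \left(\mathbf{A}^f\right)^{T}, \\
\mathbf{K}   &= \mathbf{P}^f \mathbf{H}^T \left(\mathbf{H}\mathbf{P}^f\mathbf{H}^T + \mathbf{R}\right)^{-1}, \\
\mathbf{x}^a &= \mathbf{x}^f + \mathbf{K}\left(\mathbf{d} - \mathbf{H}\mathbf{x}^f\right), \\
\mathbf{A}^a &= \mathbf{A}^f - \tfrac{1}{2}\,\mathbf{K}\mathbf{H}\mathbf{A}^f .
\end{aligned}
```

The term P^f H^T carries the ensemble covariance between observed and unobserved variables, which is what makes the update multivariate. The factor 1/2 in the anomaly update is what distinguishes the DEnKF from the stochastic EnKF, which perturbs the observations instead.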
Getting started with NorCPM
Prerequisites
User-support for NorCPM is currently limited to Norway.
Step 1: New users need to apply for access to computational and storage resources at the Norwegian Metacenter for Computational Science (link to application page: https://www.notur.no/user-account). NorCPM activities are usually tied to the cpu and storage accounts nn9039k and ns9039k, which are held by Noel Keenlyside (noel.keenlyside[at]gfi.uib.no). NorCPM is currently set up on the computational platform HEXAGON (https://www.notur.no/hardware/hexagon).
Step 2: After gaining access to HEXAGON, the user needs to contact the local support (support-uib@notur.no) to be added to the unix-groups "noresm" and "nn9039k".
Installing and compiling NorCPM
Before installing NorCPM, make sure that you do not already have an old setup. The following script should stop you from completing a new installation if a conflicting one exists. However, it is good practice to clean up your experiment first; see the section Cleanup Ensemble for a description of how to do that.
On Hexagon, follow these steps:
1) Install NorESM and link the necessary scripts:
- cd ${HOME}
- mkdir -p NorESM
- cd NorESM
If you have NorESM svn access, run:
- svn checkout https://svn.met.no/NorESM/noresm/tags/projectEPOCASA-3 projectEPOCASA-3
If you don't:
- tar -xvf /work-common/shared/nn9039k/NorCPM/Code/NorESM/projectEPOCASA-3.tar.gz
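- mkdir -p Script
- cd Script
To use the default version of the scripts, link them into this folder and replace the personal_setting.sh link with your own copy: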
- ln -s /work/shared/nn9039k/NorCPM/Script/* .
- rm personal_setting.sh
- cp /work/shared/nn9039k/NorCPM/Script/personal_setting.sh .
- cd ${HOME}/NorESM/
- mkdir -p bin
- cd bin
As with Script, link the files in bin to use the default version. If you want to build your own, copy and compile the code in /work-common/shared/nn9039k/NorCPM/Code/EnKF/, delete the link in bin and move your own executable there.
- ln -sf /work/shared/nn9039k/NorCPM/bin/* .
2) Select a model version and experiment:
Edit ${HOME}/NorESM/Script/personal_setting.sh to choose a model version, ensemble size, starting date, ...
Then launch the creation of the ensemble structure:
- cd ${HOME}/NorESM/Script
- ./create_ensemble.sh
Your ensemble structure should now be created, the code compiled and the initial conditions copied. You are ready to start your reanalysis.
Cleanup Ensemble
There are many situations where it is necessary to restart from scratch. The most obvious ones are when you want to change the model version or use the same casename for a different model setup. You may also want to free some space in your home directory. In all these cases you can delete all the files that you have in your work and HOME directories by running the commands below. Before you clean up, make sure that you have backed up everything you need. Of course, if you have never run NorCPM before, you can skip this step. To clean up your ensemble structure:
- ${HOME}/NorESM/Script/cleanup_ensemble.sh
This cleans members 2 to ENSSIZE.
- ${HOME}/NorESM/Script/cleanup_single_mem.sh 1
This deletes member 1.
Model and directory structure
Shared files on Hexagon are located here:
/work/shared/nn9039k/NorCPM/
It contains the following subfolders:
- Code contains the source of all the Fortran code needed (NorESM, EnKF, post-processing)
- Script contains all the bash scripts necessary to run the reanalysis or prediction
- Restart contains the initial conditions (restart files) for two different configurations of NorESM on 1980-01-15
- Obs contains the observations that are available for assimilation (SST, SSH)
- bin contains the compiled executables built from the Code subfolder
- Input contains input files for both NorESM and EnKF
- matlab contains code used for validation purposes
In your home folder ${HOME}/NorESM/ you have the folders bin, Script and projectEPOCASA-3, which are copied or linked from /work/shared.
The folder cases contains the settings of your ensemble of experiments. Each ensemble member has its own separate experiment subfolder (duplicate files are linked to reduce disk usage).
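As a quick sanity check after installation, a small helper can report which of the folders named above are missing. This is only a sketch: the function name check_norcpm_layout is hypothetical and not part of the NorCPM scripts.

```shell
# Report which of the expected NorCPM folders are missing under a root directory.
# Prints one missing folder name per line; returns 0 only if nothing is missing.
check_norcpm_layout() {
    local root="$1"
    local missing=0
    local name
    for name in bin Script projectEPOCASA-3; do
        if [ ! -e "${root}/${name}" ]; then
            echo "${name}"
            missing=1
        fi
    done
    return "${missing}"
}

# Example:
# check_norcpm_layout "${HOME}/NorESM" || echo "installation incomplete"
```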
How to start your ensemble from a non pre-existing initial ensemble
Currently there are two existing setups, one with NorCPM_ME and one with NorCPM_F19_tn21, both starting on 1980-01-15. These initial restart files are in /work/shared and are automatically retrieved by create_ensemble.sh. Starting from a different start date is slightly more complicated. There are currently two strategies implemented:
- The first is to restart from an existing ensemble (historical or pre-industrial). Because NorESM is very rigid, you will not be able to restart directly from the ensemble if there is any slight difference in the model version, in the file names or in the output diagnostics. You need to use a hybrid start: in personal_setting.sh, set hybrid_run=1 and ens_start=1, and specify the existing ens_casename and the desired start date.
- The second option is a historical start, in which the ensemble is started from a single run. It is recommended to use a run with pre-industrial forcing, otherwise your initial ensemble may contain spurious correlations induced by the drift of the system under historical anthropogenic forcing. To use this option, set hybrid_run=1, ens_start=0 and hist_start=1, and specify hist_path, hist_start_date and hist_freq_date. This option has not been tested in the new script system and most of the code is commented out, but only minor edits should be needed to update it. Contact us if you have questions. The best approach is then to run your ensemble with historical forcing up to the desired starting date.
In the current NorCPM we assimilate data at monthly frequency. As the data are centred on the middle of the month, it is best to assimilate in the middle of the month. When you make a hybrid start from a run, the time stamp of the restart is likely to be the beginning of a year or month, so the ensemble needs to be integrated for half a month before starting the system. To do so, you must change the running frequency to 14 days in ens_run.xml. You can use the script Integrate_ens_half-month.sh for this.
When the integration is finished, you need to set the frequency back to monthly. This can be done with change_running_freq_4_analysis.sh.
Options available in NorCPM (in personal_setting.sh)
Most of the settings are set in the file ${HOME}/NorESM/Script/personal_setting.sh.
There are currently two versions of NorESM available: NorCPM_F19_tn21 and NorCPM_ME.
- NorCPM_F19_tn21 is the default version. It uses CAM5 at F19 resolution (about 2 degrees) for the atmosphere and 2 degrees for the ocean on a tripolar grid.
- NorCPM_ME (or f19_g16) is the medium resolution version used for CMIP5. It uses CAM-OSLO at F19 resolution (about 2 degrees) for the atmosphere and about 1 degree for the ocean (bipolar grid).
Assimilation of multiple types of observation is not yet available. The observation type currently available is OBSTYPE=SST with PRODUCER=HADISST2 (reynolds is also possible).
You need to decide when your experiment will start and when it will finish (start_date and ENDYEAR).
You can choose how many members (ensemble size) you want to use. Beware that with too few members the EnKF will perform badly (don't even try with 1 or 2).
For NorCPM_F19_tn21 and NorCPM_ME there is already an initial ensemble for 1980-01-15; for other starting dates you need a more complex manoeuvre, as detailed in the section "How to start your ensemble from a non pre-existing initial ensemble".
Setting of the assimilation
Several options are available. First, you may decide whether to use anomaly assimilation or full-field assimilation. Note that anomaly assimilation is what the default executables provide, so if you want to use full-field assimilation you need to compile EnKF and prep_obs yourself and replace the links in your ${HOME}/NorESM/bin/ with the updated executables.
Before compiling the two executables, set the flag ANOMALY as required in:
- /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/MODEL.CPP
- /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/Prep_Routines/MODEL.CPP
Similarly, masking of observations next to land is enabled by default in the shared version in /work/shared, so compile your own version if you don't want this option.
To reject observations near the coast, define the flag MASK_LANDNEIGHBOUR in:
- /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/Prep_Routines/MODEL.CPP
You can also activate a slow initialisation when doing a reanalysis by setting RFACTOR=8 in ${HOME}/NorESM/Script/personal_setting.sh.
The observation error is then overestimated by the factor RFACTOR (note that RFACTOR is halved after each assimilation). NB: if you restart a reanalysis that has crashed, remember to change it to 1.
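To make the slow initialisation concrete, the sketch below prints the value of RFACTOR over the first few assimilation cycles. It assumes that "reducing by 2" means an integer division by 2 after each assimilation and that the value never drops below 1; the function name rfactor_schedule is hypothetical and this is not the code used in main.sh.

```shell
# Print the RFACTOR value used at each of the first N assimilation cycles,
# assuming it is halved after every cycle and never drops below 1.
rfactor_schedule() {
    local rfactor="$1"
    local cycles="$2"
    local i=1
    while [ "${i}" -le "${cycles}" ]; do
        echo "cycle ${i}: RFACTOR=${rfactor}"
        if [ "${rfactor}" -gt 1 ]; then
            rfactor=$((rfactor / 2))
        fi
        i=$((i + 1))
    done
}

rfactor_schedule 8 5
```

Under these assumptions, starting from RFACTOR=8 the observation error inflation is gone from the fourth assimilation onwards, which is why a restarted reanalysis should use RFACTOR=1.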
You can edit the list of variables that the assimilation updates (via linear relations) in /work/shared/nn9039k/NorCPM/Input/EnKF/analysisfields.in
Each line gives, for one variable, the range of 2D fields (layers) that are updated.
Examples:
- temp 1 53: means that we update temperature for layers 1 to 53. Note that in MICOM there are two time levels for each variable (leapfrog). If your model has kdm vertical levels, the first time level is 1:kdm while the second is kdm+1:2*kdm
- pb 1 1: means that we only update the first vertical level (pb is a variable with only the two time levels)
- ustar 0 0: means that the variable is only 2D and has no third dimension
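The index convention in the examples above can be checked with a few lines of shell. This is a sketch only: the function names are hypothetical, and counting a "0 0" entry as a single 2D field is an interpretation of the ustar example.

```shell
# Print the 2D-field index range of each leapfrog time level for a 3D MICOM
# variable, given the number of vertical levels kdm.
micom_level_ranges() {
    local kdm="$1"
    echo "time level 1: 1-${kdm}"
    echo "time level 2: $((kdm + 1))-$((2 * kdm))"
}

# Number of 2D fields selected by one analysisfields.in entry "name first last".
# A "0 0" range denotes a purely 2D variable, counted here as one field.
count_fields() {
    local first="$2"
    local last="$3"
    if [ "${first}" -eq 0 ] && [ "${last}" -eq 0 ]; then
        echo 1
    else
        echo $((last - first + 1))
    fi
}

micom_level_ranges 53
count_fields temp 1 53
```

With kdm=53, the entry "temp 1 53" therefore covers only the first time level; the second time level would be layers 54 to 106.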
Many data assimilation parameters are set in:
/work/shared/nn9039k/NorCPM/Input/EnKF/enkf.prm_mal
- method: you can choose between EnKF and DENKF. The first is a stochastic filter and the second is a deterministic filter, i.e. it does not perturb the observations (see Sakov et al. 2008 for a more thorough description).
- ensemble: contains the ensemble size (TODO: make it match the size in personal_setting.sh automatically)
- Localisation: a critical parameter of the assimilation. It is an ad hoc way to limit the impact of an observation as a function of its distance to the model grid point. A model grid point can only be influenced by observations found within the localisation radius (locrad in the prm file, in km). The larger your localisation radius, the more degrees of freedom there are in the local update and the more members you need to span them properly, so the optimal localisation radius depends on your ensemble size. In NorCPM we currently only consider single water column updates (i.e. locfuntag = "STEP" and locrad = 95). If you want to see the impact of increasing the localisation radius, you can for example use locfuntag = "G&C" and locrad = 600. Note that G&C means that the influence is reduced as a function of distance by tapering (Schur product with a Gaussian-like function). A radius of 600 km gives an effective radius of ~200 km.
- moderation: options designed to avoid an excessive reduction of the ensemble spread:
- inflation is a way to artificially inflate your ensemble spread (multiplicative covariance inflation)
- rfactor1 is the way to do a slow assimilation start (it is handled automatically by the main script of the reanalysis)
- rfactor2 is a way to artificially overestimate the observation error when updating the covariance matrix (only available with DEnKF)
- kfactor is a pre-screening of the observations: if the distributions of the ensemble and of the observation (given its error) do not overlap, the observation error is inflated (up to a factor 2) in order to gently pull the model towards the truth.
- files
- jmapfname is used to optimise the distribution of CPU when updating the different fields (remove the line if you are not sure)
- pointfname is used to dump all assimilation diagnostic at a specific point.
- prmest
- Here you can also estimate model output by state augmentation (you can estimate evolving model bias as well)
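For the localisation parameter above, the tapering selected with locfuntag = "G&C" presumably refers to the Gaspari and Cohn function, a compactly supported fifth-order piecewise rational function that resembles a Gaussian. Its standard form is given below as a reference, with r = d/c the distance d between observation and grid point scaled by a length scale c. How the code derives c from locrad is not stated in this manual, so that mapping is left open here.

```latex
\rho(r) =
\begin{cases}
-\tfrac{1}{4} r^5 + \tfrac{1}{2} r^4 + \tfrac{5}{8} r^3 - \tfrac{5}{3} r^2 + 1, & 0 \le r \le 1, \\[4pt]
\tfrac{1}{12} r^5 - \tfrac{1}{2} r^4 + \tfrac{5}{8} r^3 + \tfrac{5}{3} r^2 - 5 r + 4 - \tfrac{2}{3 r}, & 1 < r \le 2, \\[4pt]
0, & r > 2 .
\end{cases}
```

The weight decreases smoothly from 1 at r = 0 to exactly 0 at r = 2, so observations near the edge of the radius contribute very little. This is consistent with the effective radius being much smaller than the nominal one.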
Running the Analysis
The system must be run on a login node. It is started with the command:
- ${HOME}/NorESM/Script/main.sh
It is recommended to run the following command using screen [4]:
- screen
- ${HOME}/NorESM/Script/main.sh
- ctrl+a d to detach from the screen
- screen -r (to reattach it)
Note that there are 5 login nodes available on Hexagon. Your screen is only available from the login node you launched it from. To change from one login node to another, simply type:
- ssh login1 (to access the first login node)
You can then check the progress of your job by looking at the files produced in the /work/${USER}/archive/ folder or in /work/${USER}/RESULT/. You can check whether your system is still alive by checking the queue (qstat) now and then or by reattaching the screen.
How to restart your reanalysis if it stops?
The system is quite robust, but crashes may happen (power outage, fork issue, machine so slow that the wall time is reached, ...).
It helps if you can identify where the system stopped (indicated by the last line on your screen).
You can then edit ${HOME}/NorESM/Script/personal_setting.sh and restart from the start_date and the place in the code you want (by adjusting SKIPASSIM and SKIPPROP).
If you are not sure what has happened or where it crashed, the first step is to identify the last restart file available:
- ls /work/${USER}/archive/${VERSION}01/rest/
You can then run the script:
- ${HOME}/NorESM/Script/mv_rst_from_archive_to_run.sh date_to_restart
Then you need to edit ${HOME}/NorESM/Script/personal_setting.sh:
- adjust the start_date accordingly
- set RFACTOR=1 (no need for a slow assimilation start)
- skip the first assimilation SKIPASSIM=0 (the restart file in archive have already been assimilated)
Now you can relaunch the script main.sh in a screen.
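Finding the last available restart can be scripted. The sketch below assumes that the entries in the rest folder have names that begin with the date (for example YYYY-MM-DD...), so that sorting them alphabetically also sorts them in time; check this against your own archive before relying on it. The function name latest_restart is hypothetical.

```shell
# Print the most recent entry in a restart directory, assuming the entry
# names start with the date so that they sort in chronological order.
latest_restart() {
    local rest_dir="$1"
    ls -1 "${rest_dir}" | sort | tail -n 1
}

# Example (same path as in the ls command above):
# latest_restart "/work/${USER}/archive/${VERSION}01/rest"
```

The printed name gives the date to pass to mv_rst_from_archive_to_run.sh and to use when adjusting start_date.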
Launching prediction
Launching a prediction is quite simple. We use the same ensemble structure, so make sure that all files in the archive folder are backed up before you start.
- First, run ${HOME}/NorESM/Script/mv_rst_from_archive_to_run.sh date_to_restart
- Second, run ${HOME}/NorESM/Script/change_running_freq_4_prediction.sh. This configures NorESM to integrate for 10 years instead of a month.
Remember that you will need to change back to monthly frequency if you want to run a reanalysis.
- Third, submit the super-job (which will integrate your whole ensemble):
- qsub Integrate_ens_predict.sh
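The three steps can be chained so that a failure in one step stops the sequence. The wrapper below is a sketch, not part of the NorCPM scripts: the function names and the DRY_RUN switch are hypothetical, and it assumes that Integrate_ens_predict.sh is located in the Script folder.

```shell
# Run a command, or only print it when DRY_RUN=1.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run the three prediction steps in order and stop at the first failure.
launch_prediction() {
    local restart_date="$1"
    local script_dir="${HOME}/NorESM/Script"
    run_step "${script_dir}/mv_rst_from_archive_to_run.sh" "${restart_date}" || return 1
    run_step "${script_dir}/change_running_freq_4_prediction.sh" || return 1
    # Assumption: the job script is in the Script folder. If it relies on the
    # submission directory, cd to that folder before calling qsub instead.
    run_step qsub "${script_dir}/Integrate_ens_predict.sh"
}

# Example: DRY_RUN=1 prints the commands without executing them.
# DRY_RUN=1; launch_prediction date_to_restart
```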
Reduction and Compression of outputs
An ensemble run produces a large amount of output, and it should be reduced as much as possible before backing up the data. It is recommended to do this regularly while you progress with your reanalysis, so that your usage of work stays low. Once you have finished reducing and compressing, you can back up the data, for example to NorStore. This should be done in two steps:
- First, retain only the files that you are really going to use. There is currently a script called Reduce_Exp_size.sh that keeps only 4 restarts per year, the ocean monthly average files, and the h0 output for atmosphere and land.
- Second, after running the previous step, you can convert the output from netCDF-3 to netCDF-4 (this saves approximately 50% of the output size) and bundle the restart files of the ensemble together. The conversion is done in parallel and should be submitted to the queue as follows:
- qsub noresm2netcdf4.pbs
This script automatically removes the converted or tarred files from your archive folder and stores them in /work/$USER/Convertion/ with exactly the same names and structure.
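The contents of noresm2netcdf4.pbs are not reproduced in this manual. Purely as an illustration of what converting a single file involves, the sketch below uses the nccopy utility from the netCDF distribution (assumed to be installed). The function names and the deflate level of 5 are hypothetical choices, not those of the NorCPM script.

```shell
# Map a file under the archive folder to the same relative path under the
# target folder, e.g. /work/u/archive/a/b.nc -> /work/u/Convertion/a/b.nc
converted_path() {
    local src="$1"
    local archive_root="$2"
    local target_root="$3"
    echo "${target_root}/${src#"${archive_root}"/}"
}

# Convert one netCDF-3 file to netCDF-4 with deflate compression (level 5).
convert_to_netcdf4() {
    local src="$1"
    local dst="$2"
    mkdir -p "$(dirname "${dst}")" &&
    nccopy -k netCDF-4 -d 5 "${src}" "${dst}"
}

# Example:
# dst="$(converted_path "$f" "/work/${USER}/archive" "/work/${USER}/Convertion")"
# convert_to_netcdf4 "$f" "$dst"
```

The space saving comes from the deflate compression available in netCDF-4; a higher level compresses more but takes longer.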
Archiving to NorStore
NorStore is a machine in Oslo that provides storage facilities. Via the EPOCASA project we have 10 TB of storage on disk and 10 TB of storage on tape. To log in to NorStore, we recommend going via login3:
- ssh -Y login3.norstore.uio.no
Disk storage
Our available disk storage is in:
- /projects/NS9039K/shared/norcpm/cases/
We have organised the experiments in 3 parts:
- Preindustrial_spinup, which are long runs with pre-industrial forcing
- Historical run, which are ensemble runs from 1850 to present (often referred to as FREE)
- NorCPM, which are experiments with initialisation
The file README provides a short description of each experiment, so please edit it if you add anything. The folder Script contains scripts that may be useful (transfer files from the Conversion folder, move restarts to Hexagon, identify missing files).
Tape storage
Our available storage on tape is in:
- /tape/NS9039K/shared/norcpm/cases/
The commands are slightly different on tape: you need to use lst and cpt instead of ls and cp. You can use /scratch/${USER} on NorStore to store files temporarily before transferring them from (or to) Hexagon.
How to transfer data from the Conversion folder
We have seen above how to reduce and compress output on Hexagon. To back up these data, follow these steps:
- Create a new folder in the directory structure and edit the README accordingly to describe your new experiment
- Go into the folder and copy the script Transfert_file.sh from /projects/NS9039K/shared/norcpm/cases/Script/
- Run the script. This will copy the files and remove them from Hexagon.