NorCPM User Manual

Revision as of 10:04, 5 November 2014 by Francois

Ingo Bethke
Francois Counillon

Overview

The Norwegian Climate Prediction Model (NorCPM) aims to provide predictions on seasonal-to-decadal time scales. It is based on the Norwegian Earth System Model (NorESM, [1]) and the Ensemble Kalman Filter (EnKF, [2]) data assimilation method. NorESM is a state-of-the-art Earth system model that is based on CESM ([3]) but uses a different aerosol/chemistry scheme and a different ocean model (evolved from MICOM). The EnKF is a sequential data assimilation method that allows fully multivariate and flow-dependent corrections using the covariance matrix estimated from a Monte Carlo model integration.

Norwegian Earth System Model

The Norwegian Earth System Model (NorESM) is one of ~20 climate models that have produced output for CMIP5 (http://cmip-pcmdi.llnl.gov/cmip5). The NorESM family of models is based on the Community Climate System Model version 4 (CCSM4) of the University Corporation for Atmospheric Research, but differs from it in particular through an isopycnic coordinate ocean model and advanced chemistry-aerosol-cloud-radiation interaction schemes. The main version, NorESM1-M, has a horizontal resolution of approximately 2deg for the atmosphere and land components and 1deg for the ocean and ice components. NorESM is also available in a lower resolution version (NorESM1-L), a medium-low resolution version (NorESM1-ML), high-top versions with specified and full chemistry (NorESM1-MLHT and NorESM1-MLHTC) and a version that includes prognostic biogeochemical cycling (NorESM1-ME).

NorESM configurations in NorCPM
Model acronym    Ocean            Atmosphere             References
NorESM1-L        MICOM (3.6deg)   CAM4 (T31)             Zhang et al. 2012, Counillon et al. 2014
NorESM1-ML       MICOM (2deg)     CAM4 (2deg)
NorESM1-MLHT     MICOM (2deg)     CAM4-WACCMSC (2deg)
NorESM1-MLHTC    MICOM (2deg)     CAM4-WACCM (2deg)
NorESM1-ME       MICOM (1deg)     CAM4-OSLO (2deg)       Tjiputra et al. 2013

Ensemble Kalman Filter

The EnKF is a sequential, ensemble-based data assimilation method that consists of two steps: a propagation and a correction. The propagation step is a Monte Carlo integration of the model. The ensemble spread (i.e. the ensemble variability) is used to estimate the forecast error, because the two are expected to be related in locations (and at times) where (and when) the system is more chaotic. Assuming that the error distribution is Gaussian and that the model is unbiased, one can proceed with the Bayesian update and find new estimates of the ensemble mean and of the model covariance. The method is often called flow dependent because the covariance matrix evolves with the system and thus provides corrections that are in agreement with the state of the system. The method allows fully multivariate updates, meaning that observations of, for example, SST can be used to correct all other model variables. However, one should bear in mind that the update assumes linearity, which is not suitable for all variables, and that the correlations are subject to sampling error. Currently NorCPM uses the Deterministic Ensemble Kalman Filter (DEnKF, Sakov et al. 2008), which is a square-root filter version of the EnKF.
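
For reference, the correction step can be written in the notation commonly used in the EnKF literature. The symbols are not defined elsewhere in this manual and are introduced here as assumptions: x^f is the forecast ensemble mean, A^f the matrix of forecast ensemble anomalies (one column per member), m the ensemble size, d the observation vector, H the (linear) observation operator and R the observation error covariance. The last line is the anomaly update that is usually given for the DEnKF; the formulation implemented in the NorCPM code may differ in its details.

```latex
\begin{aligned}
P^f &= \frac{1}{m-1}\, A^f (A^f)^{\mathrm{T}} \\
K   &= P^f H^{\mathrm{T}} \left( H P^f H^{\mathrm{T}} + R \right)^{-1} \\
x^a &= x^f + K \left( d - H x^f \right) \\
A^a &= A^f - \tfrac{1}{2}\, K H A^f
\end{aligned}
```

Because P^f is built from the ensemble anomalies, an observation of one variable (for example SST) corrects every state variable it is correlated with, which is the multivariate property described above.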

Getting started with NorCPM

Prerequisites

User-support for NorCPM is currently limited to Norway.

Step 1: New users need to apply for access to computational and storage resources at the Norwegian Metacenter for Computational Science (link to application page: https://www.notur.no/user-account). NorCPM activities are usually tied to the cpu and storage accounts nn9039k and ns9039k, which are held by Noel Keenlyside (noel.keenlyside[at]gfi.uib.no). NorCPM is currently set up on the computational platform HEXAGON (https://www.notur.no/hardware/hexagon).

Step 2: After gaining access to HEXAGON, the user needs to contact the local support (support-uib@notur.no) to be added to the unix-groups "noresm" and "nn9039k".
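
Once support has confirmed the change, group membership can be checked from a shell. The snippet below is a minimal sketch: has_group is a hypothetical helper, not part of NorCPM, and new groups usually only show up after logging in again.

```shell
# has_group GROUPS NAME: succeed if NAME appears in the space-separated list GROUPS
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Groups of the current user, as reported by the system
my_groups="$(id -nG)"

for g in noresm nn9039k; do
    if has_group "$my_groups" "$g"; then
        echo "OK: member of $g"
    else
        echo "MISSING: not yet a member of $g"
    fi
done
```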


Installing and compiling NorCPM

To install NorCPM on your account, follow these steps:

1) Install NorESM and link the necessary scripts:

cd ${HOME}
mkdir -p NorESM
cd NorESM

If you have NorESM svn access, do:

svn checkout https://svn.met.no/NorESM/noresm/tags/projectEPOCASA-3 projectEPOCASA-3

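If you do not, unpack the shared tarball instead:

tar -xvf /work-common/shared/nn9039k/NorCPM/Code/NorESM/projectEPOCASA-3.tar.gz

In both cases, create a Script directory and link the default version of the scripts into it. personal_setting.sh is copied rather than linked so that you can edit it:

mkdir -p Script
cd Script
ln -s /work/shared/nn9039k/NorCPM/Script/* .
rm personal_setting.sh
cp /work/shared/nn9039k/NorCPM/Script/personal_setting.sh .
cd ${HOME}/NorESM/
mkdir -p bin
cd bin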

The same applies to bin: to use the default version, link the files as shown below. If you want to build your own executables, copy and compile the code in /work-common/shared/nn9039k/NorCPM/Code/EnKF/, delete the corresponding link in bin and move your own executable there.

ln -sf /work/shared/nn9039k/NorCPM/bin/* .

2) Select a model version and experiment:

Edit ${HOME}/NorESM/Script/personal_setting.sh to choose a model version, ensemble size, starting date, ...

Then launch the creation of the ensemble structure:

cd ${HOME}/NorESM/Script
./create_ensemble.sh

Your ensemble structure is now created, the code is compiled and the initial conditions are copied. You are ready to start your reanalysis.
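
Before launching a long run, it can help to confirm that the installation has the expected layout. The following is a minimal sketch, assuming the directory names used in the steps above; check_install is a hypothetical helper, not one of the NorCPM scripts.

```shell
# check_install ROOT: report problems with a NorCPM installation under ROOT
# (ROOT would normally be ${HOME}/NorESM). Returns non-zero if something is wrong.
check_install() {
    root="$1"
    status=0
    for d in projectEPOCASA-3 Script bin; do
        if [ ! -d "$root/$d" ]; then
            echo "missing directory: $root/$d"
            status=1
        fi
    done
    settings="$root/Script/personal_setting.sh"
    if [ ! -f "$settings" ]; then
        echo "missing file: $settings"
        status=1
    elif [ -L "$settings" ]; then
        # A symlink here means that edits would change the shared default version
        echo "still a symlink (copy it instead): $settings"
        status=1
    fi
    return $status
}

# Example call (prints nothing when the layout looks complete):
#   check_install "${HOME}/NorESM"
```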

Model and directory structure

Shared files on hexagon are located here:

/work/shared/nn9039k/NorCPM/

The subfolders are:

-Code contains the source of all Fortran code needed (NorESM, EnKF, post-processing)
-Script contains all bash scripts necessary to run the reanalysis or prediction
-Restart contains the initial conditions (restart files) for two different configurations of NorESM on 1980-01-15
-Obs contains the observations that are available for assimilation (SST, SSH)
-bin contains the compiled executables from the Code subfolder
-Input contains input files for both NorESM and EnKF
-matlab contains code used for validation purposes

In your home folder ${HOME}/NorESM/ you have your personal bin, Script and projectEPOCASA-3 directories, which are copied or linked from /work/shared.

cases contains the specification of your ensemble of experiments. Each ensemble member has its own separate experiment, with duplication kept to a minimum.

Options available in NorCPM

Most of the settings are selected in the file ${HOME}/NorESM/Script/personal_setting.sh

There are currently two versions of NorESM available: NorCPM_F19_tn21 and NorCPM_ME

-NorCPM_F19_tn21 is the default version: it has an F19 resolution (about 2 degrees), uses CAM5 for the atmosphere and has a 2 degree ocean on a tripolar grid.
-NorCPM_ME (or f19_g16) is the medium resolution version used for CMIP5: it uses CAM-OSLO, has an F19 resolution for the atmosphere (about 2 degrees) and about 1 degree for the ocean (bipolar grid).

The observation type currently available is OBSTYPE=SST with PRODUCER='HADISST2' (reynolds is also possible).

You need to decide when your experiment starts and when it finishes (start_date and ENDYEAR).

You can choose how many members (the ensemble size) you want to use. Beware that with too few members the EnKF performs badly (do not even try with 1 or 2).

For NorCPM_F19_tn21 and NorCPM_ME there is already an initial ensemble for 1980-01-15. For other starting dates, please see XXX
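
To make the options above concrete, here is a hypothetical excerpt of a settings file. It is only a sketch: the names VERSION, start_date, ENDYEAR, OBSTYPE, PRODUCER and RFACTOR appear in this manual, but ENSSIZE, the value formats and the helper enssize_ok are assumptions, so check the real personal_setting.sh for the actual names.

```shell
# Hypothetical excerpt of personal_setting.sh (illustrative values only)
VERSION='NorCPM_F19_tn21'   # assumed to be the variable behind ${VERSION} in archive paths
start_date='1980-01-15'     # date of the available initial ensemble (format assumed)
ENDYEAR=1990                # illustrative value
ENSSIZE=30                  # hypothetical name and value for the ensemble size
OBSTYPE=SST
PRODUCER='HADISST2'
RFACTOR=1

# enssize_ok N: succeed only if N is a positive integer of at least 3,
# since the manual advises against running the EnKF with 1 or 2 members
enssize_ok() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 3 ]
}

if enssize_ok "$ENSSIZE"; then
    echo "ensemble size $ENSSIZE accepted"
else
    echo "ensemble size '$ENSSIZE' is too small for the EnKF" >&2
fi
```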


Cloning existing experiments

Setting up the initialisation

Several options are available. First, decide whether you want to use anomaly assimilation or full-field assimilation. This choice is made at compile time: to use anomaly assimilation, define the flag ANOMALY before compiling the EnKF code in:

  • /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/MODEL.CPP
  • /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/Prep_Routines/MODEL.CPP

To reject observations near the coast, define the flag MASK_LANDNEIGHBOUR in:

  • /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/Prep_Routines/MODEL.CPP
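
Since these flags are fixed at compile time, it is worth checking their state before rebuilding. The sketch below assumes that MODEL.CPP toggles flags with C preprocessor style "#define FLAG" and "#undef FLAG" lines, which this manual does not confirm; flag_state is a hypothetical helper.

```shell
# flag_state FILE FLAG: print "defined", "undefined" or "absent" for a CPP flag
flag_state() {
    if grep -Eq "^[[:space:]]*#[[:space:]]*define[[:space:]]+$2([[:space:]]|\$)" "$1"; then
        echo defined
    elif grep -Eq "^[[:space:]]*#[[:space:]]*undef[[:space:]]+$2([[:space:]]|\$)" "$1"; then
        echo undefined
    else
        echo absent
    fi
}

# Example calls, to be run on hexagon:
#   flag_state /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/MODEL.CPP ANOMALY
#   flag_state /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/Prep_Routines/MODEL.CPP MASK_LANDNEIGHBOUR
```

If ANOMALY is used, both MODEL.CPP files listed above should report the same state.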

You can also activate a slow initialisation when doing a reanalysis by setting RFACTOR=8 in: ${HOME}/NorESM/Script/personal_setting.sh

This inflates the observation error by the factor RFACTOR (note that RFACTOR is reduced by 2 after each assimilation). NB: if you restart a reanalysis that has crashed, remember to set it to 1.
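
As an illustration of the slow start, the sketch below prints the factor applied at each successive assimilation. It assumes that "reduced by 2" means the factor is halved after every assimilation and that it never drops below 1; rfactor_schedule is a hypothetical helper and does not reproduce the logic of main.sh.

```shell
# rfactor_schedule START: print the RFACTOR used at each successive assimilation,
# assuming it is halved after every step and stays at 1 afterwards
rfactor_schedule() {
    r="$1"
    out=""
    while [ "$r" -gt 1 ]; do
        out="$out$r "
        r=$((r / 2))
    done
    echo "${out}1"
}

rfactor_schedule 8   # prints: 8 4 2 1
```

Under this assumption, a reanalysis started with RFACTOR=8 reaches the nominal observation error at the fourth assimilation, which is why a restart from an already assimilated state uses RFACTOR=1.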

You can edit the list of variables that the assimilation will update (via linear relations) in /work/shared/nn9039k/NorCPM/Input/EnKF/analysisfields.in. Each line describes a model field to be updated.

If the variable is 3D (x,y,z), the number on the line indicates the layer (in z) that is updated.


Many data assimilation parameters are set in: /work/shared/nn9039k/NorCPM/Input/EnKF/enkf.prm_mal

  • method is the assimilation method. You can choose between EnKF and DEnKF. The first is a stochastic filter and the second is a deterministic filter, i.e. it does not perturb the observations (see Sakov et al. 2008 for a more thorough description).
  • ensemble contains the ensemble size (TODO: make it match the size in personal_setting.sh automatically).
  • Localisation is a critical parameter of the assimilation. It is an ad hoc way to limit the impact of an observation as a function of its distance to the model grid point. A model grid point is only influenced by observations found within the localisation radius (locrad in the prm file, in km). The larger your localisation radius, the more degrees of freedom you have in your model and the more members you need to span them properly, so the optimal localisation radius depends on your ensemble size. In NorCPM we currently only consider single water column updates (i.e. locfuntag = "STEP" and locrad = 95). If you want to see the impact of increasing the localisation radius, you can for example use locfuntag = "G&C" and locrad = 600. Note that G&C means that the influence is reduced as a function of distance by tapering (Schur product); 600 km gives an effective radius of ~200 km.
  • You can also adjust the moderation parameters:
    • inflation is a way to artificially inflate your ensemble spread (multiplicative covariance inflation).
    • rfactor1 controls the slow assimilation start (it is handled automatically by the main script of the reanalysis).
    • rfactor2 is a way to artificially overestimate the observation error when updating the covariance matrix (only available with DEnKF).
    • kfactor is a pre-screening of the observations: if the error distributions of the ensemble and of the observation do not overlap, the observation error is inflated (up to a factor of 2) in order to pull the model gently towards the truth.
  • files
    • jmapfname is used to optimise the distribution of CPUs when updating the different fields (remove the line if you are not sure).
    • pointfname is used to dump all assimilation diagnostics at a specific point.
  • prmest
    • Here you can also estimate model output by state augmentation (you can estimate an evolving model bias as well).
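
To get a feel for the tapering used with locfuntag = "G&C", the sketch below evaluates the standard fifth-order Gaspari and Cohn function. It assumes that locrad is the distance at which the weight reaches zero, so the half-width passed to the function is locrad/2; how the EnKF code maps locrad internally is not stated in this manual. gaspari_cohn is a hypothetical helper.

```shell
# gaspari_cohn DIST HALFWIDTH: print the taper weight (between 0 and 1) for an
# observation at distance DIST; the weight is exactly 0 beyond 2*HALFWIDTH
gaspari_cohn() {
    LC_ALL=C awk -v d="$1" -v c="$2" 'BEGIN {
        r = d / c
        if (r < 0) r = -r
        if (r <= 1)
            w = -0.25*r^5 + 0.5*r^4 + 0.625*r^3 - (5/3)*r^2 + 1
        else if (r < 2)
            w = (1/12)*r^5 - 0.5*r^4 + 0.625*r^3 + (5/3)*r^2 - 5*r + 4 - 2/(3*r)
        else
            w = 0
        printf "%.4f\n", w
    }'
}

# locrad = 600 km, so the assumed half-width is 300 km
gaspari_cohn 0 300     # prints 1.0000
gaspari_cohn 200 300   # prints 0.5103
gaspari_cohn 600 300   # prints 0.0000
```

Under this assumption the weight has dropped to about one half at 200 km, which is in line with the effective radius quoted above. With locfuntag = "STEP" the weight would instead be 1 inside locrad and 0 outside.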

Running the system

The system must be run on a login node. It is started with the command:

${HOME}/NorESM/Script/main.sh

It is recommended to run this command inside screen [4]:

screen
${HOME}/NorESM/Script/main.sh
"ctrl+a+d" (to detach from the screen)
"screen -r" (to reattach it)

Note that there are 5 login nodes on hexagon. Your screen session is only available from the login node you launched it from. To change from one login node to another, simply type:

"ssh login1" (to access the first login node)
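
Because a screen session can only be reattached from the login node it was started on, it is convenient to record that node. The sketch below is one way to do this; remember_node, node_hint, the file name and the session name norcpm are all hypothetical choices, and it assumes that the name printed by uname -n can be used with ssh.

```shell
# remember_node FILE: record which login node this shell is running on
remember_node() {
    uname -n > "$1"
}

# node_hint FILE: print the ssh command that leads back to the recorded node
node_hint() {
    echo "ssh $(cat "$1")"
}

# Typical use on a login node (not executed here):
#   remember_node "${HOME}/.norcpm_node"
#   screen -S norcpm                    # named session; start main.sh inside it
#   ...later, from any login node...
#   node_hint "${HOME}/.norcpm_node"    # shows which node to ssh to
#   screen -r norcpm                    # reattach once on that node
```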

How to restart if it crashes

The system is quite robust, but some crashes may happen (power outage, fork issues, a machine so slow that the wall time is reached, ...). It helps if you can identify where the system stopped (the last line on your screen). You can then edit ${HOME}/NorESM/Script/personal_setting.sh and restart from the start_date and the place in the code you want (using SKIPASSIM and SKIPPROP).


If you are not sure what has happened or where it has crashed, the first step is to identify the last available restart file:

ls /work/${USER}/archive/${VERSION}01/rest/

You can then run the script:

${HOME}/NorESM/Script/mv_rst_from_archive_to_run.sh date_to_restart

Then you need to edit ${HOME}/NorESM/Script/personal_setting.sh:

adjust the start_date accordingly
set RFACTOR=1 (no need for a slow assimilation start)
skip the first assimilation with SKIPASSIM=0 (the restart files in the archive have already been assimilated)

Now you can relaunch the script main.sh in a screen.
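
The first step of this procedure can be scripted. The sketch below assumes that the entries under rest/ are named by date, so that sorting them alphabetically also sorts them chronologically, and that the entry name is what mv_rst_from_archive_to_run.sh expects as date_to_restart; neither is confirmed by this manual. latest_restart is a hypothetical helper.

```shell
# latest_restart DIR: print the name of the last entry in DIR in sorted order
latest_restart() {
    ls -1 "$1" | LC_ALL=C sort | tail -n 1
}

# Typical use on hexagon (not executed here):
#   last=$(latest_restart "/work/${USER}/archive/${VERSION}01/rest")
#   echo "last restart found: $last"
#   ${HOME}/NorESM/Script/mv_rst_from_archive_to_run.sh "$last"
```

It is safer to print the value and check it by eye before passing it to the script, since a crash can leave an incomplete restart directory behind.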

Customising output

Setting up atmospheric nudging

Initial run

Continuation of existing run

Running ensembles as single jobs

Getting started with EnKF

Post-processing and long-term storage

Compression

Archiving to NorStore

Disk storage

Tape storage

National archive

Diagnostics and analysis