NorCPM User Manual

Revision as of 14:40, 4 November 2014



Ingo Bethke
Francois Counillon
enter your name here (alphabetical order)

Overview

The Norwegian Climate Prediction Model (NorCPM) aims to provide predictions on seasonal-to-decadal time scales. It is based on the Norwegian Earth System Model (NorESM, [1]) and the Ensemble Kalman Filter (EnKF, [2]) data assimilation method. NorESM is a state-of-the-art Earth system model based on CESM ([3]), but it uses a different aerosol/chemistry scheme and a different ocean model (evolved from MICOM). The EnKF is a sequential data assimilation method that allows fully multivariate and flow-dependent corrections using the covariance matrix estimated from a Monte Carlo model integration.

Norwegian Earth System Model

The Norwegian Earth System Model (NorESM) is one of ~20 climate models that have produced output for CMIP5 (http://cmip-pcmdi.llnl.gov/cmip5). The NorESM family of models is based on the Community Climate System Model version 4 (CCSM4) of the University Corporation for Atmospheric Research, but differs from the latter by, in particular, an isopycnic coordinate ocean model and advanced chemistry-aerosol-cloud-radiation interaction schemes. The main version, NorESM1-M, has a horizontal resolution of approximately 2deg for the atmosphere and land components and 1deg for the ocean and ice components. NorESM is also available in a lower resolution version (NorESM1-L), a medium-low resolution version (NorESM1-ML), a high-top version with specified and full chemistry (NorESM1-MLHT and NorESM1-MLHTC) and a version that includes prognostic biogeochemical cycling (NorESM1-ME).

NorESM configurations in NorCPM
Model acronym | Ocean | Atmosphere | References
NorESM1-L | MICOM (3.6deg) | CAM4 (T31) | Zhang et al. 2012, Counillon et al. 2014
NorESM1-ML | MICOM (2deg) | CAM4 (2deg) |
NorESM1-MLHT | MICOM (2deg) | CAM4-WACCMSC (2deg) |
NorESM1-MLHTC | MICOM (2deg) | CAM4-WACCM (2deg) |
NorESM1-ME | MICOM (1deg) | CAM4-OSLO (2deg) | Tjiputra et al. 2013

Ensemble Kalman Filter

The EnKF is a sequential, ensemble-based data assimilation method that consists of two steps: a propagation and a correction. The propagation step is a Monte Carlo method. The ensemble spread (i.e. ensemble variability) is used to estimate the forecast error, because the two are expected to be related in locations (and at times) where (and when) the system is more chaotic. Assuming that the error distribution is Gaussian and the model is unbiased, one can proceed with the Bayesian update and find new estimates of the ensemble mean and model covariance. The method is often called flow dependent because the covariance matrix evolves with the system and thus provides corrections that are in agreement with the state of the system. The method allows fully multivariate updates, meaning that observations of, for example, SST can be used to correct all other model variables. However, one should bear in mind that the update assumes linearity, which is not suitable for all variables, and that correlations are subject to sampling error. Currently NorCPM uses the Deterministic Ensemble Kalman Filter (DEnKF, Sakov et al. 2008), which is a square-root filter version of the EnKF.
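
For orientation, the correction step described above can be written out as follows. This is a sketch in standard textbook notation, not taken from the NorCPM code: x^f and x^a denote the forecast and analysis ensemble means, A^f and A^a the corresponding ensemble anomalies (members minus the mean), N the ensemble size, d the observations, H the (linear) observation operator and R the observation error covariance. All of these symbols are introduced here for illustration.

```latex
\begin{align}
  \mathbf{P}^f &= \frac{1}{N-1}\,\mathbf{A}^f \mathbf{A}^{f\,T}
      && \text{(forecast covariance from the ensemble spread)} \\
  \mathbf{K}   &= \mathbf{P}^f \mathbf{H}^T
      \left(\mathbf{H}\mathbf{P}^f\mathbf{H}^T + \mathbf{R}\right)^{-1}
      && \text{(Kalman gain)} \\
  \mathbf{x}^a &= \mathbf{x}^f + \mathbf{K}\left(\mathbf{d} - \mathbf{H}\mathbf{x}^f\right)
      && \text{(update of the ensemble mean)} \\
  \mathbf{A}^a &= \mathbf{A}^f - \tfrac{1}{2}\,\mathbf{K}\mathbf{H}\mathbf{A}^f
      && \text{(DEnKF update of the anomalies)}
\end{align}
```

The last line is the form of the anomaly update given for the DEnKF by Sakov et al. 2008; because the anomalies are updated directly, no perturbation of the observations is needed.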

Getting started with NorESM

Prerequisites

User-support for NorCPM is currently limited to Norway.

Step 1: New users need to apply for access to computational and storage resources at the Norwegian Metacenter for Computational Science (link to application page: https://www.notur.no/user-account). NorCPM activities are usually tied to the cpu and storage accounts nn9039k and ns9039k, which are held by Noel Keenlyside (noel.keenlyside[at]gfi.uib.no). NorCPM is currently set up on the computational platform HEXAGON (https://www.notur.no/hardware/hexagon).

Step 2: After gaining access to HEXAGON, the user needs to contact the local support (support-uib@notur.no) to be added to the unix-groups "noresm" and "nn9039k".


Installing and compiling NorCPM

To install NorCPM on your account, follow these steps:

1) Install NorESM and link the necessary scripts:

cd ${HOME}
mkdir -p NorESM
cd NorESM

If you have NorESM svn access, run:

svn checkout https://svn.met.no/NorESM/noresm/tags/projectEPOCASA-3 projectEPOCASA-3

If you do not, run:

tar -xvf /work-common/shared/nn9039k/NorCPM/Code/NorESM/projectEPOCASA-3.tar.gz
mkdir -p Script

The following commands set up the default version of the scripts:

ln -s /work/shared/nn9039k/NorCPM/Script/* .
rm personal_setting.sh
cp /work/shared/nn9039k/NorCPM/Script/personal_setting.sh .
cd ${HOME}/NorESM/
mkdir -p bin
cd bin

The same applies to bin: to use the default version, link the files. If you want to build your own, copy and compile the code in /work-common/shared/nn9039k/NorCPM/Code/EnKF/, delete the link in bin and move your own executable there.

ln -sf /work/shared/nn9039k/NorCPM/bin/* .

2) Select a model version and experiment:

Edit ${HOME}/Script/personal_setting.sh to choose a model version, ensemble size, starting date, ...

Launch the creation of the ensemble structure:

cd ${HOME}/NorESM/Script
./create_ensemble.sh

Your ensemble structure is now created, the code compiled and the initial conditions copied. You are ready to start your reanalysis.

Model and directory structure

Shared files on hexagon are located here:

/work/shared/nn9039k/NorCPM/

The subfolders are:

-Code contains the source code of all Fortran programs needed (NorESM, EnKF, post-processing)
-Script contains all bash scripts necessary to run the reanalysis or prediction
-Restart contains the initial conditions (restart files) for two different configurations of NorESM on 1980-01-15
-Obs contains the observations that are available for assimilation (SST, SSH)
-bin contains the compiled executables from the Code subfolder
-Input contains input files for both NorESM and EnKF
-matlab contains code used for validation purposes

In your home folder ${HOME}/NorESM/ you have your personal bin, Script and projectEPOCASA-3 folders, which are copied or linked from /work/shared.

cases contains the specification of your ensemble of experiments. Each ensemble member has its own separate experiment, with duplication kept to a minimum.

Options available in NorCPM

Most settings are selected in the file ${HOME}/NorESM/personal_setting.sh

There are currently two versions of NorESM available: NorCPM_F19_tn21 and NorCPM_ME

-NorCPM_F19_tn21 is the default version: it has F19 resolution (about 2 degrees), uses CAM5 for the atmosphere and has about 2 degree resolution for the ocean on a tripolar grid.
-NorCPM_ME (or f19_g16) is the medium resolution version used for CMIP5: it uses CAM-OSLO, has F19 resolution for the atmosphere (about 2 degrees) and about 1 degree resolution for the ocean (bipolar grid).

The observation type currently available is OBSTYPE=SST with PRODUCER='HADISST2' (Reynolds is also possible).

You need to decide when your experiment will start and when it will finish (start_date and ENDYEAR).

You can choose how many members (ensemble size) you want to use. Beware that with too few members the EnKF will perform badly (don't even try with 1 or 2).

For NorCPM_F19_tn21 and NorCPM_ME there is already an initial ensemble for 1980-01-15; for other starting dates please see XXX
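
As a rough illustration of the settings listed above, an excerpt of personal_setting.sh might look like the sketch below. Only OBSTYPE, PRODUCER, start_date, ENDYEAR and RFACTOR are names taken from this manual; VERSION and ENSSIZE are hypothetical variable names used here for the model version and ensemble size, and the values and the date format are examples only. Check the actual file for the real names.

```shell
# Illustrative excerpt only, not the actual personal_setting.sh
VERSION='NorCPM_F19_tn21'   # hypothetical variable name: model version
ENSSIZE=30                  # hypothetical variable name: ensemble size (example value)
start_date='1980-01-15'     # start of the experiment (date format is an assumption)
ENDYEAR=2010                # last year of the experiment (example value)
OBSTYPE='SST'               # type of observation to assimilate
PRODUCER='HADISST2'         # observation product
RFACTOR=1                   # set to 8 to activate slow initialisation
```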


Cloning existing experiments

Configuring the initialisation

Several options are available. First, decide whether to use anomaly assimilation or full-field assimilation. This choice is made when compiling the EnKF code: define the flag ANOMALY before compilation in:

  • /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/MODEL.CPP
  • /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/Prep_Routines/MODEL.CPP

To reject observations near the coast, define the flag MASK_LANDNEIGHBOUR in:

  • /work/shared/nn9039k/NorCPM/Code/EnKF-MPI-TOPAZ/Prep_Routines/MODEL.CPP

You can also activate slow initialisation when doing a reanalysis by setting RFACTOR=8 in: ${HOME}/NorESM/personal_setting.sh

This overestimates the observation error by the factor RFACTOR (note that RFACTOR is reduced by a factor of 2 after each assimilation). NB: If you restart a reanalysis that has crashed, remember to change it to 1.
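
The slow initialisation can be pictured with the small sketch below. It assumes that RFACTOR is divided by 2 after each assimilation step and never drops below 1; rfactor_schedule is a hypothetical helper written for this illustration and is not part of the NorCPM scripts.

```shell
# Hypothetical helper: print the RFACTOR value used at each successive
# assimilation step, assuming it is halved after every step down to 1.
rfactor_schedule() {
    rfactor=$1
    schedule=""
    while [ "$rfactor" -gt 1 ]; do
        schedule="$schedule$rfactor "
        rfactor=$((rfactor / 2))
    done
    echo "${schedule}1"
}

rfactor_schedule 8   # prints: 8 4 2 1
```

Under this assumption, starting from RFACTOR=8 the observation error is back to its nominal value from the fourth assimilation onwards, which is why a restarted reanalysis should use RFACTOR=1.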

Many data assimilation parameters are set in: /work/shared/nn9039k/NorCPM/Input/EnKF/enkf.prm_mal

  • method of assimilation: you can choose between EnKF and DEnKF. The first is a stochastic filter and the second is a deterministic filter, i.e. it does not perturb the observations (see Sakov et al. 2008 for a more thorough description).
  • ensemble contains the ensemble size (TODO make that it matches automatically size of personal setting!!)
  • Localisation is a critical parameter of the assimilation. It is an ad hoc way to limit the impact of an observation as a function of its distance to the model grid point. A model grid point is only influenced by observations found within the localisation radius (locrad in the prm file, in km). The larger your localisation radius, the more degrees of freedom you have in your model and the more members you need to span them properly. The optimal localisation radius therefore depends on your ensemble size. In NorCPM we currently only consider single water column updates (i.e. locfuntag = "STEP" and locrad = 95). To see the impact of increasing the localisation radius, you can for example use locfuntag = "G&C" and locrad = 600. G&C means that the influence is reduced as a function of distance by tapering (Schur product); 600 km would give an effective radius of ~200 km.
  • You can also play with moderation:
    • inflation artificially inflates your ensemble spread (multiplicative covariance inflation)
    • rfactor1 controls the slow assimilation start (it is handled automatically by the main script of the reanalysis)
    • rfactor2 artificially overestimates the observation error when updating the covariance matrix (only available with DEnKF)
    • kfactor is a pre-screening of the observations. If the observation error and the likelihood of the ensemble and observation do not overlap, the observation error is inflated (up to a factor 2) in order to gently pull the model towards the truth.
  • files
    • jmapfname is used to optimise the distribution of CPUs when updating the different fields (remove the line if you are not sure)
    • pointfname is used to dump all assimilation diagnostics at a specific point.
  • prmest
  • Here you can also estimate model output by state augmentation (you can estimate an evolving model bias as well)
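
For reference, the distance tapering mentioned in the localisation item can be written out, assuming that "G&C" denotes the widely used fifth-order piecewise rational function of Gaspari and Cohn (1999). Here d is the distance between the observation and the model grid point, c is a length scale and r = d/c; the symbols are introduced for this illustration. How locrad maps to c in the NorCPM EnKF code is not stated in this manual, so reading locrad as the support radius (locrad = 2c) is an assumption.

```latex
\rho(r) =
\begin{cases}
  -\tfrac{1}{4}r^5 + \tfrac{1}{2}r^4 + \tfrac{5}{8}r^3 - \tfrac{5}{3}r^2 + 1,
      & 0 \le r \le 1, \\[4pt]
  \tfrac{1}{12}r^5 - \tfrac{1}{2}r^4 + \tfrac{5}{8}r^3 + \tfrac{5}{3}r^2 - 5r + 4 - \tfrac{2}{3r},
      & 1 < r \le 2, \\[4pt]
  0, & r > 2,
\end{cases}
\qquad r = \frac{d}{c}.
```

The function equals 1 at r = 0, about 0.21 at r = 1 and 0 at r = 2, so most of the weight sits well inside the support radius. This is consistent with the effective radius being much smaller than locrad.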


Customisation of output

Setting up atmospheric nudging

Running the model

Initial run

Continuation of existing run

Running ensembles as single jobs

Getting started with EnKF

Post-processing and long-term storage

Output compression

Archiving output on NorStore

Disk storage

Tape storage

National archive

Diagnostics and analysis