Category Archives: WRF

WRF: latitudes, ellipsoids and datums

Background

This post came about from thinking about how to properly project WRF output for visualisation using Cartopy. It is also relevant for anyone wondering how to properly interpolate to real-world locations if high accuracy is needed. A follow-up post will deal with using WRF and Cartopy.

Location, location, location

As with any earth-science visualisation, the most confusing part is understanding coordinate systems and projections. It should be simple – we have rectangular arrays representing variables and we wish to display them on a rectangular part of the screen. How does it get so complicated? When you set up a WRF simulation, you define a projection. That is, you specify how a part of the curved earth is going to be represented in rectangular arrays in memory and on disk. A good article on projections can be found here. An important point to note is that WRF and all other climate models assume a spherical earth, which makes their code much simpler.
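As a taster for that follow-up post: when describing a WRF Lambert Conformal domain to Cartopy, the spherical earth has to be stated explicitly, otherwise Cartopy quietly assumes WGS84. A minimal sketch (the projection parameters here are illustrative, not taken from a real namelist):

import cartopy.crs as ccrs

# WRF uses a sphere of radius 6 370 000 m; spell that out, since the
# default Cartopy globe is the WGS84 ellipsoid.
wrf_globe = ccrs.Globe(ellipse=None,
                       semimajor_axis=6370000.0,
                       semiminor_axis=6370000.0)

# Illustrative Lambert Conformal parameters; take the real values
# (truelat1, truelat2, ref_lat, stand_lon) from your namelist.wps.
wrf_crs = ccrs.LambertConformal(central_longitude=-1.5,
                                central_latitude=56.0,
                                standard_parallels=(49.5, 59.5),
                                globe=wrf_globe)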

Latitude is ambiguous

The latitude of a point on the surface of a sphere is unambiguously the angle from the equator to that point. When you start considering the Earth as an ellipsoid, or something lumpier …

  • sphere, one parameter e.g. WRF R=6 370 000m
  • ellipsoid, two parameters e.g. WGS84 a=6 378 137m, b=6 356 752.3142m
  • geoid, lumpy

… then it gives rise to some ambiguities when defining latitude, shown schematically below (image from here). According to Wikipedia, “The terminology for latitude must be made more precise by distinguishing:

[Figure: schematic of the three latitudes]

Spherical latitude: the angle between the normal to a spherical reference surface and the equatorial plane.

Geodetic latitude: the angle between the normal and the equatorial plane. The standard notation in English publications is φ. This is the definition assumed when the word latitude is used without qualification. The definition must be accompanied with a specification of the ellipsoid.

Geocentric latitude: the angle between the radius (from centre to the point on the surface) and the equatorial plane. There is no standard notation: examples from various texts include ψ, q, φ’, φc, φg.

Astronomical latitude: the angle between the equatorial plane and the true vertical (plumb line) at a point on the surface.

Geographic latitude must be used with care. Some authors use it as a synonym for geodetic latitude whilst others use it as an alternative to the astronomical latitude.”
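To put a number on the ambiguity: converting a geodetic latitude to a geocentric one (which, on a sphere, is the spherical latitude) is a one-liner, and the difference peaks at roughly 0.19°, or about 21 km on the ground, near 45°. A minimal Python sketch using the WGS84 parameters quoted above:

import math

# WGS84 semi-major and semi-minor axes, as quoted above
a = 6378137.0
b = 6356752.3142

def geocentric_latitude(geodetic_lat_deg):
    """Geodetic latitude (degrees, WGS84) to geocentric latitude (degrees)."""
    phi = math.radians(geodetic_lat_deg)
    return math.degrees(math.atan((b * b) / (a * a) * math.tan(phi)))

phi = 45.0
psi = geocentric_latitude(phi)
print(phi - psi)              # roughly 0.19 degrees
print((phi - psi) * 111.0)    # roughly 21 km of northward displacement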

WRF Preprocessing Georeferenced Data

An important but not well documented point is how georeferenced input data is handled in WRF. This includes terrain height, land-use data, and boundary conditions from GFS or other models. According to this post on using your own high-resolution data, “WPS considers the Earth to be a perfect sphere … default land-use data used by WPS are projected using the WGS_1984 coordinate system. Due to this reason, it is preferred to convert all the land data to the following projection GCS_WGS_1984”

From this, I infer:

  1. WPS treats all latitudes and longitudes as spherical, identifying points on a sphere.
  2. Static data such as land-use and terrain height are most likely georeferenced against the WGS84 ellipsoid.
  3. An implicit transformation is occurring in WPS, whereby the WGS84 ellipsoid is projected onto a spherical surface.

Something like the schematic below, with each P → P′:

[Figure: points P on the WGS84 ellipsoid projected onto points P′ on a sphere]

This is fine, as long as it is consistent between all input data. One concern is the treatment of GFS or other global model data. How are the observations handled in the GFS data assimilation system? In the best case, the same approach is used, and we can sleep easily at night. What if a different transformation is being applied to transform observations to the spherical system used in GFS?

In that case, when we overlay meteorological data from GFS over the static data from WPS, there could be a displacement. Does it matter? Perhaps not: with a decent spin-up, the physics of the model should ensure the atmosphere comes into balance with its physical surroundings. But what if we are continuously nudging the model with GFS data? Then it could be a problem.

The Web Mercator Projection

Interestingly, the “Web Mercator” projection widely used in web mapping uses a similar scheme. The Earth is assumed to be spherical, yet the coordinates of points are required to be relative to the WGS84 ellipsoid. Wikipedia claims that this makes it slightly non-conformal, so presumably the same applies to the LCC projection defined in WRF. How much effect this non-conformity has on the underlying equations in the model, I have no idea. I suspect it is insignificant compared to other sources of error.

Interpolating and reprojecting WRF output

There are two ways to interpolate WRF output to point locations:

  1. Use the latitude and longitude arrays as coordinates, and interpolate using them (a la the rcm2points function in NCL)
  2. Use the projection information to convert a latitude and longitude into a grid-point index (ideally a decimal index), and then interpolate from the surrounding grid points (a la wrf_ll_to_ij in NCL)

In both of those situations, we should get the correct result if we just take our geodetic WGS84 latitude and longitude and plug them in. However, if we want to re-project a whole WRF domain for the purpose of displaying it overlaid with other datasets, then we need to be a bit more cautious. There is a blog post here which deals with some of the same issues.
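If you are not using NCL, the second approach can be sketched with pyproj. Everything here is illustrative (a Lambert Conformal domain on the WRF sphere, with made-up grid spacing and dimensions, assumed to be centred on the projection origin), not a recipe for any particular namelist:

from pyproj import Proj

# A WRF-like Lambert Conformal projection on the WRF sphere (R = 6 370 000 m).
# The projection parameters are illustrative; take the real ones from namelist.wps.
lcc = Proj(proj="lcc", lat_1=49.5, lat_2=59.5, lat_0=56.0, lon_0=-1.5,
           a=6370000.0, b=6370000.0)

dx = dy = 4000.0      # grid spacing in metres
nx = ny = 100         # grid dimensions
# Assume the domain is centred on (lat_0, lon_0), which maps to x = y = 0,
# so the lower-left grid point sits at:
x0 = -(nx - 1) / 2.0 * dx
y0 = -(ny - 1) / 2.0 * dy

def ll_to_ij(lon, lat):
    """Longitude/latitude (treated as spherical) to decimal grid indices."""
    x, y = lcc(lon, lat)
    return (x - x0) / dx, (y - y0) / dy

print(ll_to_ij(-1.5, 56.0))   # the domain centre, approximately (49.5, 49.5)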

Conclusion

After a long-winded search, I’ve come to the conclusion that everything is probably OK. Observations are geolocated with a latitude relative to the WGS84 ellipsoid, and spherical models treat this as a spherical latitude. Most users do the same, and it probably all works.

Automated WRF tools released!

There are a number of tools which automate running WRF, including the WRF Portal and the WRF EMS. However, the problem with these tools is that they try to be too good. That is, they try to take all of the work away from you, expecting you to press go and wait for results. Which is great while it works, but inevitably something goes wrong, or you need to change something. What if you need to force in your own land-use data somewhere? Suppose you need to call ungrib.exe a few times? I found myself having to rely on my own scripts, and judging by a quick scout around the internet, plenty of other people do too.

So I began (I said began, not finished!) the tidying up of my scripts, to produce some kind of framework for running WRF which (a) automates all the steps, but (b) is very easy to modify and extend.

https://github.com/samwisehawkins/wrftools/tree/master

What it does:

  • Fetches boundary conditions (via the excellent gribmaster)
  • Does all file renaming and linking
  • Runs ungrib, geogrid, metgrid and real (including with multiple Vtables)
  • Runs WRF
  • Runs these either directly using mpirun, or by submission to a job queue
  • Compresses output files (as long as NCO is installed with NetCDF4 support)
  • Extracts time series at points and heights above ground level (using NCL)
  • Produces visualisations (using NCL)
  • Any amount of copying and archiving

What it does not do:

  • Fetch or compile WRF or any of the other tools (NCO, NCL, etc.)
  • Buy you lunch

The best thing is, it is written in Python. That’s right, atmospheric scientists: Python. Not Perl, not tcsh, not some other language from 1967 which you found behind the sofa. Which means it is easy to read, understand, document, and maintain. It is configured via a YAML file and/or command-line options, and can be found at the GitHub link above.

Displaying WRF data using OpenLayers

Introduction

So, you’re bored of well made, clear, beautiful but static plots which can show people in one glance what the weather is doing. You want exciting slippy maps that people can zoom around and go “oooooohh pretty!”. I’ve attempted this several times in the past, and each time been thoroughly beaten, as my lovely pressure field from WRF ends up as a single pixel somewhere off the coast of east Africa.

Well the good news is, there is a comparatively easy method of doing this, thanks to a bunch of libraries and programs which do the hard work. The even better news is that if you are already producing plots using NCL, then it just requires a couple of modifications to your scripts.

My downfall in the past was always re-projecting WRF data. Because WRF is a grid-based model, where the curved atmosphere is represented by rectangular arrays, your WRF data is already projected. You have hopefully chosen your projection to minimise distortion over your domain, but unless you have very specifically designed your WRF domain for web mapping, it won’t be the same system used by common web tools, and you will have to re-project your data before you can overlay it on a web map. There are a number of points at which you could do the re-projection:

  1. The raw netcdf files can be re-projected using a post-processing tool such as UPP
    • UPP is a bit fiddly, and returns grib files
  2. The arrays within the netcdf files can be re-projected using GDAL, NCL, or Python
    • handling the variables within the netCDF files is tricky with GDAL (see Reading netcdf files with GDAL and re-project files with GDAL for hints)
    • doing re-projection well in Python is a bit of a pain
    • NCL is probably the most adept at this, particularly with the Earth System Model regridding tools in latest versions
  3. The images produced by NCL or other tools can be made on a new projection
    • NCL or Basemap in Python can re-project at the time a plot is produced
    • The image needs to take up the whole frame, without extra stuff
  4. The images produced by NCL can be re-projected after the event using GDAL

I find Option 4 by far the easiest way, since the excellent VAPOR package lets you export properly georeferenced geotiffs directly from NCL.

Dependencies

  • NCL
  • VAPOR
  • GDAL
  • ImageMagick

Geotiffs from NCL and VAPOR

VAPOR includes the script $VAPOR_ROOT/share/examples/NCL/wrf2geotiff.ncl. See the examples here for full instructions.

I had to set the workstation type to oldps, not ps as stated in the docs, and I also had to output one frame per geotiff in order for the re-projection to work. The key modifications to an NCL script are:

load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/wrf/WRFUserARW.ncl"

; load the wrf2geotiff library from the VAPOR distribution:
load "$VAPOR_ROOT/share/examples/NCL/wrf2geotiff.ncl"

; do everything else as normal, e.g. get data, loop over time, etc. etc.
....

type      = "oldps"
wks       = gsn_open_wks(type,plot_name) ; I open and delete a workstation each timestep
wrf2gtiff = wrf2geotiff_open(wks)

; set your plot, map, contour options etc
mpres = True

; make the plots
contour_psl = wrf_contour(a,wks,slp,opts_psl)

; overlay on the map
plot = wrf_map_overlays(a,wks,(/contour_psl/),pltres,mpres)

; write into the geotiff object
wrf2geotiff_write(wrf2gtiff, a, times(it), wks, plot, True)

; advance frame between time steps after the wrf2geotiff_write
frame(wks)

; this triggers actual writing of all geotiff output frames
wrf2geotiff_close(wrf2gtiff,wks)

delete(wks)
delete(wrf2gtiff)

Reprojecting: the easy way

Assume we now have the geotiff precip.tiff. Those clever folks at VAPOR have correctly georeferenced it and set all the metadata. I’m using a Lambert Conformal Conic projection, and I can look at the projection info using gdalinfo.

$>gdalinfo precip.tiff

Size is 995, 659
Coordinate System is:
PROJCS["unnamed",
    GEOGCS["unknown",
        DATUM["unknown",
            SPHEROID["unretrievable - using WGS84",6378137,298.257223563]],
        PRIMEM["Greenwich",0],
        UNIT[,0.0174532925199433]],
    PROJECTION["Lambert_Conformal_Conic_2SP"],
    PARAMETER["standard_parallel_1",59.5],
    PARAMETER["standard_parallel_2",49.5],
    PARAMETER["latitude_of_origin",56],
    PARAMETER["central_meridian",-1.5],
    PARAMETER["false_easting",0],
    PARAMETER["false_northing",0],
    UNIT["unknown",1]]
Origin = (-1998307.075827066786587,1323201.405077444389462)
Pixel Size = (4020.745950011026252,-4021.886754136362924)
...

This is very useful info, even if you want to try a different approach, since sometimes it’s hard to work out how the WRF projection options translate into WKT or Proj4 strings.

The only concerning thing is that the spheroid is not set and defaults to the WGS84 spheroid, whereas WRF uses a sphere. Perhaps this will be fixed in a later version, or you could fix it yourself with GDAL by assigning a sphere-based datum. It could lead to features being displaced; see this paper. However, for coarse display purposes I’ll ignore it for now.
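If you did want to correct it, one possibility is to re-assign the SRS with a sphere-based projection string before warping. A sketch, assuming the GDAL Python bindings are installed (precip_sphere.tiff is just a made-up output name):

from osgeo import gdal

# Sphere-based LCC matching the gdalinfo output above, but on the WRF
# sphere (+R=6370000) instead of the defaulted WGS84 spheroid.
wrf_lcc = ("+proj=lcc +lat_1=59.5 +lat_2=49.5 +lat_0=56 +lon_0=-1.5 "
           "+x_0=0 +y_0=0 +R=6370000 +units=m +no_defs")

# Re-assign the SRS without touching the pixel data (the Python equivalent
# of gdal_translate -a_srs); the result can then be warped as shown below.
gdal.Translate("precip_sphere.tiff", "precip.tiff", outputSRS=wrf_lcc)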

The “Web Mercator” projection

The most commonly used projection in the web-mapping world (Google Maps, Bing, OSM) has become known as the Web Mercator. There is an excellent article about this projection. The short version is that it now has the code EPSG:3857. The downside is that many conversion routines don’t handle it properly, since it specifies points on the earth using the WGS84 spheroid, but projects them using a spherical earth.

It is easy to re-project the geotiff to the Web Mercator projection using gdalwarp:

$> gdalwarp -t_srs EPSG:3857 precip.tiff  precip_warped.tiff

Check it looks right, then also grab some info which we will need later:

$> gdalinfo precip_warped.tiff
Size is 1322, 780
Coordinate System is:
PROJCS["WGS 84 / Pseudo-Mercator",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433],
    AUTHORITY["EPSG","4326"]],
PROJECTION["Mercator_1SP"],
PARAMETER["central_meridian",0],
PARAMETER["scale_factor",1],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
UNIT["metre",1,
    AUTHORITY["EPSG","9001"]],
EXTENSION["PROJ4","+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext  +no_defs"],
AUTHORITY["EPSG","3857"]]
Origin = (-4708884.528225043788552,10384546.176803732290864)
Pixel Size = (6876.374362466389357,-6876.374362466389357)

Corner Coordinates:
Upper Left  (-1998307.076, 1323201.405) ( 42d18' 2.27"W, 62d31' 7.21"N)
Lower Left  (-1998307.076,-1327221.966) ( 25d12'38.84"W, 41d 3'55.88"N)
Upper Right ( 2002335.144, 1323201.405) ( 39d21'56.03"E, 62d29'56.26"N)
Lower Right ( 2002335.144,-1327221.966) ( 22d15'18.11"E, 41d 3'13.56"N)
Center      (    2014.034,   -2010.280) (  1d28' 3.44"W, 55d58'54.76"N)

Now I recode the no-data values as something which will become transparent. This may depend on what you are plotting, but for precipitation, we can safely convert no-data to 0.

$>gdal_translate -a_nodata 0 precip_warped.tiff precip_mask.tiff 

Then I convert to a PNG, with the nodata value set to transparent:

$> gdal_translate -of PNG -scale precip_mask.tiff precip.png   

Finally, I needed to add a hack. Although I tried to make my lowest contours (i.e. no rainfall) transparent, they came out white after all that conversion. So I also convert white to transparent using ImageMagick, so the rain looks better overlaid on a map:

$>convert precip.png -transparent white precip_final.png

I wrap up those commands in a bash script:

#!/bin/bash

for f in "$@"
do
    bname=$(basename "$f" .tiff)
    echo "$bname"
    gdalwarp -t_srs EPSG:3857 "$f" "$bname".tmp1.tiff
    gdal_translate -a_nodata 0 "$bname".tmp1.tiff "$bname".tmp2.tiff
    gdal_translate -of PNG -scale "$bname".tmp2.tiff "$bname".tmp3.png
    convert "$bname".tmp3.png -transparent white "$bname".png
    rm "$bname".tmp?.tiff "$bname".tmp?.png
done

Overlay in OpenLayers

Once you have the re-projected PNG, it is pretty straightforward to use it as an image overlay. The size and bounds can be taken from the output of gdalinfo above. I’ll do something fancier in a later post, once I have some nice animation controls, but the important JavaScript snippets look something like:

<script src="../OpenLayers/OpenLayers.js"></script>

  function init() {
    map = new OpenLayers.Map("mapCanvas");
    var osm            = new OpenLayers.Layer.OSM();

    var imgOpt = {
      isBaseLayer: false,
      maxResolution: "auto",
      resolutions: map.layers[0].resolutions,
      projection: map.getProjectionObject(),
      strategies: [new OpenLayers.Strategy.Fixed()],
      displayInLayerSwitcher: true,
      transparent:true, 
      format:"image/png", 
      alpha:true, 
      opacity:0.4 
    };
    var imgBounds =  new OpenLayers.Bounds(-4708884.528, 5020974.174, 4381682.379,10384546.177);
    var imgSize   = new OpenLayers.Size(1322, 780);
    imgUrl        = "precip_final.png";        
    image         = new OpenLayers.Layer.Image("precip", imgUrl, imgBounds, imgSize, imgOpt);

    // add the OSM base layer and the image overlay to the map
    map.addLayers([osm, image]);
};

MERRA data

“The Modern Era Retrospective-analysis for Research and Applications (MERRA) products are generated using Version 5.2.0 of the GEOS-5 DAS with the model and analysis each at 1/2×2/3 degrees resolution. Three-dimensional analyses are generated every 6 hours, and 3-dimensional diagnostics, describing the radiative and physical properties of the atmosphere, are 3-hourly. The product suite includes analyses on the native vertical grid as well on pressure surfaces. Two-dimensional data, including surface, fluxes, and vertical integrals, are produced hourly. The product suite includes monthly and monthly diurnal files. The MERRA production is being conducted in 3 separate streams, 1979 – 1989; 1989 – 1998; 1998 – present. Data are being uploaded to the MDISC after undergoing quality assurance in the GMAO.” (http://gmao.gsfc.nasa.gov/merra/news/merra-land_release.php)

A summary of MERRA can be found in the brochure (http://gmao.gsfc.nasa.gov/pubs/brochures/MERRA Brochure.pdf), and the best detailed information about MERRA is found in the Readme file. Data can be accessed from the Goddard Earth Sciences Data Centre.

“Hourly, three-hourly, and six hourly collections consists of daily files. For collections of monthly or seasonal means, each month or season is in a separate file.”

One attraction of MERRA is the hourly resolution of some variables. Only ‘2D’ variables are available at hourly resolution, but these ‘2D’ variables are available at different heights.

Underlying file names

File names consist of five dot-delimited nodes, runid.runtype.config.collection.timestamp. Where:

Node Description
run_id MERRASVv where S=stream number and Vv=Version number
runtype prod=standard product,
ovlp=overlapping product,
spnp=spin up product,
rosb=reduced observing system product,
cers=CERES observing system product
config assim = assimilation. Uses a combination of atmospheric data analysis and model forecasting to generate a time-series of global atmospheric quantities.
simul=simulation. Uses a free-running atmospheric model with some prescribed external forcing, such as sea-surface temperatures.
frcst=forecasts. Uses a free-running atmospheric model initialized from an analyzed state.
collection All MERRA data are organized into file collections that contain fields with common characteristics. Collection names are of the form freq_dims_group_HV, where
freq can be cnst=time-independent, instF=instantaneous, tavgF=time-average and F indicates the frequency or averaging interval and can be 1 = Hourly, 3 = 3-Hourly, 6 = 6-Hourly, M = Monthly mean, U = Monthly-Diurnal mean, 0 = Not Applicable.
dims=2D or 3D,
group is a lowercase (cryptic) mnemonic,
HV, where H (horizontal) can be N=native (2/3×1/2), C=reduced (1.25×1.25), F=reduced (1.25×1), and V (vertical) can be: x=horizontal only, p=pressure levels, v=model level centres, e=model level edges
timestamp yyyymmdd

There are two collection naming conventions in operation, described part-by-part in the table below; a short Python sketch for decoding the standard names follows the table:

Short Name Standard name
MCTFHVGGG
C: Configuration, where C is one of:
A = Assimilation
F = Forecast; 
S = Simulation;
ffffN_dd_ggg_Hv
ffff: Frequency Type, where ffff is one of:
inst = Instantaneous; 
tavg = Time average; 
const = Time independent
MCTFHVGGG
T: Time Description, where T is one of:
I = Instantaneous; 
T = Time Averaged with
1=1hourly,
3=3hourly,
6=6-hourly,
M=monthly; 
C = Time Independent
ffffN_dd_ggg_Hv
N: Frequency, where N is one of:
1 = 1-hourly; 
3 = 3-hourly; 
6 = 6-hourly; 
M = Monthly Mean; 
U = Monthly Diurnal Mean; 
0 = Not Applicable;
MCTFHVGGG
H: Horizontal Resolution, where H is one of:
N = Native (2/3 x 1/2 deg);
F = Reduced Resolution Version of Model Grid (1.25 x 1 deg);
C = Reduced Resolution (1.25 x 1.25 deg)
ffffN_dd_ggg_Hv,
ggg: Group, where ggg is one of:
ana = Direct analysis products;
asm = Assimilated state variables;
tdt = Tendencies of temperature;
udt = Tendencies of eastward and northward wind components;
qdt = Tendencies of specific humidity;
odt = Tendencies of ozone;
lnd = Land surface variables;
flx = Surface turbulent fluxes and related quantities;
mst = Moist processes;
cld = Clouds;
rad = Radiation;
trb = Turbulence;
slv = Single level;
int = Vertical integrals;
chm = Chemistry forcing
MCTFHVGGG
V: Vertical Location, where V is one of:
X = Two-dimensional;
P = Pressure;
V = Model Layer Center;
E = Model Layer Edge
ffffN_dd_ggg_Hv
H: Horizontal Resolution, where H is one of:
N = Native (2/3 x 1/2 deg);
F = Reduced Resolution Version of Model Grid (1.25 x 1 deg);
C = Reduced Resolution (1.25 x 1.25 deg)
MCTFHVGGG
GGG: Group, where GGG is one of:
ANA = Direct analysis products;
ASM = Assimilated state variables;
TDT = Tendencies of temperature;
UDT = Tendencies of eastward and northward wind components;
QDT = Tendencies of specific humidity;
ODT = Tendencies of ozone;
LND = Land surface variables;
FLX = Surface turbulent fluxes and related quantities;
OCN = Ocean quantities;
MST = Moist processes;
CLD = Clouds;
RAD = Radiation;
TRB = Turbulence;
SLV = Single level;
INT = Vertical integrals;
CHM = Chemistry forcing
ffffN_dd_ggg_Hv
v: Vertical Location, where v is one of:
x = Two-dimensional;
p = Pressure;
v = Model Layer Center;
e = Model Layer Edge
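Decoding these collection names by hand gets tedious, so here is a minimal Python sketch of a decoder for the standard-name form ffffN_dd_ggg_Hv. The lookup tables only cover the codes listed above, and the group codes are left un-expanded:

FREQ = {"1": "1-hourly", "3": "3-hourly", "6": "6-hourly",
        "M": "monthly mean", "U": "monthly diurnal mean", "0": "not applicable"}
HORIZ = {"N": "native (2/3 x 1/2 deg)", "F": "reduced (1.25 x 1 deg)",
         "C": "reduced (1.25 x 1.25 deg)"}
VERT = {"x": "two-dimensional", "p": "pressure levels",
        "v": "model layer centres", "e": "model layer edges"}

def decode_collection(name):
    """Decode a standard collection name, e.g. 'tavg1_2d_slv_Nx'."""
    freq_part, dims, group, hv = name.split("_")
    if freq_part.startswith("const"):
        ftype, n = "const", "0"
    else:
        ftype, n = freq_part[:-1], freq_part[-1]
    return {"frequency type": ftype,       # inst / tavg / const
            "frequency": FREQ[n],
            "dims": dims,
            "group": group,                # e.g. slv = single level
            "horizontal": HORIZ[hv[0]],
            "vertical": VERT[hv[1]]}

print(decode_collection("tavg1_2d_slv_Nx"))
print(decode_collection("inst6_3d_ana_Np"))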

Useful collections

Collection Description
MAT1NXSLV
tavg1_2d_slv_Nx
2D IAU Diagnostic, Single Level Meteorology, Time Average 1-hourly, on 2/3×1/2 grid
MAT1NXFLX
tavg1_2d_flx_Nx
2D Surface Fluxes, Single Level Meteorology, Time Average 1-hourly, on 2/3×1/2 grid
MAI6NPANA
inst6_3d_ana_Np
3-Dimensional Instantaneous 6-hourly, on pressure levels, at native resolution
MAI6NVANA
inst6_3d_ana_Nv
3D Meteorology Instantaneous 6-hourly analyzed fields on model layer center on 2/3×1/2 grid
MAIMNPANA
instM_3d_ana_Np
3D analyzed state, meteorology, instantaneous (monthly), on pressure levels, at native resolution
MAIUNPANA
(instU_3d_ana_Np)
MERRA 3D analyzed state, meteorology, instantaneous (diurnal), on pressure levels, at native resolution

A particularly useful dataset (for me) is MAT1NXSLV. Using OpenDAP (more later) to examine the files shows this structure:

http://goldsmr2.sci.gsfc.nasa.gov/opendap/MERRA/MAT1NXSLV.5.2.0/1979/01/MERRA100.prod.assim.tavg1_2d_slv_Nx.19790101.hdf
<type 'netCDF4.Dataset'>
root group (NETCDF3_CLASSIC file format):
HDFEOSVersion: HDFEOS_V2.14
missing_value: 1e+15
Conventions: CF-1.0
title: MERRA reanalysis. GEOS-5.2.0
history: File written by CFIO
institution: Global Modeling and Assimilation Office, NASA Goddard Space Flight Center, Greenbelt, MD 20771
source: Global Modeling and Assimilation Office. GEOSops_5_2_0
references: http://gmao.gsfc.nasa.gov/research/merra/
comment: GEOS-5.2.0
contact: http://gmao.gsfc.nasa.gov/
dimensions: TIME, XDim, YDim
variables: SLP, PS, U850, U500, U250, V850, V500, V250, T850, T500, T250, Q850, Q500, Q250, H1000, H850, H500, H250, OMEGA500, U10M, U2M, U50M, V10M, V2M, V50M, T10M, T2M, QV10M, QV2M, TS,     DISPH, TROPPV, TROPPT, TROPPB, TROPT, TROPQ, CLDPRS, CLDTMP, XDim, YDim, TIME, XDim_EOS, YDim_EOS, Time

Where the XDim=540 (i.e. 2/3 degree), YDim=361 (i.e. 1/2 degree), and the long names of the variables are:

Variable name long name
SLP Sea level pressure
PS Time averaged surface pressure
U850 Eastward wind at 850 hPa
U500 Eastward wind at 500 hPa
U250 Eastward wind at 250 hPa
V850 Northward wind at 850 hPa
V500 Northward wind at 500 hPa
V250 Northward wind at 250 hPa
T850 Temperature at 850 hPa
T500 Temperature at 500 hPa
T250 Temperature at 250 hPa
Q850 Specific humidity at 850 hPa
Q500 Specific humidity at 500 hPa
Q250 Specific humidity at 250 hPa
H1000 Height at 1000 hPa
H850 Height at 850 hPa
H500 Height at 500 hPa
H250 Height at 250 hPa
OMEGA500 Vertical pressure velocity at 500 hPa
U10M Eastward wind at 10 m above displacement height
U2M Eastward wind at 2 m above the displacement height
U50M Eastward wind at 50 m above surface
V10M Northward wind at 10 m above the displacement height
V2M Northward wind at 2 m above the displacement height
V50M Northward wind at 50 m above surface
T10M Temperature at 10 m above the displacement height
T2M Temperature at 2 m above the displacement height
QV10M Specific humidity at 10 m above the displacement height
QV2M Specific humidity at 2 m above the displacement height
TS Surface skin temperature
DISPH Displacement height
TROPPV PV based tropopause pressure
TROPPT T based tropopause pressure
TROPPB Blended tropopause pressure
TROPT Tropopause temperature
TROPQ Tropopause specific humidity
CLDPRS Cloud-top pressure
CLDTMP Cloud-top temperature
XDim longitude
YDim latitude
TIME time
XDim_EOS XDim
YDim_EOS YDim
Time Time

Getting the data

There are at least three different ways of getting the data:

MIRADOR allows http or ftp access to the full global hdf4 files

Simple Subset Wizard Web front-end to an OpenDAP server. Use the web interface to select the collection (using the short name, e.g. MAI6NPANAY), and define a bounding box. The web interface then constructs an OpenDAP request e.g. http://goldsmr3.sci.gsfc.nasa.gov/opendap/MERRA/MAI6NPANA.5.2.0/2010/01/MERRA300.prod.assim.inst6_3d_ana_Np.20100101.hdf.nc?T[0:3][0:41][226:337][188:344],U[0:3][0:41][226:337][188:344],H[0:3][0:41][226:337][188:344],V[0:3][0:41][226:337][188:344],O3[0:3][0:41][226:337][188:344],SLP[0:3][226:337][188:344],QV[0:3][0:41][226:337][188:344],PS[0:3][226:337][188:344],XDim[188:344],TIME[0:3],Height,YDim[226:337]

Use OpenDAP directly. There are some instructions here about using OpenDAP via GRADS to download MERRA data. Information gleaned from that:

  • There are three OpenDAP servers:
    • http://goldsmr1.sci.gsfc.nasa.gov/dods/ Meteorological fields for Chemical Transport Modeling
    • http://goldsmr2.sci.gsfc.nasa.gov/dods/ Two-dimensional fields
    • http://goldsmr3.sci.gsfc.nasa.gov/dods/ Three-dimensional fields

On the servers, the files are grouped by year and month, e.g.

http://goldsmr3.sci.gsfc.nasa.gov/opendap/MERRA/MAI6NPANA.5.2.0/2010/01/

Then you need to know the long name of the underlying hdf file, e.g.

http://goldsmr3.sci.gsfc.nasa.gov/opendap/MERRA/MAI6NPANA.5.2.0/2010/01/MERRA300.prod.assim.inst6_3d_ana_Np.20100101.hdf.nc

Note that the stream number comes into play here, i.e. the first digit after the final ‘MERRA’ in the file name (a small URL-building sketch follows this list):

  • Stream 1: 1979 – 1989;
  • Stream 2: 1989 – 1998;
  • Stream 3: 1998 – present.
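Putting the file-name convention and the stream numbers together, the OpenDAP URL for a given day can be built automatically. A sketch for the MAI6NPANA collection (the version suffix is assumed to be 00, and the exact cross-over days between streams may need checking against MDISC):

from datetime import date

def merra_stream(d):
    """Stream number from the date ranges listed above (boundaries approximate)."""
    if d < date(1989, 1, 1):
        return 1
    elif d < date(1998, 1, 1):
        return 2
    return 3

def mai6npana_url(d):
    base = "http://goldsmr3.sci.gsfc.nasa.gov/opendap/MERRA/MAI6NPANA.5.2.0"
    fname = "MERRA{0}00.prod.assim.inst6_3d_ana_Np.{1}.hdf".format(
        merra_stream(d), d.strftime("%Y%m%d"))
    return "{0}/{1:04d}/{2:02d}/{3}".format(base, d.year, d.month, fname)

print(mai6npana_url(date(2010, 1, 1)))
# .../MAI6NPANA.5.2.0/2010/01/MERRA300.prod.assim.inst6_3d_ana_Np.20100101.hdf
# (append .nc to get the netCDF form used in the OpenDAP request above)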

If you have OpenDAP-enabled NetCDF Operators (NCO), you can use ncks:

ncks -v <variables> -d XDim,<xmin>,<xmax> -d YDim,<ymin>,<ymax> http://goldsmr2.sci.gsfc.nasa.gov/opendap/MERRA/MAT1NXSLV.5.2.0/<year>/<month>/MERRA<stream_number><version>.prod.assim.tavg1_2d_slv_Nx.<year><month><day>.hdf  <out>.nc

Concrete example:

ncks -v SLP,U850,U500,V850,V500,T850,T500,T250,Q850,Q500,H1000,H850,H500,OMEGA500,U10M,U2M,U50M,V10M,V2M,V50M,T10M,T2M,QV10M,QV2M,TS,DISPH,XDim,YDim,TIME,XDim_EOS,YDim_EOS,Time -d XDim,225,315 -d YDim,262,322 http://goldsmr2.sci.gsfc.nasa.gov/opendap/MERRA/MAT1NXSLV.5.2.0/2010/01/MERRA300.prod.assim.tavg1_2d_slv_Nx.20100101.hdf $HOME/data/reanalysis/MERRA/MERRA300.prod.assim.tavg1_2d_slv_Nx.20100101.nc
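If you prefer Python, the same subset can be pulled through OpenDAP with the netCDF4 module. A sketch, assuming your netCDF library was built with OPeNDAP support and that the variables are dimensioned (TIME, YDim, XDim); it reuses the URL and index ranges from the ncks example above:

from netCDF4 import Dataset

url = ("http://goldsmr2.sci.gsfc.nasa.gov/opendap/MERRA/MAT1NXSLV.5.2.0/"
       "2010/01/MERRA300.prod.assim.tavg1_2d_slv_Nx.20100101.hdf")
ds = Dataset(url)

# Only the requested hyperslab crosses the network; the slices match
# -d YDim,262,322 -d XDim,225,315 in the ncks command above.
slp = ds.variables["SLP"][:, 262:323, 225:316]
lats = ds.variables["YDim"][262:323]
lons = ds.variables["XDim"][225:316]

print(slp.shape)   # (24, 61, 91): hourly values for one day
ds.close()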

WRF with CFSR Data – Part One

I’m gradually piecing together information about using the CFSR dataset to drive WRF. I’ll split this post into two halves. The first deals with fetching CFSR data in an automated way from the RDA server at UCAR, the second with actually running WRF.

ds093.0 is the original CFSR dataset, and covers from 1979 to March 2011.

ds094.0 covers from 2011-01-01 up to the present.

The first thing to note is that surface variables are not available at the analysis time, so the WRF group recommend using the 6-hour forecast data. By this it seems they mean you should use the 6-hour forecast for all fields, not just surface fields. I guess this makes running ungrib much simpler, and the forecast error in a 6-hour forecast should be small anyway. According to http://drought.geo.msu.edu/data/CFSR4WRF/:

“The 6-hourly CFSR data subsetting interface now includes support for WRF variable tables. By choosing one of these presets, the parameters and levels required by the WRF Preprocessing System (WPS) will automatically be selected. Users will still need to select the product type and grid resolution. To access the subsetting interface, go to http://rda.ucar.edu/datasets/ds093.0/ and click the “Data Access” tab. Then click “Internet Download”, and then the icon in the “Request a Subset…” column.

Because not all parameters/levels are available as analyses from CFSR, the WRF group recommends that users download the 6-hour Forecasts, which is selectable from the “Type of Product:” menu. The high resolution data are on the 0.5-degree grid for the pressure level parameters and on the 0.3-degree grid for the surface parameters. The low resolution data are on the 2.5-degree grid for the pressure level parameters and on the 1.875-degree grid for the surface parameters. These are selectable from the “Grid:” menu.

It has come to our attention that some users are under the assumption that the “pgbh00” and “flxf00” files from NCDC’s NOMADS server are analyses, and that they are using them as inputs to WRF. The fields in those files are NOT analyses; they are output from the first model timestep and they should not be used at all. NCEP asked in January 2010 that those files not be distributed, and they confirmed this in August 2011.

For questions about CFSR data at NCAR or about the subsetting interface, please contact Bob Dattore at dattore@ucar.edu.”

Authentication

Whenever you interact with the RDA server, you must be authenticated. You can do this by sending a login request and saving a cookie with either of these commands:

curl -o /dev/null -k -s -c <cookie_file_name> -d "email=<email>&passwd=<passwd>&action=login" https://rda.ucar.edu/cgi-bin/login 
wget --save-cookies <cookie_file_name> --post-data="email=<email>&passwd=<passwd>&action=login" https://rda.ucar.edu/cgi-bin/login

Then pass the authentication cookie in all future requests using either:

curl -b <cookie_file_name>
wget --load-cookies <cookie_file_name>
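The same log-in-and-reuse-the-cookie dance can be scripted in Python with the requests library. A minimal sketch, in which the Session object plays the role of the cookie file:

import requests

# Log in once; the Session keeps the authentication cookie for later calls.
session = requests.Session()
r = session.post("https://rda.ucar.edu/cgi-bin/login",
                 data={"email": "you@example.com",   # your RDA account details
                       "passwd": "your_password",
                       "action": "login"})
r.raise_for_status()

# Any further request made through the same session carries the cookie,
# e.g. verifying a subset request (see <dataset_description> below):
# session.post("http://rda.ucar.edu/php/dsrqst-test.php", data=dataset_description)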

Getting data

There are two steps to automatically fetching data:

  1. Requesting a subset
  2. Downloading the data

A data subset can be requested in two ways: either through the web interface (via the Data Access tab), or by constructing the correct HTTP string to post. The web interface is self-explanatory, so I will deal with the automated way, described here.

A subset request can be verified by posting a correctly formed string to http://rda.ucar.edu/php/dsrqst-test.php using either of these commands:

curl -b <cookie_file_name> -d "<dataset_description>" http://rda.ucar.edu/php/dsrqst-test.php
wget --load-cookies <cookie_file_name> --post-data "<dataset_description>" http://rda.ucar.edu/php/dsrqst-test.php

Then the subset request can be queued using either of the commands:

curl -b <cookie_file_name> -d "<dataset_description>" http://rda.ucar.edu/php/dsrqst.php
wget --load-cookies <cookie_file_name> --post-data "<dataset_description>" http://rda.ucar.edu/php/dsrqst.php

Where the <dataset_description> is one of the WRF presets described below. To subset data geographically, add all of the following options to the <dataset_description>. However, WPS may not cope with the subsets if they cross longitude 0:

nlat=NN;  
slat=SS; 
wlon=WWW; 
elon=EEE; 

When requesting subsets, some WRF presets are available. The three presets are:

WRF Model Input VTable.CFSR

  • 0.5×0.5 degree grid
  • 6-hour forecast
  • Levels:
    • mean sea level,
    • 2m,
    • 1,2,3,5,7,10,20,30,50,70,100,125,150,175,200,225,250,300,350,400,450,500,550,600,650,700,750,775,800,825,850,875,900,925,950,975,1000mb:
  • Variables
    • Geopotential height
    • Pressure reduced to MSL
    • Relative humidity
    • Temperature
    • U and V

Which corresponds to this <dataset_description> for ds093.0:

"dsid=ds093.0&rtype=S&rinfo=dsnum=093.0;
startdate=YYYY-MM-DD HH:MM;
enddate=YYYY-MM-DD HH:MM;
parameters=3%217-0.2-1:0.0.0,3%217-0.2-1:0.1.1,3%217-0.2-1:0.2.2,3%217-0.2-1:0.2.3,3%217-0.2-1:0.3.1,3%217-0.2-1:0.3.5;
level=76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,221,361,362,363,557,562,563,574,577,581,913,914,219;
product=3;
grid_definition=57;
ststep=yes"

And this description for ds094.0:

"dsid=ds094.0&rtype=S&rinfo=dsnum=094.0;
startdate=YYYY-MM-DD HH:MM;
enddate=YYYY-MM-DD HH:MM;
parameters=3%217-0.2-1:0.0.0,3%217-0.2-1:0.1.1,3%217-0.2-1:0.2.2,3%217-0.2-1:0.2.3,3%217-0.2-1:0.3.1,3%217-0.2-1:0.3.5;
level=76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,221,361,362,363,557,562,563,574,577,581,913,914,219;
product=3;
grid_definition=57;
ststep=yes"

WRF Model Input VTable.CFSR Surface

  • 0.313×0.313 degrees.
  • 6-hour forecast.
  • Levels:
    • ground or water surface,
    • layer between depth 0.4-0.1m,
    • layer between depth 0.1-0m,
    • layer between depth 2-1m,
    • layer between depth 1-0.4m,
    • height above ground 2m,
    • height above ground 10m:
  • Variables
    • Pressure
    • Specific humidity
    • Temperature
    • U and V
    • Ice cover
    • Land cover
    • Volumetric soil water
    • Water equivalent of accumulated snow depth
  • Output files:
    • 201001010000.flxf06.gdas.20091226-20091231.grb2
    • Date prefix is valid time, date suffix is a relic of which .tar file they came from

Which corresponds to this <dataset_description> for ds093.0:

"dsid=ds093.0&rtype=S&rinfo=dsnum=093.0;
startdate=YYYY-MM-DD HH:MM;
enddate=YYYY-MM-DD HH:MM;
parameters=3%217-0.2-1:2.0.192,3%217-0.2-1:0.3.5,3%217-0.2-1:0.2.2,3%217-0.2-1:0.2.3,3%217-0.2-1:0.1.0,3%217-0.2-1:0.1.13,3%217-0.2-1:2.0.0,3%217-0.2-1:10.2.0,3%217-0.2-1:0.3.0,3%217-0.2-1:0.0.0;
level=521,522,523,524,107,223,221;
product=3;
grid_definition=62;
ststep=yes"

And this <dataset_description> for ds094.0:

dsid=ds094.0&rtype=S&rinfo=dsnum=094.0;
startdate=YYYY-MM-DD HH:MM;
enddate=YYYY-MM-DD HH:MM;
parameters=3%217-0.2-1:0.0.0,3%217-0.2-1:0.1.0,3%217-0.2-1:0.1.13,3%217-0.2-1:0.2.2,3%217-0.2-1:0.2.3,3%217-0.2-1:0.3.0,3%217-0.2-1:0.3.5,3%217-0.2-1:10.2.0,3%217-0.2-1:2.0.0,3%217-0.2-1:2.0.192;
level=107,221,521,522,523,524,223;
product=3;
grid_definition=68;
ststep=yes

WRF Model Input VTable.CFSR SST

  • 0.313×0.313 degree grid
  • 6-hour forecast,
  • Levels:
    • Surface
  • Variables:
    • Temperature
  • Output files:
    • 201001010000.flxf06.gdas.20091226-20091231.grb2

Which corresponds to this <dataset_description> for ds093.0:

"dsid=ds093.0&rtype=S&rinfo=dsnum=093.0;
startdate=YYYY-MM-DD HH:MM;
enddate=YYYY-MM-DD HH:MM;
parameters=3%217-0.2-1:0.0.0;
level=107;
product=3;
grid_definition=62;
ststep=yes"

And this for ds094.0:

dsid=ds094.0&rtype=S&rinfo=dsnum=094.0;
startdate=YYYY-MM-DD HH:MM;
enddate=YYYY-MM-DD HH:MM;
parameters=3%217-0.2-1:0.0.0;
level=107;
product=3;
grid_definition=68;
ststep=yes

If a subset request is successful, the return should be an HTML string containing a description of the dataset, which will include, among other things:

Request Summary:
Index    : <numeric_index>
ID       : <request_id>
Category : Subset Data
Status   : Queue
Dataset  : ds093.0
Title    : NCEP Climate Forecast System Reanalysis (CFSR) 6-hourly Products, January 1979 to December 2010
User     : <user>
Email    : <email>
Date     : 2013-08-27
Time     : 10:11:22

Once you submit a subset request to UCAR, the data is queued to be subset on the server side. You can check the status of the request by doing:

curl -b <cookie_file> http://rda.ucar.edu/#ckrqst

But this returns a webpage which uses JavaScript to populate a table of requests and statuses, so you can’t parse the returned text to check the status of a job. You will, however, receive an email when the request has been processed, and the data will be put into a subdirectory:

http://rda.ucar.edu/#dsrqst/<request_id>

This directory will contain all the files you requested, plus curl- and wget-based scripts to download them, called:

curl.<request_id>.csh
wget.<request_id>.csh

These scripts can themselves be downloaded automatically using either:

curl -b <cookie_file_name> http://rda.ucar.edu/#dsrqst/<request_id>/curl.<request_id>.csh
wget --load-cookies <cookie_file_name> http://rda.ucar.edu/#dsrqst/<request_id>/wget.<request_id>.csh

Then, finally, once you have these scripts, you can run them to fetch the actual grib files themselves.

./curl.<request_id>.csh <passwd>
./wget.<request_id>.csh <passwd>

The underlying grib files are either called flx (surface variables), or pgbl (pressure and soil levels). Since one timestep per file is requested, they are prefixed with the valid time. There is also a time-based suffix which relates to how they were stored on disk, but you can ignore that.

In a future post, I will build a simple Python program to automate the full procedure, and then deal with running WRF using the data.

WRF dependencies

Introduction

There are a lot of good instructions for compiling WRF out there, but one thing which is often glossed over is the dependencies. The instructions might say “WRF needs NetCDF”. So you go hunting down NetCDF, and, because you don’t have root access, you try to compile it from source. But hang on, do you need it with HDF5 support? That will require HDF5, which in turn will require gzip, possibly szip. OpenDAP support? zlib and curl. Do they need to be compiled statically? Shared? What’s the difference anyway? Why can’t I just download a single installation file and go and have a cup of tea? Why am I wasting my life trying to find out how to set LD_LIBRARY_PATH to point to the right thing when I don’t even know what a linker is?

Dependencies in installation order:

zlib

This is most likely already installed. Use locate zlib.h to find out where it is on your system.

szip

Get the szip source code. If you have root access, compile it into somewhere central like /usr. If you don't have root access, compile it into $HOME/usr:

tar -xvzf szip-2.1.tar.gz
cd szip-2.1
./configure --prefix=/where/to/install
make
make check
make install

Jasper

Get the source code. If you have root access, compile into /usr; if not, compile into $HOME/usr. It should compile fairly easily:

./configure --prefix=/usr
make
make install

HDF5

I had no success with the pre-built binaries, so get the HDF5 source code. If you’re installing on a cluster and have root access, I would strongly recommend installing it as a module. Create a directory for the module and install into there. If you don’t have root access you can install into your $HOME directory; I usually install into $HOME/usr.

mkdir /cm/shared/apps/hdf5/1.8.11
tar -xvzf hdf5-1.8.11.tar.gz 
cd hdf5-1.8.11 
./configure --prefix=/cm/shared/apps/hdf5/1.8.11 --enable-fortran --enable-cxx 
make check 
make install

Then make a module file which sets the HDF5 variable to point to the base of the installation i.e.

/cm/shared/modulefiles/hdf5/1.8.11

#%Module -*- tcl -*-
##
## modulefile
##
proc ModulesHelp { } {

puts stderr "\tAdds HDF5 to your environment variables,"
}

module-whatis "adds HDF5 to your environment variables"

set              version              1.8.11
set              root                 /cm/shared/apps/hdf5/$version
append-path      PATH                 $root/bin
setenv           HDF5                 $root
setenv           HDF5DIR              $root
setenv           HDF5INCLUDE          $root/include
setenv           HDF5LIB              $root/lib

NetCDF

Relax. Take a few deep breaths. Get the source code. Make sure you get the C interface, the Fortran interface and the C++ interface; this will give you full functionality with NCO. Build with NetCDF4 support; you will be glad of it later when you can compress your wrfout files with a single command. I made use of some hints found online.

tar -xvzf netcdf-4.3.0.tar.gz
cd netcdf-4.3.0
export CPPFLAGS=-I${HDF5}/include
export LDFLAGS=-L${HDF5}/lib
export LD_LIBRARY_PATH=${HDF5}/lib

./configure --prefix=/cm/shared/apps/netcdf/gcc/64/4.3.0/ --enable-netcdf4
make
make check
make install
export NETCDF=/cm/shared/apps/netcdf/gcc/64/4.3.0

Now compile the Fortran and C++ netcdf interfaces:

export CPPFLAGS="-I${HDF5}/include -I${NETCDF}/include" 
export LDFLAGS="-L${HDF5}/lib -L${NETCDF}/lib"
export LD_LIBRARY_PATH="${HDF5}/lib:${NETCDF}/lib" 
./configure --prefix=$NETCDF 
make 
make check 
make install

Create a modulefile similar to the HDF5 example, which sets the NETCDF environment vars.

Udunits

Not a dependency of WRF, but useful for NCO. Fairly straightforward to build; code here. You can also make this a module; I’ll assume from now on that the UDUNITS variable is set to the base of the installation.

NCO

Not a dependency for WRF, but a very useful tool set. NCO has its own nice little dependency tree! The NCO source code is at http://nco.sourceforge.net, but there is help here:

  • ANTLR 2.7.x (not version 3.x!) (required for ncap2 – you can get away without this)
  • GSL (desirable for ncap2)
  • netCDF (absolutely required)
  • OPeNDAP (enables network transparency)
  • UDUnits (allows dimensional unit conversion)

Then build with this:

export NETCDF4_ROOT=$NETCDF 
export CPPFLAGS="-DHAVE_NETCDF4_H -I${NETCDF}/include -I${UDUNITS}/include" 
export LDFLAGS="-L${NETCDF}/lib -L${UDUNITS}/lib" 
./configure --prefix=/cm/shared/apps/nco/gcc/4.3.2 --enable-netcdf4

Or (single command):

NETCDF4_ROOT=$NETCDF \
CPPFLAGS="-DHAVE_NETCDF4_H -I${NETCDF}/include -I${UDUNITS}/include -I${GSL}/include" \
LDFLAGS="-L${NETCDF}/lib -L${UDUNITS}/lib -L${GSL}/lib" \
./configure --prefix=/cm/shared/apps/nco/gcc/4.3.2 --enable-netcdf4

NCL

Not a dependency, but essential if you want nice pretty pictures. Thankfully, the pre-built binaries work very well. You can still install them as a module: just create a directory in the module structure and untar the binaries into that. Then make a module file, and make sure it sets the NCARG_ROOT variable, as well as appending to PATH.

Getting ECMWF data

So, you’ve heard ‘ECMWF is far better than NCEP’, and you want a piece of some ECMWF data action. Where do you get it? How do you find it? Here’s a list to help you (me) out.

There are multiple routes to ECMWF data heaven. The most flexible and powerful, but most complicated, is MARS. To use MARS fully you need the client, a piece of open-source software covered by a GPL licence but which costs £100. It is used to communicate with the data servers at ECMWF, and can subset data on the server side to reduce the volume of data that needs to be transferred. A quicker way to MARS is via Web MARS.

Some links on MARS
www.ecmwf.int/services/computing/training/material/mars.pdf
www.ecmwf.int/publications/manuals/mars/practice/
www.ecmwf.int/services/archive/

The other route is the ECMWF data server. This gives you straight access: no software needed, no (not many) questions asked. It is actually just a Linux server sitting outside the ECMWF firewall running MARS, but only allowing access to the disk archive, not the tape archive.

To grab a dataset requires a request. Some fields used in the request are:

class: ECMWF classification (od, rd, e4, …)
stream: originating forecasting system (oper, wave, enfo, seas, …)
expver: version of the experiment (01 operational, 11, aaaa)
domain: area covered by the data (Global, Mediterranean, …)
system: seasonal forecast operational system (1, 2, 3)
type: type of field (an, fc, …)
levtype: type of level (pl, ml, sfc, pt, pv)
levelist: levels for the specified levtype (off if levtype=sfc)
param: meteorological parameter (t, temperature, 130, 30.128)
number: ensemble member (1, 2, …)
channel: brightness temperature frequency band
diagnostic, iteration: sensitivity forecast products
frequency, direction: 2-d wave spectra products
product, section, latitude, longitude: ocean products

An example of a MARS request is:


retrieve,
class = ei,
stream = oper,
expver = 1,
date = 20071201/to/20071231,
time = 00/06/12/18,
type = an,
levtype = sfc,
param = sd,
target = "era-int.200712.sd"

The request is used to build a MARS tree, which describes the data hierarchy to work through.

Truncation is done before interpolation to save resources, and depends on the final grid specified. The mapping between the final grid increment (Δ, in degrees) and the truncation is:

2.5 ≤ Δ: T63
1.5 ≤ Δ < 2.5: T106
0.6 ≤ Δ < 1.5: T213
0.4 ≤ Δ < 0.6: T319
0.3 ≤ Δ < 0.4: T511
0.15 ≤ Δ < 0.3: T799
0.09 ≤ Δ < 0.15: T1279
0.0 ≤ Δ < 0.09: T2047

To subset an area use the boundaries North/West/South/East.

Archived data is available to all registered users. Real-time data only to the big spenders!

WRF in Google Earth

Traditionally weather maps aimed to present an enormous array of data as a simple output: a sun, a sun behind a cloud, or some rain.  Or possibly a cloud getting blown along by a face.  That’s great, as what most people want to know is: will it rain or not?

But this hides the beauty and complexity behind the weather: we are immersed in a big mass of swirling fluid, rising, falling, heating, cooling. Enormous amounts of energy are transported around the world, air masses collide, jet streams arc and split. In the midst of this, occasionally everything stills, and there is hardly a breeze. It’s one big never-ending dance, and it looks beautiful.

That’s why I’m trying to put forecast data into Google Earth, because you get a real feel for the fact the UK’s weather doesn’t end at the edge of the map.

Using NCL and some code from rpavlick, this is my first attempt to put a forecast into Google Earth. At the moment it is just a picture, but hopefully I’ll get a Google Earth gadget working soon.

Inaugural Boreas Blog

Another techie blogs about WRF, meteorology, weather forecasting and renewable energy. Soon to be filled with pretty pictures, steal-able code, and daring tales of my attempts to do scientific computing from a Windows laptop locked behind a corporate firewall. You didn’t read about it here …