
Remote access using python

The easiest way to use the Madrigal python remote data access API is to let the web interface write the script for you. From the Access data pull-down menu, choose Create a command to download multiple exps, then follow the instructions to get the command that downloads whatever you want from Madrigal. Be sure to select python as the language for the command. You can download files as they are stored in Madrigal in column-delimited ascii, Hdf5, or netCDF4 format, or you can choose the parameters yourself (including derived parameters) and optionally apply filters to the data you get back.

This web interface generates python commands that use one of two Python scripts: globalDownload.py and globalIsprint.py. Use globalDownload.py if you want data as it is in Madrigal. Use globalIsprint.py to choose parameters and/or filters. Both scripts are documented below for those who prefer not to use the web interface to generate the needed arguments.

Finally, this page describes the script globalCitation.py. This script is used to create a permanent citation to a group of Madrigal files.

globalDownload.py globalIsprint.py globalCitation.py

globalDownload.py

 Usage:
 globalDownload.py --url=<Madrigal url> --outputDir=<output directory> \
   --user_fullname=<user fullname> --user_email=<user email> \
   --user_affiliation=<user affiliation> --format=<ascii,hdf5> [options]
 where:
 --url=<Madrigal url> - url to homepage of site to be searched
   (e.g., http://madrigal.haystack.mit.edu/)
   This is required.
 --outputDir=<output directory> - the output directory to store all files in.  Default is to store
   all files in the same directory, and a number is added to the filename if a file might be overwritten.  Set
   --tree flag to store all files in the same directory structure they appear in Madrigal.  This
   allows all files to keep their original names. 
 --user_fullname=<user fullname> - the full user name (probably in quotes unless your name is
   Sting or Madonna)
 --user_email=<user email>
 --user_affiliation=<user affiliation> - user affiliation.  Use quotes if it contains spaces.
   
 --format=<ascii or hdf5>
 and options are:
 --startDate=<MM/DD/YYYY> - start date to filter experiments before.  Defaults to allow all experiments.
 --endDate=<MM/DD/YYYY> - end date to filter experiments after.  Defaults to allow all experiments.
 --inst=<instrument list> - comma separated list of instrument codes or names.  See Madrigal documentation
   for this list.  Defaults to allow all instruments. If names are given, the
   argument must be enclosed in double quotes.  An asterisk will perform matching as
   in glob.  For example:
   
   --inst=10,30
   
   --inst="Jicamarca IS Radar,Arecibo*"
 --expName  - filter experiments by the experiment name.  Give all or part of the experiment name. Matching
   is case insensitive.  Default is no filtering by experiment name.
   
 --fileDesc - filter files using the input file description string and case-insensitive fnmatch
 --kindat=<kind of data list> - comma separated list of kind of data codes.  See Madrigal documentation
   for this list.  Defaults to allow all kinds of data.  If names are given, the
   argument must be enclosed in double quotes.  An asterisk will perform matching as
   in glob. For example:
   
   --kindat=3001,13201
   
   --kindat="INSCAL Basic Derived Parameters,*efwind*,2001"
 --seasonalStartDate=<MM/DD> - seasonal start date to filter experiments before.  Use this to select only part of the
   year to collect data.  Defaults to Jan 1.  Example: 
   
   --seasonalStartDate=07/01 would only allow experiments after July 1st from each year.
 
 --seasonalEndDate=<MM/DD> - seasonal end date to filter experiments after.  Use this to select only part of the
   year to collect data.  Defaults to Dec 31.  Example: 
   
   --seasonalEndDate=10/31 would only allow experiments before Oct 31 of each year.
 --tree - add if you want to store the downloaded files in the same hierarchy as in Madrigal: 
   <YYYY>/<instCode>/<experimentDir>.  Without --tree, stores all downloaded files in one directory.
 --includeNonDefault - if given, include realtime files when there are no default.  Default is to search only default files.
 --verbose - if given, print each file processed info to stdout.  Default is to run silently.
   
 Example:
   
   globalDownload.py --url=http://madrigal.haystack.mit.edu --outputDir=/tmp --user_fullname="Bill Rideout" \
     --user_email=brideout@haystack.mit.edu --user_affiliation=MIT --startDate=01/01/1998 \
     --endDate=01/30/1998 --inst=30
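
If you would rather launch globalDownload.py from another Python program than from a shell, one option is to assemble the arguments as a list, which sidesteps shell quoting of names that contain spaces. The following is only a sketch: the helper names build_global_download_args and run_global_download are hypothetical, the flags are the ones documented above, and it assumes globalDownload.py is on your PATH.

```python
import subprocess


def build_global_download_args(url, output_dir, fullname, email, affiliation,
                               file_format, start_date=None, end_date=None,
                               inst=None, tree=False):
    """Return an argument list for globalDownload.py (hypothetical helper).

    Dates are MM/DD/YYYY strings; inst is a comma separated string
    of instrument codes or names, as described in the usage text.
    """
    args = [
        "globalDownload.py",
        "--url=" + url,
        "--outputDir=" + output_dir,
        "--user_fullname=" + fullname,
        "--user_email=" + email,
        "--user_affiliation=" + affiliation,
        "--format=" + file_format,
    ]
    # optional arguments are added only when they were supplied
    if start_date:
        args.append("--startDate=" + start_date)
    if end_date:
        args.append("--endDate=" + end_date)
    if inst:
        args.append("--inst=" + inst)
    if tree:
        args.append("--tree")
    return args


def run_global_download(args):
    """Run the command and return its exit status.

    Assumes globalDownload.py is installed and on PATH.
    """
    return subprocess.call(args)
```

Because each argument is its own list element, a value such as "Bill Rideout" needs no extra quoting.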

globalIsprint.py

 Usage:
 globalIsprint.py --url=<Madrigal url> --parms=<Madrigal parms> --output=<output file> \
   --user_fullname=<user fullname> --user_email=<user email> \
   --user_affiliation=<user affiliation> [options]
 where:
 --url=<Madrigal url> - url to homepage of site to be searched
   (e.g., http://madrigal.haystack.mit.edu/)
   This is required.
 --parms=<Madrigal parms> - a comma delimited string listing the desired Madrigal parameters
   in mnemonic form.  (Example: gdalt,dte,te).  Data will be returned
   in the same order as given in this string. See
   http://madrigal.haystack.mit.edu/cgi-bin/madrigal/getMetadata and
   choose "Parameter code table" for all possible parameters
 --output=<output file or directory name> - the file or directory name to store the resulting data.
   If you give a file, all output will be stored in a single
   ascii file that you specify.
   Use a directory name if you want data stored as individual
   files, in either ascii, Hdf5, or netCDF4 formats.  To use
   this option, you must set a format in the optional format
   argument. File names will be based on file names in Madrigal.
   Hdf5 or netCDF4 formats only available from Madrigal 3.0 or 
   higher sites.
 --user_fullname=<user fullname> - the full user name (probably in quotes unless your name is
   Sting or Madonna)
 --user_email=<user email>
 --user_affiliation=<user affiliation> - user affiliation.  Use quotes if it contains spaces.
 and options are:
 --startDate=<MM/DD/YYYY> - start date to filter experiments before.  Defaults to allow all experiments.
 --endDate=<MM/DD/YYYY> - end date to filter experiments after.  Defaults to allow all experiments.
 --inst=<instrument list> - comma separated list of instrument codes or names.  See Madrigal documentation
   for this list.  Defaults to allow all instruments. If names are given, the
   argument must be enclosed in double quotes.  An asterisk will perform matching as
   in glob.  Examples: (--inst=10,30 or --inst="Jicamarca IS Radar,Arecibo*")
 --format=<Hdf5 or netCDF4 or ascii> - format must be specified if output is a directory so that data is stored
   in individual files, one for each Madrigal file. Hdf5 or netCDF4 formats only 
   available from Madrigal 3.0 or higher sites.
 --expName  - filter experiments by the experiment name.  Give all or part of the experiment name. Matching
   is case insensitive and fnmatch characters * and ? are allowed.  Default is no filtering by 
   experiment name.
   
 --fileDesc - filter files by their file description string. Give all or part of the file description string. Matching
   is case insensitive and fnmatch characters * and ? are allowed.  Default is no filtering by 
   file description.
 --kindat=<kind of data list> - comma separated list of kind of data codes.  See Madrigal documentation
   for this list.  Defaults to allow all kinds of data.  If names are given, the
   argument must be enclosed in double quotes.  An asterisk will perform matching as
   in glob. Examples: (--kindat=3001,13201 or 
   --kindat="INSCAL Basic Derived Parameters,*efwind*,2001")
 --filter=<[mnemonic] or [mnemonic1,[+-*/]mnemonic2]>,<lower limit1>,<upper limit1>[or<lower limit2>,<upper limit2>...]
   a filter using any measured or derived Madrigal parameter, or two Madrigal parameters either added,
   subtracted, multiplied or divided.  Each filter has one or more allowed ranges.  The filter accepts
   data that is in any allowed range.  If the Madrigal parameter value is missing, the filter will always
   reject that data.  Multiple filter arguments are allowed on the command line.  To skip either a lower
   limit or an upper limit, leave it blank.  Examples: (--filter=ti,500,1000  (Accept when 500 <= Ti <= 1000)
   or --filter=gdalt,-,sdwht,0,  (Accept when gdalt > shadowheight - that is, point in direct sunlight)
   or  --filter=gdalt,200,300or1000,1200 (Accept when 200 <= gdalt <= 300 OR 1000 <= gdalt <= 1200))
 --seasonalStartDate=<MM/DD> - seasonal start date to filter experiments before.  Use this to select only part of the
   year to collect data.  Defaults to Jan 1.  Example:
   (--seasonalStartDate=07/01) would only allow experiments after July 1st from each year.
 
 --seasonalEndDate=<MM/DD> - seasonal end date to filter experiments after.  Use this to select only part of the
   year to collect data.  Defaults to Dec 31.  Example: 
   (--seasonalEndDate=10/31) would only allow experiments before Oct 31 of each year.
 --showFiles - if given, show file names.  Default is to not show file names. Not used if format in (Hdf5, netCDF4).
 --showSummary - if given, summarize all arguments at the beginning.  Not used if format in (Hdf5, netCDF4).
   Default is to not show summary.
 --includeNonDefault - if given, include realtime files when there are no default.  Not used if format in (Hdf5, netCDF4).
   Default is to search only default files.
 --missing=<missing string> (defaults to "missing"). Not used if format in (Hdf5, netCDF4).
 --assumed=<assumed string> (defaults to "assumed"). Not used if format in (Hdf5, netCDF4).
 --knownbad=<knownbad string> (defaults to "knownbad"). Not used if format in (Hdf5, netCDF4).
 --verbose - if given, print each file processed info to stdout.  Default is to run silently.
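
To make the range semantics of --filter concrete, here is a small local model of the rules described above: a value is accepted if it falls in any allowed range, a blank limit means no limit on that side, and a missing value is always rejected. This is an illustrative sketch only, not Madrigal's implementation; the function names are hypothetical, limits are assumed to be inclusive, and only the range part of the argument (the text after the mnemonics) is handled.

```python
def parse_ranges(range_spec):
    """Parse '200,300or1000,1200' into [(200.0, 300.0), (1000.0, 1200.0)].

    A blank limit becomes None, meaning no limit on that side.
    """
    ranges = []
    for part in range_spec.split("or"):
        lower_text, upper_text = part.split(",")
        lower = float(lower_text) if lower_text.strip() else None
        upper = float(upper_text) if upper_text.strip() else None
        ranges.append((lower, upper))
    return ranges


def accepts(value, ranges):
    """Return True if value lies in any allowed range.

    A missing value (represented here as None) is always rejected.
    Limits are treated as inclusive, which is an assumption.
    """
    if value is None:
        return False
    for lower, upper in ranges:
        above = lower is None or value >= lower
        below = upper is None or value <= upper
        if above and below:
            return True
    return False
```

For example, accepts(250, parse_ranges("200,300or1000,1200")) is True, while a value of 500 falls between the two ranges and is rejected.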

globalCitation.py

The script globalCitation.py runs a global search through Madrigal data, and returns a permanent citation to the group of files.

 Usage:

        globalCitation.py  --user_fullname=<user fullname> --user_email=<user email> \
            --user_affiliation=<user affiliation> --startDate=<MM/DD/YYYY>  --endDate=<MM/DD/YYYY> \
            --inst=<instrument list> [options]

        where:

        --user_fullname=<user fullname> - the full user name (probably in quotes unless your name is
                                          Sting or Madonna)

        --user_email=<user email>

        --user_affiliation=<user affiliation> - user affiliation.  Use quotes if it contains spaces.

        --startDate=<MM/DD/YYYY> - start date to filter experiments before.  Defaults to allow all experiments.

        --endDate=<MM/DD/YYYY> - end date to filter experiments after.  Defaults to allow all experiments.

        --inst=<instrument list> - comma separated list of instrument codes or names.  See Madrigal documentation
                                   for this list.  Defaults to allow all instruments. If names are given, the
                                   argument must be enclosed in double quotes.  An asterisk will perform matching as
                                   in glob.  Examples: (--inst=10,30 or --inst="Jicamarca IS Radar,Arecibo*")
                                   
        and options are:
        

        --expName  - filter experiments by the experiment name.  Give all or part of the experiment name. Matching
                     is case insensitive and fnmatch characters * and ? are allowed.  Default is no filtering by 
                     experiment name.
                     
        --excludeExpName - exclude experiments by the experiment name.  Give all or part of the experiment name. Matching
                     is case insensitive and fnmatch characters * and ? are allowed.  Default is no excluding experiments by 
                     experiment name.
                     
        --fileDesc - filter files by their file description string. Give all or part of the file description string. Matching
                     is case insensitive and fnmatch characters * and ? are allowed.  Default is no filtering by 
                     file description.

        --kindat=<kind of data list> - comma separated list of kind of data codes.  See Madrigal documentation
                                       for this list.  Defaults to allow all kinds of data.  If names are given, the
                                       argument must be enclosed in double quotes.  An asterisk will perform matching as
                                       in glob. Examples: (--kindat=3001,13201 or 
                                       --kindat="INSCAL Basic Derived Parameters,*efwind*,2001")


        --seasonalStartDate=<MM/DD> - seasonal start date to filter experiments before.  Use this to select only part of the
                                year to collect data.  Defaults to Jan 1.  Example:
                                (--seasonalStartDate=07/01) would only allow experiments after July 1st from each year.

        
        --seasonalEndDate=<MM/DD> - seasonal end date to filter experiments after.  Use this to select only part of the
                                    year to collect data.  Defaults to Dec 31.  Example: 
                                    (--seasonalEndDate=10/31) would only allow experiments before Oct 31 of each year.

        --includeNonDefault - if given, include realtime files when there are no default.  Default is to search only default files.  
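
The seasonal options used by all three scripts select the same part of every year. The sketch below models that idea for a single date; it is an illustration under stated assumptions rather than the scripts' actual logic. The function names are hypothetical, both ends of the window are assumed to be inclusive, and a window that wraps around the end of the year (start later than end) is not handled.

```python
import datetime


def parse_month_day(text):
    """Convert an 'MM/DD' string into a (month, day) tuple."""
    month_text, day_text = text.split("/")
    return int(month_text), int(day_text)


def in_season(date, seasonal_start="01/01", seasonal_end="12/31"):
    """Return True if the month and day of date fall inside the window.

    The defaults mirror the documented defaults of Jan 1 and Dec 31.
    Inclusive bounds are an assumption made for this sketch.
    """
    start = parse_month_day(seasonal_start)
    end = parse_month_day(seasonal_end)
    # tuples compare element by element, so (month, day) orders correctly
    return start <= (date.month, date.day) <= end
```

With seasonal_start="07/01" and seasonal_end="10/31", a date of August 15 in any year passes, while June 30 and November 1 do not.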


The rest of this tutorial is for those who want to go beyond the automatically generated commands and write more advanced python applications that access Madrigal data.

This page describes the remote Python API and gives some examples of using it. These examples have been tested on both Windows and Linux, and require only internet access and python 2.3 to run. The API is available for download here.

The remote Python API is organized in the same way as the Madrigal data model, from Instrument at the highest level, down to the level of data values. Readers who are not familiar with the Madrigal data model should read the material in that section before proceeding with this tutorial.

The basic object in the remote Python API is MadrigalData, found in the madrigalWeb module. Initializing MadrigalData requires only one argument: the url of the home page of any Madrigal 2.3 (or above) site. Calling the methods of this object will return all possible information from one Madrigal site. The other objects in madrigalWeb are simply there to hold returned information - for example, the MadrigalExperiment object holds information about one experiment.

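MadrigalData has the following methods, among others (these are the ones used in the examples on this page): getAllInstruments, getExperiments, getExperimentFiles, getExperimentFileParameters, simplePrint, isprint, madCalculator, and downloadFile.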

See the Madrigal Python API reference guide for complete documentation.
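
As a rough illustration of how the API follows the data model from instrument down to files, the sketch below walks from an instrument code to its experiments and then to each experiment's files. The getExperiments and getExperimentFiles calls are used with the same argument order as in the regression test on this page; the helper list_files and the demo function are hypothetical names, and demo is never called automatically because it needs madrigalWeb installed and network access.

```python
def list_files(madrigal_data, inst_code, start, end):
    """Return (experiment, file) pairs for one instrument between two times.

    madrigal_data: a madrigalWeb.madrigalWeb.MadrigalData instance
    start, end: (year, month, day, hour, minute, second) tuples
    """
    # getExperiments takes the code followed by twelve date/time integers
    args = (inst_code,) + tuple(start) + tuple(end)
    pairs = []
    for exp in madrigal_data.getExperiments(*args):
        for exp_file in madrigal_data.getExperimentFiles(exp.id):
            pairs.append((exp, exp_file))
    return pairs


def demo():
    """Example use against a live site (requires madrigalWeb and network)."""
    import madrigalWeb.madrigalWeb
    data = madrigalWeb.madrigalWeb.MadrigalData(
        'http://madrigal.haystack.mit.edu')
    start = (1998, 1, 19, 0, 0, 0)
    end = (1998, 1, 22, 0, 0, 0)
    for exp, exp_file in list_files(data, 30, start, end):
        print(exp_file.name)
```

Keeping the walk in a function that receives the MadrigalData object makes it easy to point the same code at a different Madrigal site.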

Two applications written with the remote Python API follow. The first is a simple regression test that is run to test web services when Madrigal is installed. The second is a script that downloads realtime data from any desired Madrigal site.


Simple regression test

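This simple script calls the following MadrigalData methods: getAllInstruments, getExperiments, getExperimentFiles, downloadFile, simplePrint, getExperimentFileParameters, isprint, and madCalculator.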

This example also shows how to get data from a different Madrigal site than the one you start with.
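
The cross-site step can be summarized in a small helper. A search run with local=0 may return experiments hosted elsewhere, which the script identifies by an experiment id of -1; for those, a new MadrigalData object is built from the experiment's madrigalUrl. The sketch below assumes exactly that convention. The name data_source_for and the make_data argument are hypothetical; in real use make_data would be madrigalWeb.madrigalWeb.MadrigalData.

```python
def data_source_for(exp, local_data, make_data):
    """Return the MadrigalData object that can serve files for exp.

    exp: an experiment returned by getExperiments(..., local=0)
    local_data: the MadrigalData object the search was run on
    make_data: callable that builds a MadrigalData object from a url
    """
    # per the regression test, a non-local experiment has id == -1
    if exp.id == -1:
        return make_data(exp.madrigalUrl)
    return local_data
```

As in the regression test, the search should then be repeated on the returned object with local=1 to obtain the experiment's real id at its home site.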

To use this regression test, cd to the examples directory in the installation directory, and type:

python exampleMadrigalWebServices.py


"""exampleMadrigalWebServices.py runs an example of the Madrigal Web Services interface
   for a given Madrigal server.

   usage:

   python exampleMadrigalWebServices.py 

"""

# $Id: exampleMadrigalWebServices.py 3984 2012-03-20 14:20:17Z brideout $

import madrigalWeb.madrigalWeb

# constants
user_fullname = 'Bill Rideout - automated test'
user_email = 'brideout@haystack.mit.edu'
user_affiliation = 'MIT Haystack'

madrigalUrl = 'http://madrigal.haystack.mit.edu'


testData = madrigalWeb.madrigalWeb.MadrigalData(madrigalUrl)



print('Example of call to getAllInstruments')
instList = testData.getAllInstruments()
# print out Millstone
for inst in instList:
    if inst.code == 30:
        print(str(inst) + '\n')
        

print('Example of call to getExperiments')
expList = testData.getExperiments(30, 1998,1,19,0,0,0,1998,1,22,0,0,0)
for exp in expList:
    # should be only one
    print(str(exp) + '\n')


print('Example of call to getExperimentFiles')
fileList = testData.getExperimentFiles(expList[0].id)
for thisFile in fileList:
    if thisFile.category == 1:
        print(str(thisFile.name) + '\n')
        thisFilename = thisFile.name
        break
    
print('Example of downloadFile - simple and hdf5 formats:')
result = testData.downloadFile(thisFilename, "/tmp/test.txt", 
                               user_fullname, user_email, user_affiliation, "simple")
# the second download requests hdf5 so the format matches the file name
result = testData.downloadFile(thisFilename, "/tmp/test.hdf5", 
                               user_fullname, user_email, user_affiliation, "hdf5")

print('Example of simplePrint - only first 1000 characters printed')
result = testData.simplePrint(thisFilename, user_fullname, user_email, user_affiliation)
print(result[:1000])
print('')

print('Example of call to getExperimentFileParameters - only first 10 printed')
fileParms = testData.getExperimentFileParameters(thisFilename)
# slicing avoids an IndexError if the file has fewer than 10 parameters
for parm in fileParms[:10]:
    print(parm)
print('')


print('Example of call to isprint (prints data)')
print(testData.isprint(thisFilename,
                       'gdalt,ti',
                       'filter=gdalt,500,600 filter=ti,1900,2000',
                       user_fullname, user_email, user_affiliation))


print('Example of call to madCalculator (gets derived data at any time)')
result = testData.madCalculator(1999,2,15,12,30,0,45,55,5,-170,-150,10,200,200,0,'sdwht,kp')
for line in result:
    for value in line:
        print('%8.2e ' % (value))
    print('\n')

print('Example of searching all Madrigal sites for an experiment - here we search for PFISR data')
expList = testData.getExperiments(61,2008,4,1,0,0,0,2008,4,30,0,0,0,local=0)
print(expList[0])

print('Since this experiment is not local (note the experiment id = -1), we need to create a new MadrigalData object to get it')
testData2 = madrigalWeb.madrigalWeb.MadrigalData(expList[0].madrigalUrl)

print('Now repeat the same calls as above to get PFISR data from the SRI site')
expList2 = testData2.getExperiments(61,2008,4,1,0,0,0,2008,4,30,0,0,0,local=1)
print('This is a PFISR experiment')
print(expList2[0])

Script to download realtime data from Madrigal

The following is a demonstration script that shows how real-time data can be imported from any Madrigal site that is updated on a real-time basis.

In this example, data is imported from http://www.haystack.mit.edu/madrigal from "Millstone Hill IS Radar". The following Madrigal parameters are retrieved:

year,month,day,hour,min,sec,gdlat,glon,gdalt,azm,elm,vo,dvo

for all records from the past 15 minutes.

Although the particular Madrigal site (http://www.haystack.mit.edu/madrigal), the instrument ("Millstone Hill IS Radar"), the parameters, and the times are hard-coded in this example, they could easily be modified to be arguments.

To avoid missing data, we choose one parameter to be the filter parameter: vo. By filtering on this parameter, any "missing" values are filtered out.
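
The filter string that the script passes to isprint combines this parameter filter with the start and end of the time window. The helper below is a sketch of that construction factored into a function so it can be checked on its own; the name build_isprint_filter is hypothetical, while the filter, date1, time1, date2, and time2 keywords and their formats are the ones used in the script that follows.

```python
import time


def build_isprint_filter(filter_parm, start_time, end_time):
    """Build the isprint filter string for a parameter and a time window.

    filter_parm: mnemonic to filter on (for example 'vo'); the very wide
        range keeps every value that is not missing
    start_time, end_time: time.struct_time values in UT, such as those
        returned by time.gmtime
    """
    parts = [
        'filter=%s,-1E30,1E30' % filter_parm,
        'date1=' + time.strftime('%m/%d/%Y', start_time),
        'time1=' + time.strftime('%H:%M:%S', start_time),
        'date2=' + time.strftime('%m/%d/%Y', end_time),
        'time2=' + time.strftime('%H:%M:%S', end_time),
    ]
    return ' '.join(parts)
```

Calling it with time.gmtime(time.time() - 15*60.0) and time.gmtime(time.time()) would cover the past 15 minutes.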

Running this script requires that the Madrigal python API be installed; it can be downloaded from http://www.haystack.edu/madrigal/madDownload.html.




# time is the only standard library module this script uses
import time


import madrigalWeb.madrigalWeb


#constants
madrigalUrl = 'http://www.haystack.mit.edu/madrigal'
instrument = 'Millstone Hill IS Radar'

user_fullname = 'Put your name here!!!'
user_email = 'your@email.here'
user_affiliation = 'Put your affiliation here!!!'


# each line of data contains the following parameters
params = 'year,month,day,hour,min,sec,gdlat,glon,gdalt,azm,elm,vo,dvo'
filterParm = 'vo'
timeDelay = 15

# create the main object to get all needed info from Madrigal
madrigalObj = madrigalWeb.madrigalWeb.MadrigalData(madrigalUrl)


# these next few lines convert instrument name to code
code = None
instList = madrigalObj.getAllInstruments()
for inst in instList:
    if inst.name.lower() == instrument.lower():
        code = inst.code
        break

if code is None:
    raise ValueError('Unknown instrument %s' % (instrument))


# next, get a list of real time experiments in the last timeDelay minutes
startTime = time.gmtime(time.time() - timeDelay*60.0)
endTime = time.gmtime(time.time())


try:
    expList = madrigalObj.getExperiments(code, startTime[0],
                                     startTime[1],
                                     startTime[2],
                                     startTime[3],
                                     startTime[4],
                                     startTime[5],
                                     endTime[0],
                                     endTime[1],
                                     endTime[2],
                                     endTime[3],
                                     endTime[4],
                                     endTime[5])

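except Exception:
    raise ValueError('No realtime experiments found')

# guard against an empty list, which would make expList[0] fail below
if len(expList) == 0:
    raise ValueError('No realtime experiments found')


# assume there's only one realtime experiment, and get the file names
fileList = madrigalObj.getExperimentFiles(expList[0].id)

if len(fileList) == 0:
    raise ValueError('No realtime experiment files found')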


# get data from each of the files
startDateStr = time.strftime('%m/%d/%Y', startTime)
startDateStr = ' date1=' + startDateStr
startTimeStr = time.strftime('%H:%M:%S', startTime)
startTimeStr = ' time1=' + startTimeStr
endDateStr = time.strftime('%m/%d/%Y', endTime)
endDateStr = ' date2=' + endDateStr
endTimeStr = time.strftime('%H:%M:%S', endTime)
endTimeStr = ' time2=' + endTimeStr

filterString = 'filter=%s,-1E30,1E30' % (filterParm) + startDateStr + startTimeStr + endDateStr + endTimeStr
for dataFile in fileList:
    result = madrigalObj.isprint(dataFile.name, params, filterString,
                                 user_fullname, user_email, user_affiliation)
    # make sure it succeeded
    if 'No records were selected' in result:
        continue
    if '****' in result:
        continue
    print(result)