Reading a TimeSeries from GWF files

Gravitational-wave frame (GWF) files are archived on disk by the LIGO Data Grid, giving collaboration members direct access at a number of shared computing centres. These files are indexed and can be located using the datafind service.

Finding data frames

Warning

Finding data frames requires that the glue Python package be installed on your system.

The glue.datafind module provides an interface to the indexing service used to record the location on disk of all GWF files. This package is complemented by the command-line tool gw_data_find.

Any user wishing to read detector data from files on disk can log in to a shared computing centre and run the following to locate the files:

>>> from glue import datafind
>>> connection = datafind.GWDataFindHTTPConnection()
>>> cache = connection.find_frame_urls('L', 'R', 1067042880, 1067042900, urltype='file')

i.e. open a connection to the server, and query for a set of frame URLs. This query requires the following arguments:

L Single-character observatory identifier, L for the LIGO Livingston Observatory.
R Single-character frame data type, R refers to the ‘raw’ set of channels whose data are archived.
1067042880 GPS start time, any GPS integer is acceptable.
1067042900 GPS end time.
urltype='file' File scheme restriction, both gsiftp and file scheme paths are returned by default.

and returns a Cache object, a list of CacheEntry representations of individual frame files:

>>> for ce in cache:
...     print(ce)
L R 1067042880 32 file://localhost/archive/frames/A6/L0/LLO/L-R-10670/L-R-1067042880-32.gwf
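The printed entry packs several fields into one line. As a minimal sketch, assuming the fields are observatory, frame type, GPS start, duration in seconds, and URL (as the example output suggests), the line can be unpacked with the standard library alone. The names FrameEntry and parse_cache_line are hypothetical helpers for illustration and are not part of glue or GWpy:

```python
from collections import namedtuple
from urllib.parse import urlparse

# Hypothetical container for illustration; not a glue or GWpy class.
FrameEntry = namedtuple(
    'FrameEntry', ['observatory', 'frametype', 'gps_start', 'duration', 'path'])


def parse_cache_line(line):
    """Split one whitespace-separated cache line into its five fields."""
    observatory, frametype, start, duration, url = line.split()
    # Keep only the path component of the file:// URL.
    return FrameEntry(observatory, frametype, int(start), int(duration),
                      urlparse(url).path)


entry = parse_cache_line(
    'L R 1067042880 32 '
    'file://localhost/archive/frames/A6/L0/LLO/L-R-10670/L-R-1067042880-32.gwf')
# GPS span covered by this one file: start and start + duration.
print(entry.gps_start, entry.gps_start + entry.duration)
```

Under this reading, the single file returned covers 32 seconds, which is longer than the 20-second interval requested in the query.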

Frame reading

Warning

Reading data from GWF files requires that either the frameCPP or lalframe packages (including SWIG bindings for Python) are installed on your system.

The above Cache can be passed to TimeSeries.read() to extract data for a specific Channel:

>>> from gwpy.timeseries import TimeSeries
>>> data = TimeSeries.read(cache, 'L1:PSL-ODC_CHANNEL_OUT_DQ')

The TimeSeries.read() classmethod will accept any of the following as its first argument:

while the second argument should always be a Channel, or simply a channel str name. Optional start and end keyword arguments can be given to restrict the returned data span.
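Because a frame file may span more time than was requested from datafind, the start and end keywords are useful for trimming the result. The following is a sketch only: read_span is a hypothetical wrapper, and the actual read needs GWpy, a frame library, and access to the frame files, so the import is deferred until the call is made:

```python
def read_span(source, channel, start, end):
    """Read `channel` from `source`, restricted to the GPS span start..end."""
    if end <= start:
        raise ValueError('end must be later than start')
    # Imported lazily so the span check above works without GWpy installed.
    from gwpy.timeseries import TimeSeries
    return TimeSeries.read(source, channel, start=start, end=end)


# Usage, on a machine with GWpy, a frame library, and a Cache named `cache`:
# data = read_span(cache, 'L1:PSL-ODC_CHANNEL_OUT_DQ', 1067042880, 1067042900)
```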

As part of the unified input/output system, the TimeSeries.read() method is documented as follows:

Reading multiple channels

Normally, each frame file holds the data for more than one channel over a single GPS epoch. To read several channels of interest, provided they all exist in a single GWF file, use the TimeSeriesDict object and its read() classmethod:

>>> from gwpy.timeseries import TimeSeriesDict
>>> datadict = TimeSeriesDict.read(cache, ['L1:PSL-ISS_PDA_OUT_DQ', 'L1:PSL-ISS_PDB_OUT_DQ'])
>>> print(datadict.keys())
['L1:PSL-ISS_PDA_OUT_DQ', 'L1:PSL-ISS_PDB_OUT_DQ']

The output is an OrderedDict of (name, TimeSeries) pairs as read from the cache.
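Since the result behaves like an ordered mapping, it can be walked with the usual dictionary idioms. The sketch below uses short plain lists as stand-ins for the TimeSeries values so that it runs without any frame files; the sample values are placeholders, not real detector data, and summarise is a hypothetical helper:

```python
from collections import OrderedDict


def summarise(datadict):
    """Return (channel name, number of samples) pairs in insertion order."""
    return [(name, len(series)) for name, series in datadict.items()]


# Stand-in for the TimeSeriesDict returned by TimeSeriesDict.read().
fake = OrderedDict([
    ('L1:PSL-ISS_PDA_OUT_DQ', [0.0, 0.1, 0.2]),
    ('L1:PSL-ISS_PDB_OUT_DQ', [1.0, 1.1]),
])
for name, nsamples in summarise(fake):
    print(name, nsamples)
```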

A note on frame libraries

GWpy takes its ability to read GWF-format files from one of the two available GWF I/O libraries:

lalframe The LALSuite frame I/O library, built on top of the core FrameL library.
frameCPP A stand-alone C++ library with python wrappings built using SWIG.

Both provide much the same functionality, with one caveat:

Note

Only with format='framecpp' can GWpy extract multiple TimeSeries from a single file without opening it multiple times; with format='lalframe', each TimeSeries is read by re-opening the given frame file. As a result, the 'framecpp' format specifier is the default whenever it is available.
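To see which format specifier would apply on a given machine, one option is to check which library can be imported. This is a sketch, not GWpy's own selection logic: pick_gwf_format is a hypothetical helper, and the Python module names frameCPP and lalframe are assumptions that may differ between releases of those packages:

```python
import importlib.util

# (format specifier, assumed Python module name), in order of preference.
CANDIDATES = (('framecpp', 'frameCPP'), ('lalframe', 'lalframe'))


def pick_gwf_format(candidates=CANDIDATES, finder=importlib.util.find_spec):
    """Return the first format whose module can be located, else None."""
    for fmt, module in candidates:
        if finder(module) is not None:
            return fmt
    return None


print(pick_gwf_format())
```

The returned string could then be passed explicitly, for example as format='lalframe', when calling TimeSeries.read().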