Chapter 4 Waveforms & station files

4.1 Create waveform archive

To create a waveform archive, follow these steps:

  1. Activate the environment using:

conda activate obsrocko

  1. Set up the directories with the files used for Squirrel.

Squirrel is one of the applications within the Pyrocko package, which was installed in Chapter 2. Copy the folder 04_Waveforms_station_files, which contains the ISF file with the seismological catalogue of the North Sea (SHARP_North_Sea_Bulletin_v1.4.1.isf), a config file in YAML format (config.yaml), an ISF reader Python script (isf_reader.py) and an automatic waveform download Python script (NS_downloadwaveforms_all.py). The mass downloader script (NS_downloadwaveforms_all.py) and the config file for Squirrel were created and adapted to the North Sea catalogue by Daniela Kühn, with input from Evgeniia Martuganova and Tom Kettlety. The files are summarised in the table below:

| File | Description |
| --- | --- |
| SHARP_North_Sea_Bulletin_v1.4.1.isf | The most complete seismicity catalogue for the North Sea up to the current date. A detailed description can be found in the Deliverable report D2.1. In addition, see Kettlety et al. (2025) to access the original data publication. |
| config.yaml | The config file steers which sites to visit and which networks, stations and channels to request. |
| isf_reader.py | A reader function used to parse the data from the catalogue for the Squirrel search engine. |
| NS_downloadwaveforms_all.py | A script to pull the data for the selected time period from the networks / stations / channels specified in the config file. |

  1. Initiate a new database using the following command:

squirrel init

  1. Configure the config file to request the data, then update the database:

squirrel update -d config/ns_bulletin_dataset_all.yaml --tmin 2000 --tmax 2022 --responses --promises &

  1. Start the data download using the command:

nohup python NS_downloadwaveforms_all.py &

or

squirrel summon -d config/ns_bulletin_dataset.yaml --tmin "YYYY-MM-DD HH:MM" --tmax "YYYY-MM-DD HH:MM" &

The nohup ("no hang up") command is used on Unix and Unix-like systems to keep a command running after the user logs out; it also appends the command's output to the log file nohup.out. The trailing & runs the command in the background. The time boundaries tmin and tmax restrict the download of the waveforms and the instrument responses to the given time window.
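When several time windows have to be requested, it can help to assemble the summon call programmatically. The following is a minimal sketch, not part of the original scripts: the helper name summon_command and the example dates are hypothetical, and only the flags already shown above are used.

```python
import shlex
from datetime import datetime


def summon_command(dataset, tmin, tmax):
    """Build the argument list for a `squirrel summon` call (hypothetical helper)."""
    if tmin >= tmax:
        raise ValueError("tmin must be earlier than tmax")
    fmt = "%Y-%m-%d %H:%M"  # same layout as the "YYYY-MM-DD HH:MM" placeholders
    return [
        "squirrel", "summon",
        "-d", dataset,
        "--tmin", tmin.strftime(fmt),
        "--tmax", tmax.strftime(fmt),
    ]


# Example time window (an assumption, chosen only for illustration)
cmd = summon_command(
    "config/ns_bulletin_dataset.yaml",
    datetime(2020, 1, 1, 0, 0),
    datetime(2020, 1, 2, 0, 0),
)
# shlex.join quotes the arguments that contain spaces, ready to paste into a shell
print(shlex.join(cmd))
```

The printed line can be prefixed with nohup and suffixed with & exactly as in the commands above.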

  1. Revoke the promises made by Squirrel using the command:

squirrel remove promises -d config/ns_bulletin_dataset_all.yaml

If promises are still present when the data are written to the SeisComP database, Squirrel will try to fill in whole days, which results in an enormous amount of data. To avoid this, revoke the promises before converting the Squirrel database into the SeisComP database.

  1. Convert the Squirrel database into a SeisComP database:

squirrel jackseis -d config/ns_bulletin_dataset_all.yaml --tmin 1990 --tmax 2023 --out-sds-path /path_to_database

Figure 4.1: The folder structure of the waveforms archive.
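As a rough guide to where a given day file ends up, the sketch below builds a path following the standard SeisComP Data Structure (SDS) convention, year/network/station/channel.type/file. It assumes that the archive follows this convention; the helper name sds_path and the channel code HHZ are hypothetical and used only for illustration.

```python
from datetime import date
from pathlib import PurePosixPath


def sds_path(root, net, sta, loc, cha, day, data_type="D"):
    """Return the expected SDS path of one day file (assumes the standard SDS layout)."""
    year = f"{day.year:04d}"
    doy = f"{day.timetuple().tm_yday:03d}"  # day of year, zero-padded to 3 digits
    filename = f"{net}.{sta}.{loc}.{cha}.{data_type}.{year}.{doy}"
    return PurePosixPath(root) / year / net / sta / f"{cha}.{data_type}" / filename


# Station BE.BEBN with an empty location code; HHZ is an assumed channel code
p = sds_path("/path_to_database", "BE", "BEBN", "", "HHZ", date(2020, 2, 1))
print(p)  # /path_to_database/2020/BE/BEBN/HHZ.D/BE.BEBN..HHZ.D.2020.032
```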

4.2 Create stations XML file

To generate a StationXML file for all stations, use the command:

squirrel stationxml -d config/ns_bulletin_dataset_all.yaml --on-error warn > stations.xml

XML files can also be generated for individual stations using:

squirrel stationxml -d config/ns_bulletin_dataset.yaml --on-error warn --codes 'BE.BEBN..' > BE.BEBN.xml
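If files are needed for more than a handful of stations, the per-station command can be generated in a loop. This is only a sketch: the helper stationxml_command is hypothetical, it reuses the flags and the codes pattern of the command above unchanged, and the two stations are examples taken from this chapter.

```python
import subprocess  # only needed if the commented-out lines below are enabled


def stationxml_command(dataset, network, station):
    """Return (argument list, output file name) for one station (hypothetical helper)."""
    codes = f"{network}.{station}.."  # same pattern as 'BE.BEBN..' above
    cmd = ["squirrel", "stationxml", "-d", dataset,
           "--on-error", "warn", "--codes", codes]
    outfile = f"{network}.{station}.xml"
    return cmd, outfile


for net, sta in [("BE", "BEBN"), ("GB", "KESW")]:
    cmd, outfile = stationxml_command("config/ns_bulletin_dataset.yaml", net, sta)
    print(" ".join(cmd), ">", outfile)
    # To actually run it (requires Pyrocko's squirrel on the PATH):
    # with open(outfile, "w") as fh:
    #     subprocess.run(cmd, stdout=fh, check=True)
```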

4.3 Split XML file by networks

Because the instrument responses can be problematic, we recommend splitting stations.xml by network code, which makes it easier to debug the moment tensor inversion computations. Open Create_xml_files_for_each_network.ipynb and run the cells to create the separate XML files.

We begin with loading the required libraries:

import os
import obspy
from obspy.core.inventory import Inventory, Network, Station, Channel, Site

Next, we are going to set the working directory and the path to the stations.xml file:

# set the current working directory
work_dir = os.getcwd()
# set the path to the XML file
xml_file = "stations.xml"
xml_file_path = os.path.join(work_dir, xml_file)
# read the stations XML file
inv_full = obspy.core.inventory.read_inventory(xml_file_path)

Then, we create a separate inventory object for each network by selecting on its network code:

#read and separate the inventory by network codes:
inv_BE = inv_full.select(network='BE')
inv_DK = inv_full.select(network='DK')
inv_GB = inv_full.select(network='GB')
inv_GE = inv_full.select(network='GE')
inv_GR = inv_full.select(network='GR')
inv_KQ = inv_full.select(network='KQ')
inv_NO = inv_full.select(network='NO')
inv_NS = inv_full.select(network='NS')
inv_Z6 = inv_full.select(network='Z6')
inv_NL = inv_full.select(network='NL')

Finally, we write a separate XML file for each network:

#writing separate xml files for each network code 
inv_BE.write("stations.BE.xml", format="stationxml", validate=True) 
inv_GE.write("stations.GE.xml", format="stationxml", validate=True) 
# stations.GB.xml is written in Section 4.4, where inv_GB_sel is defined
inv_GR.write("stations.GR.xml", format="stationxml", validate=True)
inv_KQ.write("stations.KQ.xml", format="stationxml", validate=True)
inv_NO.write("stations.NO.xml", format="stationxml", validate=True)
inv_NL.write("stations.NL.xml", format="stationxml", validate=True)
inv_NS.write("stations.NS.xml", format="stationxml", validate=True)
inv_Z6.write("stations.Z6.xml", format="stationxml", validate=True)
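The same split can be written as a loop over whatever network codes are present in the inventory, so that the list of networks does not have to be maintained by hand. The sketch below is an alternative to the notebook cells, not the notebook's own code; the function name write_network_files is hypothetical, and it relies only on the networks attribute and the select and write methods of an ObsPy Inventory.

```python
import os


def write_network_files(inventory, out_dir=".", validate=True):
    """Write one StationXML file per network code found in `inventory`.

    `inventory` is expected to behave like an obspy Inventory
    (attribute `networks`, methods `select` and `write`).
    """
    written = []
    # A set removes repeated network codes; sorting gives a stable order
    for code in sorted({net.code for net in inventory.networks}):
        path = os.path.join(out_dir, f"stations.{code}.xml")
        inventory.select(network=code).write(
            path, format="stationxml", validate=validate
        )
        written.append(path)
    return written


# Usage (requires obspy and the stations.xml file created in Section 4.2):
# import obspy
# inv_full = obspy.read_inventory("stations.xml")
# write_network_files(inv_full)
```

Note that a file written this way for the GB network would still contain all GB stations; the station selection of Section 4.4 has to be applied afterwards.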

4.4 Select channels from the specific network

Some of the channel responses for the GB network contain None values, which cause the Grond inversion to run incorrectly. This is easily discovered by attempting to plot the instrument response for some of the channels from this network, for instance for station "KESW":

# some of the station responses in the GB network contain None values; this can be
# checked by plotting the response function, for instance for station KESW
inv_GB.plot_response(0.001, station="KESW")

To avoid having None values in our file, we need to select the "good" stations:

# to avoid erroneous values, we select only the stations without None values
GB_working_channel_list = ['BIGH', 'BRDL', 'CCA1', 'CLGH', 'DRUM', 'DYA', 'EAB',
    'EDMD', 'ELSH', 'GAL1', 'GAT2', 'GATW', 'GDLE', 'HMNX', 'HORS', 'HTL',
    'INVG', 'IOMK', 'JSA', 'LBWR', 'LEWI', 'MCD', 'MCH1', 'NOLA', 'NOLB', 'NOLC', 'NOLD',
    'NOLE', 'NOLF', 'OLDB', 'RUSH', 'SOFL', 'STAN', 'STRD', 'SWN1', 'WACR', 'WLF1']
GB_bad_channels_list = ['CWF', 'EDI', 'ELMS', 'ESK', 'FOEL', 'GAL1', 'HLM1', 'HPK', 'KESW',
    'KPL', 'LAWE', 'LINV', 'LMK', 'LRW', 'MONM', 'NEWG', 'PGB1', 'RSBS', 'STNC']
# we create a new inventory for the GB network, using only the "good" stations
inv_GB_sel = obspy.Inventory()
for sta in GB_working_channel_list:
    inv_GB_sel += inv_GB.select(station=sta)

After that, we can create an XML file for the GB network:

inv_GB_sel.write("stations.GB.xml", format="stationxml", validate=True) 
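Hand-maintained station lists are easy to get out of sync: a code entered twice is added to the inventory twice, and a code present in both lists is ambiguous. The sketch below shows one possible consistency check to run before building the selected inventory. The function name check_station_lists is hypothetical, and the short input lists are shortened examples for illustration, not the full lists.

```python
from collections import Counter


def check_station_lists(working, bad):
    """Return (duplicated codes, codes in both lists, working list without repeats)."""
    counts = Counter(working)
    duplicates = sorted(code for code, n in counts.items() if n > 1)
    overlap = sorted(set(working) & set(bad))
    # dict.fromkeys keeps the first occurrence of each code and preserves order
    unique_working = list(dict.fromkeys(working))
    return duplicates, overlap, unique_working


# Shortened example lists (illustrative only)
duplicates, overlap, unique_working = check_station_lists(
    ["BIGH", "DRUM", "DYA", "DRUM", "GAL1"],
    ["GAL1", "KESW"],
)
print("listed more than once:", duplicates)
print("in both lists:", overlap)
print("deduplicated working list:", unique_working)
```

Any code reported in both lists needs a manual decision, for example by plotting its response as shown above.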

References

Kettlety, Tom, Daniela Kühn, Annie Jerkins, Evgeniia Martuganova, Johannes Schweitzer, Cornelis Weemstra, Brian Baptie, Trine Dahl-Jensen, and Peter Voss. 2025. “North Sea Bulletin and Catalogue.” International Seismological Centre. https://doi.org/10.31905/5nsso6oc.