Daily data flow

This information is specifically for the 2010 campaign of Sara Martin et al. We assume that during this campaign data will be captured for whole days, as opposed to only those periods with very good seeing. This mode of operation places a large burden on storage, network and processing. We therefore want to lay down strict rules for how data should be handled during the day, to prevent an ever-growing backlog of data at the Dutch Open Telescope.

  1. Despeckling and frame selection
    1. Frame selection
    2. Despeckling
  2. Data storage
  3. Data transfer
    1. Cluster
    2. Internet
  4. Daily procedure

Despeckling and frame selection

The images taken by a solar telescope are highly variable. There are two causes for that: changes on the Sun itself, and changes in the Earth’s atmosphere. The observable changes on the Sun (with a 45 cm M1) are relatively slow: we assume that the image does not change significantly within a period of up to 15 seconds. The changes in the Earth’s atmosphere are very rapid: we assume that the image does not change significantly within a period of up to 10 milliseconds.

The DOT records bursts of 50 to 100 images at a time, with an interval of around 100 milliseconds between images in a burst. The idea is then to use statistics on the images in a burst to remove the effects of the Earth’s atmosphere on the original solar image. There are various techniques; at the DOT we primarily use two of them.

Frame selection

The simplest way to recover a “good” solar image from a burst of images perturbed by the Earth’s atmosphere is to pick the image with the best contrast. The advantage of this method is that contrast detection is a very simple algorithm that can be run in real time, and discarding all but one of the images is an instant operation. The drawback is that it does not use the information in all images of a burst. For example, if the “best” image still has unsharp patches, these are not replaced by sharp patches from the other images. The frame selection algorithm also cannot detect or correct geometric distortions still present in the best image. The signal-to-noise ratio of the resulting image is also the same as that of a single exposure.

Frame selection should be used in bad and mediocre seeing conditions, because other algorithms do not give a significant improvement under those conditions.
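
The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not the DOT implementation: the text does not say which contrast measure is used, so RMS contrast (standard deviation divided by mean intensity) is an assumption here, as are the function names and the representation of an image as a list of pixel rows.

```python
from statistics import fmean, pstdev


def rms_contrast(image):
    """Return the RMS contrast (standard deviation / mean) of one image.

    `image` is a list of rows, each row a list of pixel intensities.
    RMS contrast is an assumed metric; the source only says "best contrast".
    """
    pixels = [p for row in image for p in row]
    mean = fmean(pixels)
    if mean == 0:
        return 0.0  # completely dark frame: treat as having no contrast
    return pstdev(pixels) / mean


def select_best_frame(burst):
    """Return the index of the image with the highest contrast in a burst."""
    if not burst:
        raise ValueError("burst is empty")
    return max(range(len(burst)), key=lambda i: rms_contrast(burst[i]))


if __name__ == "__main__":
    # Three tiny 2x2 "images": flat, strongly structured, weakly structured.
    burst = [
        [[10, 10], [10, 10]],
        [[5, 15], [5, 15]],
        [[9, 11], [9, 11]],
    ]
    best = select_best_frame(burst)
    print("keeping frame", best, "and discarding the rest")
```

Because each image is scored independently, the score can be computed as frames arrive, which is what makes this approach suitable for real-time use.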

Despeckling

A much better way, one that can recover “excellent” solar images, is to use a despeckling algorithm. This algorithm analyzes all the images in a burst and uses sophisticated statistics to estimate the point spread function (due to the Earth’s atmosphere) of each image. Each image is then corrected for the estimated distortion, and all corrected images are combined to form a despeckled image. The advantage of this method is that the best parts of all images are combined, resulting in an overall much sharper image than with frame selection. Geometric distortions are also corrected. The signal-to-noise ratio is much improved, since the effective exposure time of the resulting image is that of all images combined. The drawback of this algorithm is that it takes a huge amount of processing time. Even with a cluster of computers, despeckling an hour’s worth of images takes approximately one week of processing.

Despeckling should be used in very good to excellent seeing conditions, as it will result in much better images than frame selection can provide.

Data storage

The DOT has two types of cameras:

Usually, all cameras are synchronised to each other, and we operate at approximately 10 Hz. When taking images continuously, this results in the following raw data rates per camera (1 MiB = 2^20 bytes, 1 GiB = 2^30 bytes, and so on):

Data rates per hour:

When observing continuously the whole day, assuming 10 hours of sunlight per day:

Assuming all science cameras are used and ES4020 cameras are run with 2x2 binning:

After upgrading the camera computers with the new hard disks, the total amount of storage at the DOT, including the RAID arrays of our cluster, is approximately 48 TB. It should be clear that if data is not quickly reduced or moved to other locations, storage will be depleted in a mere 8 days.
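
As a rough illustration of the arithmetic behind such figures, the following Python sketch computes the raw data rate of a single camera and the number of days until storage is full. The function names and the example camera (2048 x 2048 pixels, 2 bytes per pixel, 10 Hz) are assumptions chosen for illustration, not the actual DOT camera specifications. The 6 TB per day used in the example is simply what 48 TB lasting 8 days implies.

```python
MIB = 2 ** 20  # bytes per MiB
GIB = 2 ** 30  # bytes per GiB


def raw_rate_mib_per_s(width, height, bytes_per_pixel, frame_rate_hz):
    """Raw data rate of one camera in MiB/s when recording continuously."""
    return width * height * bytes_per_pixel * frame_rate_hz / MIB


def volume_gib_per_hour(rate_mib_per_s):
    """Data volume in GiB produced in one hour at the given rate."""
    return rate_mib_per_s * 3600 * MIB / GIB


def days_until_full(capacity_tb, daily_volume_tb):
    """Number of observing days before the given storage capacity is used up."""
    if daily_volume_tb <= 0:
        raise ValueError("daily volume must be positive")
    return capacity_tb / daily_volume_tb


if __name__ == "__main__":
    # Hypothetical camera: 2048 x 2048 pixels, 2 bytes per pixel, 10 Hz.
    rate = raw_rate_mib_per_s(2048, 2048, 2, 10)
    print(f"{rate:.1f} MiB/s, {volume_gib_per_hour(rate):.2f} GiB per hour")
    # 48 TB of storage filled at an implied 6 TB per day.
    print(f"storage full after {days_until_full(48.0, 6.0):.0f} days")
```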

Data transfer

During the day, the network bandwidth of the DOT will be used for real-time control and visualisation, so data cannot be transferred then. However, during the evening and night, data can be transferred from the camera computers to the cluster or to off-site FTP servers.

Cluster

There is a gigabit network connection between the camera computers and the cluster, with a theoretical maximum throughput of 125 MB/s (about 119 MiB/s), but we will assume 100 MiB/s in practice. This means it will take more than 13 hours to transfer a day’s worth of data to the cluster.

When observations stop at the end of the day, one must immediately decide whether the day’s data is a candidate for despeckling or not. If so, data transfer to the cluster must be started immediately so that it will be finished before the start of observations the next day.

Internet

The Canary Islands have an Internet connection with the mainland. The bandwidth is variable, but a reasonable figure is 5 MiB/s sustained. This means we cannot transfer unreduced data to an off-site server, since it would take 22 days to transfer one day’s worth of data. However, after frame selection or despeckling, data will be reduced by a factor of approximately 50 to 100, which means it will only take at most 7 hours to transfer one day’s worth of reduced data.

Transfer of reduced data must be started before the observers retire for the day.
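
The transfer-time estimates in this section follow from dividing a data volume by a sustained rate. A minimal Python sketch of that calculation is given below; the function names, the 1 TiB example volume and the 14-hour overnight window are illustrative assumptions, not DOT figures. Only the 100 MiB/s and 5 MiB/s rates come from the text above.

```python
MIB = 2 ** 20  # bytes per MiB
TIB = 2 ** 40  # bytes per TiB


def transfer_hours(volume_bytes, rate_mib_per_s):
    """Hours needed to move `volume_bytes` at a sustained rate in MiB/s."""
    if rate_mib_per_s <= 0:
        raise ValueError("rate must be positive")
    return volume_bytes / (rate_mib_per_s * MIB) / 3600


def finishes_overnight(volume_bytes, rate_mib_per_s, hours_available):
    """True if the transfer completes within the available time window."""
    return transfer_hours(volume_bytes, rate_mib_per_s) <= hours_available


if __name__ == "__main__":
    volume = 1 * TIB   # illustrative volume, not the actual daily volume
    night_hours = 14   # assumed gap between two observing days
    for name, rate in (("cluster", 100), ("internet", 5)):
        hours = transfer_hours(volume, rate)
        ok = finishes_overnight(volume, rate, night_hours)
        print(f"{name}: {hours:.1f} h, finishes overnight: {ok}")
```

Running the same check with the real daily volume and the real number of hours until the next observation shows whether a transfer has to start right after observing ends.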

Daily procedure

Starting when the observers arrive and have had their first cup of coffee or tea:

Observers drink their final cup of tea and retire for the day.