Data Transfer, Verification, and Backup

  • The lifecycle of your data at the IRTF:

    When you take data using SpeX, CShell, or any other instrument, the data are written directly to /scrs1, the publicly available data directory. Data taken with Apogee are currently written either to /scrs1 or to /home/smokey/data.

    /scrs1 and /home/smokey are hosted on our file server. Data in /scrs1 are deleted once they are 45 days old; /home/smokey/data is not currently part of this automatic compression/removal schedule. The entire file server disk is mirrored daily, and the mirror is permanently archived to tape twice a month. This means that all data on /scrs1 are backed up to tape at least twice before being deleted.

  • Verifying data integrity using the md5sum.txt file in the /scrs1 data directories:

    When you save data with SpeX or MORIS, a checksum hash is appended to the file md5sum.txt. The checksum entry for a file looks like this:

    a3991d5cca9945469297d52c7c4a4833 arc-00119.a.fits

    When transferring your data, you should also copy this md5sum.txt file and use it to verify the integrity of the files. The md5sum program checks that a file has not changed as a result of a faulty file transfer. It is installed on most Linux and Unix-like operating systems (on Mac OS X and BSD it may be called 'md5').
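    On systems that only provide 'md5', the list-checking mode used by 'md5sum -c' may not be available, so each entry has to be compared by hand. The following is a minimal sketch of one way to do that, not an IRTF-supplied tool. The function names md5_of and verify_md5_list are hypothetical, the 'md5 -q' fallback is an assumption about the BSD/Mac tool, and filenames are assumed to contain no whitespace.

```shell
# Print the MD5 hash of one file, using whichever tool is installed.
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | awk '{ print $1 }'
    else
        md5 -q "$1"    # assumed BSD/Mac fallback; -q prints only the hash
    fi
}

# Compare every "hash filename" entry in a checksum list with the local files.
# Prints "<name>: FAILED" for each mismatch or missing file and returns
# non-zero if anything failed. Assumes filenames contain no whitespace.
verify_md5_list() {
    list="${1:-md5sum.txt}"
    status=0
    while read -r expected name; do
        [ -n "$name" ] || continue      # skip blank or malformed lines
        name="${name#\*}"               # drop the '*' binary-mode marker if present
        if [ ! -f "$name" ] || [ "$(md5_of "$name")" != "$expected" ]; then
            echo "$name: FAILED"
            status=1
        fi
    done < "$list"
    return "$status"
}
```

    Run 'verify_md5_list md5sum.txt' from the directory holding the copied files; it prints nothing when every file matches.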

    Below is an example of using sftp to copy your data from stefan, then using md5sum to verify the transfer:

    myhost$ sftp 2015A999@stefan
    sftp> cd /scrs1/bigdog/2015A999/150514
    sftp> mget *
    sftp> bye

    myhost$ md5sum --quiet -c md5sum.txt

    The '--quiet' option suppresses output for successfully verified files. If any files fail the checksum, you will see a message like:

    data-00006.a.fits: FAILED
    md5sum: WARNING: 1 of 10 computed checksums did NOT match

    If you get this error, try re-copying the files from the IRTF server. Also remember that if you overwrote a file with the same name while observing, md5sum.txt may contain duplicate entries for that filename.
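    Duplicate entries can make a good file report FAILED against its older checksum. As a rough sketch (the function names are hypothetical, and filenames are assumed to contain no whitespace), the helpers below list filenames that occur more than once and build a filtered list that keeps only the last entry per file. Treating the last entry as the one that matches the file on disk is an assumption, not documented behavior.

```shell
# List filenames that appear more than once in a checksum file.
list_duplicate_entries() {
    awk '{ name = $2; sub(/^\*/, "", name); print name }' "${1:-md5sum.txt}" |
        sort | uniq -d
}

# Print the checksum file with only the LAST entry kept for each filename
# (assumption: the last entry belongs to the version of the file on disk).
# Output order follows the first appearance of each filename.
keep_last_entries() {
    awk '{
        name = $2; sub(/^\*/, "", name)
        if (!(name in seen)) { seen[name] = 1; order[++n] = name }
        last[name] = $0
    }
    END { for (i = 1; i <= n; i++) print last[order[i]] }' "${1:-md5sum.txt}"
}
```

    For example, 'keep_last_entries md5sum.txt > md5sum.last.txt' followed by 'md5sum --quiet -c md5sum.last.txt' checks each file once, against its most recent entry.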

    The summit is a harsh environment for people as well as computers. We have found that the file server at the summit can produce hardware errors when saving or reading a file. We have seen roughly one in 1000 files with a bit error.

  • Getting your data from the IRTF to your home institution:

    There are two options for transferring data to your home institution:

    • Download the data to your laptop while at IRTF

      You can download your data to your laptop while at the IRTF by:

      • using ftp, sftp, scp, rsync
      • mounting /scrs1 as a virtual drive on your Windows laptop and copying the files over to your laptop

    • Transfer your data via the network to your home institution using sftp, scp, or rsync

      You can transfer your data to your home institution from the IRTF:

      • using ftp, sftp, scp, or rsync if you are on the IRTF network. Our firewall does not block outgoing ftp or ssh connections.
      • using sftp, scp, or rsync if you are outside the IRTF network. Our firewall blocks all incoming connections except ssh to specific computers (stefan/boltzmann).
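    Because sftp, scp, and rsync all run over ssh, a transfer and its checksum verification can be combined into one step. The sketch below is illustrative only: fetch_and_verify is a hypothetical helper, the account and directory are taken from the sftp example earlier on this page, and from outside the IRTF network you would need the full hostname of stefan or boltzmann, which is not given here. It also assumes the GNU md5sum options shown above.

```shell
# Hypothetical helper: copy one night's directory with rsync over ssh,
# then verify the copy against the md5sum.txt that came with it.
#   $1  rsync source, e.g. 2015A999@stefan:/scrs1/bigdog/2015A999/150514/
#   $2  local destination directory (created if missing)
fetch_and_verify() {
    remote="$1"
    dest="$2"
    mkdir -p "$dest" &&
    rsync -av -e ssh "$remote" "$dest"/ &&
    ( cd "$dest" && md5sum --quiet -c md5sum.txt )
}
```

    The trailing slash on the source makes rsync copy the contents of the directory rather than the directory itself. If the verification reports a FAILED file, running the same command again re-copies only the files that differ in size or modification time, so a corrupted file of unchanged size may need to be deleted locally first.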

maintained by Miranda Hawarden-Ogata
Last modified May 14, 2015