Harvard-Smithsonian Center for Astrophysics

Sloan Digital Sky Survey at the CFA Data Release 6

Spectral Data

In the database directory /data/astrocat/SDSS-dr6 are the SpecObjAll.db and SpecPhoto.db tables. These databases are subsets of the SpecObjAll view of the SDSS spectral database. The SpecObjAll.db table contains all the rows (but not all the columns) of the SpecObjAll view, while the SpecPhoto.db rows come from the SpecPhoto view, which keeps only SciencePrimary rows: it eliminates rows that are flagged as poor quality, are duplicates from earlier data releases, or do not have specPrimary == 1. Both databases are indexed in RA and Dec for quicker searching for spectra near a particular coordinate.
The spectral reductions are from the Princeton/Sloan database, for which the Sloan reductions were redone. The Princeton info page is:
http://spectro.princeton.edu
There are 1,223,015 rows in the SpecPhoto.db table and 1,282,560 in the full set in SpecObjAll.db. These numbers from the Princeton data set are slightly different from those given by the Sloan server. The CFA database files are 489 MB and 512 MB, respectively.
The subset of columns is the same in both databases:

specObjID objID DR progName
ra dec primTarget secTarget specPrimary
plate mjd fiberID plqual plsn2
class subclass objType waveMin waveMax wcoverage
z z_err zwarning RChi2 RChi2Diff
vdisp vdisp_err vdispChi2 vdispZ vdispZ_err
firstRA firstDec firstWarning firstFInt firstFPeak firstRMS
firstMaj firstMin firstFPA firstSkyRMS firstMatchDist firstNmatch
tmassRa tmassDec tmassErrMaj tmassErrMin tmassErrAng
tmassJ tmassJIvar tmassH tmassHIvar tmassK tmassKIvar
tmassPhQual tmassRdFlg tmassBlflg tmassCcFlg tmassGalContam
tmassMpflg tmassJdate tmassDupSrc tmassMatchDist tmassNmatch
usnoRA usnoDec usnoRmag usnoBmag usnoMatchDist usnoNmatch

The DR column gives which data release this row is from, i.e., one of: DR0, DR1, DR2, DR3, DR4 or DR6 (where I have renamed EDR to DR0).
Full information on the meaning of these columns is found at:
http://spectro.princeton.edu/#dm_spplate
See also their information on known problems. More information is also available from the SDSS web site:
http://cas.sdss.org/dr6/en/help/browser/browser.asp?n=SpecObj&t=U

Searching the databases
The row and search commands from the Starbase package can be used to search for rows with specific objects or values. The URL for the full Starbase documentation is:
http://cfa-www.harvard.edu/~john/starbase/starbase.html
Briefly, row is a front-end for awk statements where the column names substitute for $1, $2, etc., so the user does not have to know the column order of the database. The command to pull all primary observations from the SpecObjAll.db table is:
row 'specPrimary == 1 && primTarget != 0 && secTarget == 0' <SpecObjAll.db
Note that this does not eliminate the duplications among the data releases, so it is still somewhat larger than the SpecPhoto.db dataset.
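To make the column-name idea concrete, here is a minimal Python sketch of the same selection. It assumes the simple table layout produced by the printf examples on this page (a header line, a dashed line, then tab-separated rows). The names read_table and select_primary and the sample rows are hypothetical illustrations, not part of Starbase.

```python
def read_table(text):
    """Parse a minimal tab-separated table: header line, dash line, data rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    names = lines[0].split("\t")
    # lines[1] is the dashed separator line, so data starts at lines[2]
    return [dict(zip(names, ln.split("\t"))) for ln in lines[2:]]


def select_primary(rows):
    """Apply the same test as the row command, addressing columns by name."""
    return [
        r for r in rows
        if int(r["specPrimary"]) == 1
        and int(r["primTarget"]) != 0
        and int(r["secTarget"]) == 0
    ]


# Made-up rows for illustration only; not taken from SpecObjAll.db.
sample = (
    "specObjID\tspecPrimary\tprimTarget\tsecTarget\n"
    "---------\t-----------\t----------\t---------\n"
    "1\t1\t64\t0\n"
    "2\t0\t64\t0\n"
    "3\t1\t0\t32\n"
)
selected = select_primary(read_table(sample))
print([r["specObjID"] for r in selected])
```

Only the first sample row passes all three conditions. For the real tables the row command remains the practical choice; this sketch only shows what the expression tests.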
To find all the objects with spectra near one or more coordinates, first build a small tab-separated Starbase table containing the RA and Dec of the center(s) of the region to be searched. [see the discussion above] The example below uses one coordinate, but more are possible (the SDSS coordinates are all in degrees):
printf "ra\tdec\n-\t-\n190.\t50.\n" |                  [this makes the table]
search SpecObjAll.db -S2ddd ra dec 0.05 |              [this performs the search]
justify ra%.4f dec%.4f class subclass z z_err objID    [project & format output]
It happens that the search returns a single object, a star-forming galaxy at z = 0.08. The output from above is:
ra       dec     class  subclass    z     z_err objID
-------- ------- ------ ----------- ----- ----- ------------------
189.9365 50.0143 GALAXY STARFORMING 0.084 0.000 587735430151929950
And, since the objID is non-zero, the object is also in the photometric database PhotoPrim.db. To access that data row, you could run row over the entire database, testing for the correct objID, but that would take hours. Instead, use search again and scan its much smaller output:
printf "ra\tdec\n-\t-\n190.\t50.\n" |         [this makes the table]
search PhotoPrim.db -S2ddd ra dec 0.05 |      [this performs the search]
row 'objID == "587735430151929950"' |         [select the object]
justify ra%.4f dec%.4f r Err_r%.3f i Err_i%.3f isoA_r isoB_r isoPhi_r
returns:
ra       dec     r      Err_r i      Err_i isoA_r isoB_r isoPhi_r
-------- ------- ------ ----- ------ ----- ------ ------ --------
189.9365 50.0143 17.272 0.007 16.996 0.008 21.982 17.175 170.37
The search returns 165 rows in less than a second, so the row command has a far smaller table to check.
Better still, since the objID column is a secondary index, the PhotoPrim database should be searched directly:
search PhotoPrim.db -V objID 587735430151929950 | justify ra%.4f dec%.4f r Err_r%.3f i Err_i%.3f isoA_r isoB_r isoPhi_r
and the output is the same as before.
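As a cross-check on the coordinate search, the separation between the search center (190, 50) and the returned object can be computed by hand. The Python sketch below uses the standard haversine formula. Whether search measures distance in exactly this way is an assumption, and angular_separation_deg is a hypothetical helper name.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))


# Search center and the object returned by the example search.
sep = angular_separation_deg(190.0, 50.0, 189.9365, 50.0143)
print(f"{sep:.4f} deg")
```

The separation comes out at about 0.043 degrees, which is inside the 0.05 degree value given to search. The raw RA difference (0.0635 degrees) is larger than 0.05, so the cos(dec) factor matters at this declination.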

Joins between the spectral and photometric databases
The search program takes an input table, using one of its columns as values to search. The output rows can be joined with the input rows using the -j option to search.
For example, if the search in the SpecPhoto database is:
printf "233.45. 54.75." | fldtotable ra dec | search SpecPhoto.db -S2ddd ra dec 0.075 | justify specObjID objID ra%.4f dec%.4f class z z_err
then the output is:
specObjID          objID              ra       dec     class  z      z_err
------------------ ------------------ -------- ------- ------ ------ --------
173055145871933440 588007004725117130 233.4319 54.7478 GALAXY 0.1179 0.000042
173613531751514112 588007004725117173 233.5103 54.7107 GALAXY 0.1531 0.000035
173613531772485632 588007004725117186 233.5444 54.7246 GALAXY 0.1173 0.000028
This can be combined into a search on these objects in the PhotoPrim.db database, yielding columns from both. If a column name is duplicated between the two databases (other than the join field), the input database columns have "_1" appended to them and the searched database columns have "_2" appended. In the example below, "z" in SpecPhoto is the redshift, but "z" in PhotoPrim is a Z-filter magnitude. Other columns such as ra and dec also conflict.
This pipeline of commands:
printf "233.45. 54.75." | fldtotable ra dec |
search SpecPhoto.db -S2ddd ra dec 0.075 |
search -j PhotoPrim.db objID |
compute 'ra_2 /= 15.' |
justify ra_2=ra%.1@ dec_2=dec%.0@ class petroMag_r petroMag_i \
    petroMag_r_i z_1=z%.3f zwarning=zw
generates:
ra         dec      class  petroMag_r petroMag_i petroMag_r_i z     zw
---------- -------- ------ ---------- ---------- ------------ ----- --
15:33:43.6 54:44:52 GALAXY 17.455     16.944     0.499152     0.118 0
15:34:02.3 54:42:38 GALAXY 17.294     16.852     0.430647     0.153 0
15:34:10.6 54:43:28 GALAXY 17.565     17.17      0.38362      0.117 0
Note that the "z" from the input SpecPhoto.db table was renamed to "z_1" because of the join conflict with "z" from PhotoPrim.db. The justify step restored its original name and ignored the "z" magnitude from the PhotoPrim.db table, which became "z_2".
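The renaming rule described above can be illustrated with a short Python sketch. This is not how search -j is implemented; join_rows is a hypothetical helper that only mimics the "_1"/"_2" naming, and the PhotoPrim z magnitude in the sample is made up.

```python
def join_rows(left, right, key):
    """Merge two rows (dicts) on `key`, renaming clashing columns.

    Clashing names other than the join key get "_1" (input table)
    and "_2" (searched table), following the rule in the text.
    """
    clashes = (set(left) & set(right)) - {key}
    merged = {key: left[key]}
    for name, value in left.items():
        if name != key:
            merged[name + "_1" if name in clashes else name] = value
    for name, value in right.items():
        if name != key:
            merged[name + "_2" if name in clashes else name] = value
    return merged


# First row of the SpecPhoto example; the PhotoPrim z magnitude is invented.
spec = {"objID": "588007004725117130", "z": 0.1179, "class": "GALAXY"}
photo = {"objID": "588007004725117130", "z": 16.5, "petroMag_r": 17.455}
row = join_rows(spec, photo, "objID")
print(sorted(row))
```

The merged row has no plain "z" column: the redshift is under "z_1" and the magnitude under "z_2", while non-conflicting columns such as class and petroMag_r keep their names.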

Obtaining the spectra
The spectra are stored in the /data/astrocat archive as multispec FITS files, one for each plate. The easiest way to extract one or more of them is to use getspecid.
The getspecid program uses specObjID values from one of three sources: on the command line, input on standard input as a Starbase table, or input as a simple list on standard input.
For example, the search of SpecPhoto.db above yields three objects:
specObjID
------------------
173055145871933440
173613531751514112
173613531772485632
These can be extracted by piping this table into getspecid:
printf "233.45. 54.75." | fldtotable ra dec | search SpecPhoto.db -S2ddd ra dec 0.075 | getspecid -v
which prints:
Extract fiber 1 to ./spPlate-0614-53437-001.fits
Extract fiber 353 to ./spPlate-0616-52374-353.fits
Extract fiber 358 to ./spPlate-0616-52374-358.fits
while writing the files to the current directory. Without the "-v" option, it would silently write out the files. The FITS header is updated with the correct fiber RA and Dec, the source file name as "SPECFILE", the specObjID as "SPEC_ID" and the fiber ID as "FIBERID".
Sometimes it may be useful simply to extract the path to the file and fiber:
printf "233.45. 54.75." | fldtotable ra dec | search SpecPhoto.db -S2ddd ra dec 0.075 | getspecid -n
which prints:
/data/astrocat/SDSS-dr6/Spectro/spectro_DR6/0614/spPlate-0614-53437.fits 1
/data/astrocat/SDSS-dr6/Spectro/spectro_DR2/0616/spPlate-0616-52374.fits 353
/data/astrocat/SDSS-dr6/Spectro/spectro_DR2/0616/spPlate-0616-52374.fits 358
The following gives the details of the spectra storage hierarchy. For those who want more details on the program, see the discussion and examples below.
To create the full path to one of the FITS files, use the DR, plate and mjd values; the fiberID then selects the spectrum within that file. For example, DR=="DR1", plate=="0649", mjd=="52201" and fiberID=="320" give the full path:
/data/astrocat/SDSS-dr6/Spectro/spectro_DR1/0649/spPlate-0649-52201.fits
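The path construction can be sketched in Python as follows. The four-digit zero padding of the plate number is inferred from the example paths on this page, and plate_path and ARCHIVE_ROOT are hypothetical names used only for illustration.

```python
ARCHIVE_ROOT = "/data/astrocat/SDSS-dr6/Spectro"


def plate_path(dr, plate, mjd):
    """Full path to the plate FITS file for a given release, plate and MJD."""
    plate = int(plate)  # accepts 649 as well as "0649"
    return (f"{ARCHIVE_ROOT}/spectro_{dr}/{plate:04d}/"
            f"spPlate-{plate:04d}-{int(mjd)}.fits")


print(plate_path("DR1", "0649", "52201"))
```

The fiberID is not part of the path; it is the row of the spectrum inside the plate file.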
To actually get the spectrum for fiberID = 320 into its own file, one possibility is to use the imcopy command in IRAF, e.g. to get the 320th spectrum from one of these FITS files (see below for HDU explanation):
cl> imcopy file.fits[0][*,320] mydir/my.fits
(where "file.fits" represents the actual filename), or, using the funimage command from the FUNTOOLS package from a shell (single quotes prevent the shell from interpreting any characters as wildcards):
funimage file.fits'[0,*:*,320:320]' mydir/my.fits
The Princeton web site also has instructions on using IDL and SuperMONGO.
[ As of this writing, the SM links are broken due to a disk crash. Should be fixed ... soon? - WFW ]
In the Spectro subdirectory there is platelist.db, which lists every plate together with its data release and MJD.
Each plate FITS file has 7 extensions, as documented at the Princeton web page.
HDU #0 = Flux in units of 10^(-17) erg/s/cm^2/Ang [FLOAT]
HDU #1 = Inverse variance (1/sigma^2) for the above [FLOAT]
HDU #2 = AND mask [32-bit INT]
HDU #3 = OR mask [32-bit INT]
HDU #4 = Wavelength dispersion in pixels [FLOAT]
HDU #5 = Plug-map structure from plPlugMapM file [BINARY FITS TABLE]
HDU #6 = Average sky flux in units of 10^(-17) erg/s/cm^2/Ang [FLOAT]
Note that the spectra are in log-wavelength. To compute the wavelengths of each bin use the header values CRVAL1 and CD1_1. Below is an IDL code snippet contributed by Jenny Greene:
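pix1 = sxpar(header, 'CRVAL1', OOPS)   ; log10 of the wavelength at the first pixel
disp = sxpar(header, 'CD1_1', OOPS)    ; step in log10(wavelength) per pixel
l = fltarr(n_elements(respec))         ; respec is assumed to hold the spectrum
for k=0,n_elements(respec)-1 do begin
  l[k] = disp*k + pix1
endfor
l = 10^l                               ; convert from log10 to linear wavelength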
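For readers without IDL, the same calculation can be sketched in plain Python. The header values used here are illustrative assumptions, not read from a real plate file, and wavelengths is a hypothetical helper name.

```python
def wavelengths(crval1, cd1_1, npix):
    """Wavelength of each pixel for a spectrum that is linear in log10.

    crval1 is log10(wavelength) at the first pixel and cd1_1 is the
    step in log10(wavelength) per pixel, both taken from the FITS header.
    """
    return [10.0 ** (crval1 + cd1_1 * k) for k in range(npix)]


# Illustrative header values only (assumed, not from a real file).
wave = wavelengths(crval1=3.58, cd1_1=1.0e-4, npix=4)
print([round(w, 2) for w in wave])
```

Because the bins are evenly spaced in log10(wavelength), the ratio between neighbouring wavelengths is constant rather than their difference.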
