Computing photometry and SEDs of galaxies in a diffsky mock

This notebook shows how to compute photometry through arbitrary bands of galaxies in a diffsky mock stored in the opencosmo format, and how to compute high-res SEDs.

Loading the mock data

The next two cells cells show how to load the mock files that are natively produced by the diffsky source code. For purposes of this demo, we will just work with a redshift slice, which will be downloaded in the next cell.

Notes for beta testers:

  1. You must be using opencosmo 1.1.0 or above and the opencosmo-utils branch of diffsky for this notebook to work correctly.

  2. This notebook will eventually use the cell below to download a small opencosmo-formatted mock catalog for testing/example purposes. For now, just ignore it. Instead, the notebook will be written as though you are working with one of synthetic catalogs on Perlmutter

[1]:
# ! wget -P ./download_dir -q -r -e robots=off -np -nH --cut-dirs=7 --reject "index.html*" https://portal.nersc.gov/project/hacc/aphearin/diffsky_data/smdpl_dr1_12_03_2025
[2]:
import numpy as np
from matplotlib import pyplot as plt
[3]:
from diffsky.data_loaders.opencosmo_utils import load_diffsky_mock
# Change to the directory of your favorite mock on Perlmutter
directory = "oc_download_dir"
[4]:
catalog, aux_data = load_diffsky_mock(directory)
/home/docs/checkouts/readthedocs.org/user_builds/diffsky/conda/latest/lib/python3.12/site-packages/diffsky/data_loaders/opencosmo_utils/load.py:62: UserWarning: Version checking is disabled (mode = None in versions.py). Photometry and SED calculations may not work quite right if your versions of the diffsky libraries do not match those used to generate this catalog.
  check_versions(catalog, version_check)

Inspect the mock data

The catalog data is returned as an OpenCosmo dataset. The OpenCosmo toolkit provides extensive tooling for querying and analyzing diffsky data, allowing you to select the specific types of objects you are interested in and prototype analyses on small samples before scaling them up. More details on the toolkit and its capabilities are available in the documentation.

Additionally, the load_diffsky_mock function returns auxilliary data which is used by diffsky to compute photometry or SEDs. The most important thing in this auxilliary data is the transmission curves, which determines which bands you can compute photometry for.

[5]:
# print("Transmission curves:", aux_data["tcurves"]._fields)
# print(catalog)
# # print(catalog.column

The full catalog contains millions of objects. We will select a much smaller subset for the purposes of this demo.

[6]:
catalog = catalog.with_redshift_range(0.0, 0.1).take(300)

Recompute mock photometry

You can use the compute_phot_from_diffsky_mock function to compute mock photometry for any of the available bands. This function will compute photometry for the lsst_i and lsst_u bands, and insert them into the catalog as “lsst_i_new” and “lsst_u_new” (to avoid clashing with the exisiting lsst_i and lsst_u columns). It will then check that the computed photometry is very close to the original photometry present in the catalog.

[7]:
from diffsky.data_loaders.opencosmo_utils import compute_phot_from_diffsky_mock
catalog = compute_phot_from_diffsky_mock(catalog, aux_data, bands = ["lsst_u", "lsst_i"], insert = True)
photometry = catalog.select(("lsst_u", "lsst_i", "lsst_u_new", "lsst_i_new")).get_data("numpy")
assert np.allclose(photometry['lsst_u'], photometry['lsst_u_new'], rtol=1e-3)
assert np.allclose(photometry['lsst_i'], photometry['lsst_i_new'], rtol=1e-3)

Recompute photometry of disk/bulge/knot decomposition

The photometry of each morphological component can be calculated using the compute_dbk_phot_from_diffsky_mock function.

[8]:
from diffsky.data_loaders.opencosmo_utils import compute_dbk_phot_from_diffsky_mock
catalog = compute_dbk_phot_from_diffsky_mock(catalog, aux_data, bands = ["lsst_u", "lsst_i"], insert = True)
photometry = catalog.select(("lsst_u_bulge", "lsst_i_bulge", "lsst_u_bulge_new", "lsst_i_bulge_new")).get_data("numpy")

assert np.allclose(photometry['lsst_u_bulge'], photometry['lsst_u_bulge_new'], rtol=1e-3)
assert np.allclose(photometry['lsst_i_bulge'], photometry['lsst_i_bulge_new'], rtol=1e-3)

Computing photometry in other bands

You can compute photometry in and band you like by simply providing new transmission curves

The next few cells show how to compute photometry through two fake bandpasses. We provide convinience functions for adding new transmission curves into the auxilliary data. This function expects tuples of (wavelength, tranmission_curve).

If you are interested in writing your updated data out in opencosmo format so you can come back to it later, you can do so with:

import opencosmo as oc
oc.write("path/to/output.hdf5", catalog)
[9]:
from jax.scipy.stats import norm as jnorm
from collections import namedtuple
from diffsky.data_loaders.opencosmo_utils import add_transmission_curves

wave = np.linspace(200, 8_000, 500)
fake_band_1 = jnorm.pdf(wave, loc=3_000.0, scale=500.0)
fake_band_2 = jnorm.pdf(wave, loc=5_000.0, scale=500.0)

# transmission curves will take their argument name
aux_data = add_transmission_curves(aux_data, fake_band_1 = (wave, fake_band_1), fake_band_2 = (wave, fake_band_2))

fig, ax = plt.subplots(1, 1)
__=ax.plot(wave, fake_band_1)
__=ax.plot(wave, fake_band_2)

# You should now see the new tranmission curves in the list of tranmission curves
print(aux_data["tcurves"]._fields)
('lsst_u', 'lsst_g', 'lsst_r', 'lsst_i', 'lsst_z', 'lsst_y', 'roman_F062', 'roman_F087', 'roman_F106', 'roman_F129', 'roman_F158', 'roman_F184', 'roman_F146', 'roman_F213', 'roman_Prism', 'roman_Grism_1stOrder', 'roman_Grism_0thOrder', 'fake_band_1', 'fake_band_2')
_images/demo_diffsky_recompute_from_mock_opencosmo_14_1.png
[10]:
from diffsky.data_loaders.opencosmo_utils import compute_dbk_phot_from_diffsky_mock
catalog = compute_phot_from_diffsky_mock(catalog, aux_data, bands = ["fake_band_1", "fake_band_2"], insert = True)
fake_magnitudes = catalog.select(("fake_band_1", "fake_band_2")).get_data("numpy")

fig, ax = plt.subplots(1, 1)

__=ax.scatter(fake_magnitudes["fake_band_1"], fake_magnitudes["fake_band_2"], s=1)
xlabel = ax.set_xlabel("fake band 1")
ylabel = ax.set_ylabel("fake band 2")
_images/demo_diffsky_recompute_from_mock_opencosmo_15_0.png

Compute SED of diffsky galaxies

There is also a convenience function for computing the high-resolution SED of each diffsky galaxy. Be aware this computation is memory-intensive, and it is recommended you only compute it for at most ~1000 galaxies at a time.

We can set a batch size to limit the number of galaxies that are computed at the same time. The photometry functions used above also support the batch_size argument. If the batch_size argument is not set, no batching gets performed.

OpenCosmo can handle multi-dimension columns (such as an SED). However for this example we will set insert = False which just directly returns the SED to us instead of inserting it into the catalog.

Notes for beta testers:

  • opencosmo 1.1 includes the ability to automatically batch this computation over subsets of a much larger catalog. For the moment, we will just select a smaller subset of this data for example purposes

  • This cell will ocassionally fail for reasons we are yet to track down, but seem to be related to the specific subset of rows selected in the take call. If this happens, just run the cell again to select a different sample of galaxies.

[11]:
from diffsky.data_loaders.opencosmo_utils import compute_seds_from_diffsky_mock
sed_catalog = catalog.take(50)
seds = compute_seds_from_diffsky_mock(sed_catalog, aux_data, insert=False, batch_size=25)
[12]:
fig, ax = plt.subplots(1, 1)
__=ax.loglog()
xlim = ax.set_xlim(2_000, 200_000)
ylim = ax.set_ylim(1e-8, 1e-3)

igal = 5
__=ax.plot(aux_data['ssp_data'].ssp_wave, seds["rest_sed"][igal, :])
xlabel = ax.set_xlabel('wavelength [angstroms]')
ylabel = ax.set_ylabel('Lsun/Hz')
_images/demo_diffsky_recompute_from_mock_opencosmo_18_0.png

Compute SED of disk/bulge/knot components

There is an additional convenience function for computing the high-resolution SED of each morphological component.

Note to beta testers: This tool is not currently working

[13]:
# from diffsky.data_loaders.opencosmo_utils import compute_dbk_seds_from_diffsky_mock
# dbk_sed_info = compute_dbk_seds_from_diffsky_mock(sed_catalog, aux_data, bands = ["lsst_u", "lsst_i"], insert=False)
[14]:
# fig, ax = plt.subplots(1, 1)
# __=ax.loglog()
# xlim = ax.set_xlim(2_000, 200_000)
# ylim = ax.set_ylim(1e-8, 1e-4)

# igal = 50
# __=ax.plot(diffsky_lc_patch['ssp_data'].ssp_wave, dbk_sed_info['rest_sed_bulge'][igal, :])
# __=ax.plot(diffsky_lc_patch['ssp_data'].ssp_wave, dbk_sed_info['rest_sed_disk'][igal, :])
# __=ax.plot(diffsky_lc_patch['ssp_data'].ssp_wave, dbk_sed_info['rest_sed_knots'][igal, :])
# xlabel = ax.set_xlabel('wavelength [angstroms]')
# ylabel = ax.set_ylabel('Lsun/Hz')
[ ]: