lsdb.dask.join_catalog_data

Module Contents

Functions

rename_columns_with_suffixes(left, right, suffixes)

Renames the columns of two dataframes with the specified suffixes

perform_join_on(left, right, right_margin, left_pixel, ...)

Performs a join on two catalog partitions

perform_join_through(left, right, right_margin, ...)

Performs a join on two catalog partitions through an association catalog

join_catalog_data_on(...)

Joins two catalogs spatially on a specified column

join_catalog_data_through(...)

Joins two catalogs with an association table

Attributes

NON_JOINING_ASSOCIATION_COLUMNS

NON_JOINING_ASSOCIATION_COLUMNS = ['Norder', 'Dir', 'Npix', 'join_Norder', 'join_Dir', 'join_Npix']
rename_columns_with_suffixes(left: pandas.DataFrame, right: pandas.DataFrame, suffixes: Tuple[str, str])

Renames the columns of two dataframes with the specified suffixes

Parameters:
  • left (pd.DataFrame) – the left dataframe to apply the first suffix to

  • right (pd.DataFrame) – the right dataframe to apply the second suffix to

  • suffixes (Tuple[str, str]) – the pair of suffixes to apply to the dataframes

Returns:

A tuple of (left, right) updated dataframes with their columns renamed
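As an illustration of the renaming described above, the following is a minimal pandas-only sketch, not the library's implementation. It assumes that each suffix is appended to every column of the corresponding dataframe; the helper name rename_with_suffixes_sketch and the sample columns are hypothetical.

```python
from typing import Tuple

import pandas as pd


def rename_with_suffixes_sketch(
    left: pd.DataFrame, right: pd.DataFrame, suffixes: Tuple[str, str]
) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """Hypothetical stand-in: append one suffix to every column of each frame."""
    left_suffix, right_suffix = suffixes
    # add_suffix returns new frames, so the inputs are left unmodified
    return left.add_suffix(left_suffix), right.add_suffix(right_suffix)


# Made-up sample data for illustration only
left = pd.DataFrame({"id": [1, 2], "ra": [10.0, 20.0]})
right = pd.DataFrame({"id": [1, 2], "mag": [15.2, 17.8]})

left_renamed, right_renamed = rename_with_suffixes_sketch(
    left, right, ("_left", "_right")
)
print(list(left_renamed.columns))
print(list(right_renamed.columns))
```

Suffixing both frames up front means that columns sharing a name, such as id here, stay distinguishable after a merge.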

perform_join_on(left: pandas.DataFrame, right: pandas.DataFrame, right_margin: pandas.DataFrame, left_pixel: hipscat.pixel_math.HealpixPixel, right_pixel: hipscat.pixel_math.HealpixPixel, right_margin_pixel: hipscat.pixel_math.HealpixPixel, left_structure: hipscat.catalog.Catalog, right_structure: hipscat.catalog.Catalog, right_margin_structure: hipscat.catalog.Catalog, left_on: str, right_on: str, suffixes: Tuple[str, str], right_columns: List[str])

Performs a join on two catalog partitions

Parameters:
  • left (pd.DataFrame) – the left partition to merge

  • right (pd.DataFrame) – the right partition to merge

  • right_margin (pd.DataFrame) – the right margin partition to merge

  • left_pixel (HealpixPixel) – the HEALPix pixel of the left partition

  • right_pixel (HealpixPixel) – the HEALPix pixel of the right partition

  • right_margin_pixel (HealpixPixel) – the HEALPix pixel of the right margin partition

  • left_structure (hc.Catalog) – the hipscat structure of the left catalog

  • right_structure (hc.Catalog) – the hipscat structure of the right catalog

  • right_margin_structure (hc.Catalog) – the hipscat structure of the right margin catalog

  • left_on (str) – the column to join on from the left partition

  • right_on (str) – the column to join on from the right partition

  • suffixes (Tuple[str,str]) – the suffixes to apply to each partition’s column names

  • right_columns (List[str]) – the columns to include from the right margin partition

Returns:

A dataframe with the result of merging the left and right partitions on the specified columns
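The partition-level join can be pictured with plain pandas. The sketch below is an assumption about the general shape of the operation, not the library's code: it appends the margin rows to the right partition, suffixes both sides, and merges on the suffixed join columns. It omits the HEALPix pixel alignment that the pixel and structure arguments are used for, and all data and column names are made up.

```python
import pandas as pd

# Made-up partitions; obj_id plays the role of left_on and right_on
left = pd.DataFrame({"obj_id": [1, 2, 3], "ra": [10.0, 10.5, 11.0]})
right = pd.DataFrame({"obj_id": [1, 2], "mag": [15.2, 17.8]})
right_margin = pd.DataFrame({"obj_id": [3], "mag": [19.1]})

suffixes = ("_left", "_right")
right_columns = ["obj_id", "mag"]

# Assumption: margin rows are appended to the right partition before merging,
# restricted to the columns requested in right_columns
right_joined = pd.concat(
    [right, right_margin[right_columns]], ignore_index=True
)

# Suffix every column so names from the two sides cannot collide
left_renamed = left.add_suffix(suffixes[0])
right_renamed = right_joined.add_suffix(suffixes[1])

# Inner merge on the suffixed join columns
merged = left_renamed.merge(
    right_renamed,
    left_on="obj_id" + suffixes[0],
    right_on="obj_id" + suffixes[1],
)
print(merged)
```

In this sketch the row with obj_id 3 finds its match only because the margin rows were included, which is the reason a margin partition is passed alongside the right partition.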

perform_join_through(left: pandas.DataFrame, right: pandas.DataFrame, right_margin: pandas.DataFrame, through: pandas.DataFrame, left_pixel: hipscat.pixel_math.HealpixPixel, right_pixel: hipscat.pixel_math.HealpixPixel, right_margin_pixel: hipscat.pixel_math.HealpixPixel, through_pixel: hipscat.pixel_math.HealpixPixel, left_catalog: hipscat.catalog.Catalog, right_catalog: hipscat.catalog.Catalog, right_margin_catalog: hipscat.catalog.Catalog, assoc_catalog: hipscat.catalog.AssociationCatalog, suffixes: Tuple[str, str], right_columns: List[str])

Performs a join on two catalog partitions through an association catalog

Parameters:
  • left (pd.DataFrame) – the left partition to merge

  • right (pd.DataFrame) – the right partition to merge

  • right_margin (pd.DataFrame) – the right margin partition to merge

  • through (pd.DataFrame) – the association catalog partition to merge with

  • left_pixel (HealpixPixel) – the HEALPix pixel of the left partition

  • right_pixel (HealpixPixel) – the HEALPix pixel of the right partition

  • right_margin_pixel (HealpixPixel) – the HEALPix pixel of the right margin partition

  • through_pixel (HealpixPixel) – the HEALPix pixel of the association partition

  • left_catalog (hc.Catalog) – the hipscat structure of the left catalog

  • right_catalog (hc.Catalog) – the hipscat structure of the right catalog

  • right_margin_catalog (hc.Catalog) – the hipscat structure of the right margin catalog

  • assoc_catalog (hc.AssociationCatalog) – the hipscat structure of the association catalog

  • suffixes (Tuple[str,str]) – the suffixes to apply to each partition’s column names

  • right_columns (List[str]) – the columns to include from the right margin partition

Returns:

A dataframe with the result of merging the left and right partitions on the specified columns
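A join through an association table amounts to two chained merges. The pandas sketch below illustrates the idea under stated assumptions and is not the library's implementation: the association column names primary_id and join_id are hypothetical, the data is made up, margin handling and pixel alignment are omitted, and the partitioning columns listed in NON_JOINING_ASSOCIATION_COLUMNS are assumed to be dropped before merging.

```python
import pandas as pd

NON_JOINING_ASSOCIATION_COLUMNS = [
    "Norder", "Dir", "Npix", "join_Norder", "join_Dir", "join_Npix"
]

# Made-up partitions, already suffixed
left = pd.DataFrame({"id": [1, 2], "ra": [10.0, 20.0]}).add_suffix("_left")
right = pd.DataFrame({"id": [7, 8], "mag": [15.2, 17.8]}).add_suffix("_right")

# Hypothetical association partition linking left ids to right ids
through = pd.DataFrame({
    "primary_id": [1, 2],
    "join_id": [8, 7],
    "Norder": [0, 0], "Dir": [0, 0], "Npix": [4, 4],
    "join_Norder": [0, 0], "join_Dir": [0, 0], "join_Npix": [4, 4],
})

# Keep only the columns that carry the left-to-right links
links = through.drop(columns=NON_JOINING_ASSOCIATION_COLUMNS)

# Merge left -> links -> right, then drop the link columns
merged = (
    left.merge(links, left_on="id_left", right_on="primary_id")
    .merge(right, left_on="join_id", right_on="id_right")
    .drop(columns=["primary_id", "join_id"])
)
print(merged)
```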

join_catalog_data_on(left: lsdb.catalog.catalog.Catalog, right: lsdb.catalog.catalog.Catalog, left_on: str, right_on: str, suffixes: Tuple[str, str]) → Tuple[dask.dataframe.core.DataFrame, lsdb.types.DaskDFPixelMap, hipscat.pixel_tree.PixelAlignment]

Joins two catalogs spatially on a specified column

Parameters:
  • left (lsdb.Catalog) – the left catalog to join

  • right (lsdb.Catalog) – the right catalog to join

  • left_on (str) – the column to join on from the left catalog

  • right_on (str) – the column to join on from the right catalog

  • suffixes (Tuple[str,str]) – the suffixes to apply to each partition’s column names

Returns:

A tuple of the dask dataframe with the result of the join, the pixel map from HEALPix pixel to partition index within the dataframe, and the PixelAlignment of the two input catalogs.
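The pixel map in the returned tuple can be pictured as a dictionary from HEALPix pixel to partition index. The pure-Python sketch below uses hypothetical stand-ins (PixelSketch for the pixel type, a list of strings in place of dask partitions, and made-up pixel values) to show how such a map locates the partition holding a given pixel's joined rows.

```python
from typing import Dict, List, NamedTuple


class PixelSketch(NamedTuple):
    """Hypothetical stand-in for a HEALPix pixel: an order and a pixel number."""
    order: int
    pixel: int


# Made-up pixels, one per joined partition
aligned_pixels = [PixelSketch(1, 44), PixelSketch(1, 45), PixelSketch(0, 6)]

# Strings stand in for the partitions of the joined dataframe
partitions: List[str] = [
    f"joined rows for order {p.order}, pixel {p.pixel}" for p in aligned_pixels
]

# The pixel map: HEALPix pixel -> index of its partition
pixel_map: Dict[PixelSketch, int] = {
    pixel: index for index, pixel in enumerate(aligned_pixels)
}


def partition_for(pixel: PixelSketch) -> str:
    """Look up the partition that holds the given pixel's joined rows."""
    return partitions[pixel_map[pixel]]


print(partition_for(PixelSketch(0, 6)))
```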

join_catalog_data_through(left: lsdb.catalog.catalog.Catalog, right: lsdb.catalog.catalog.Catalog, association: lsdb.catalog.association_catalog.AssociationCatalog, suffixes: Tuple[str, str]) → Tuple[dask.dataframe.core.DataFrame, lsdb.types.DaskDFPixelMap, hipscat.pixel_tree.PixelAlignment]

Joins two catalogs with an association table

Parameters:
  • left (lsdb.Catalog) – the left catalog to join

  • right (lsdb.Catalog) – the right catalog to join

  • association (AssociationCatalog) – the association catalog to join the catalogs with

  • suffixes (Tuple[str,str]) – the suffixes to apply to each partition’s column names

Returns:

A tuple of the dask dataframe with the result of the join, the pixel map from HEALPix pixel to partition index within the dataframe, and the PixelAlignment of the two input catalogs.