Transformation Tasks¶
ecoscope.platform.tasks.transformation ¶
Classes¶
BoundingBox ¶
Bases: BaseModel
Attributes¶
max_x (class-attribute, instance-attribute) ¶
max_x: Annotated[float, AdvancedField(default=180.0, title='Max Longitude')] = 180.0
max_y (class-attribute, instance-attribute) ¶
max_y: Annotated[float, AdvancedField(default=90.0, title='Max Latitude')] = 90.0
min_x (class-attribute, instance-attribute) ¶
min_x: Annotated[float, AdvancedField(default=-180.0, title='Min Longitude')] = -180.0
min_y (class-attribute, instance-attribute) ¶
min_y: Annotated[float, AdvancedField(default=-90.0, title='Min Latitude')] = -90.0
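As a rough illustration of how the four attributes describe a rectangle in longitude/latitude space, the sketch below uses a plain dataclass as a stand-in for the pydantic model. The `BoundingBoxSketch` class and its `contains` helper are hypothetical and not part of `BoundingBox`; the inclusive comparison is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBoxSketch:
    """Stand-in for BoundingBox with the same field names and defaults."""

    min_x: float = -180.0  # Min Longitude
    max_x: float = 180.0   # Max Longitude
    min_y: float = -90.0   # Min Latitude
    max_y: float = 90.0    # Max Latitude

    def contains(self, x: float, y: float) -> bool:
        # Hypothetical helper: inclusive point-in-rectangle test.
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y


world = BoundingBoxSketch()  # defaults cover the whole globe
east_africa = BoundingBoxSketch(min_x=28.0, max_x=42.0, min_y=-12.0, max_y=5.0)

inside_world = world.contains(36.8, -1.3)
inside_region = east_africa.contains(36.8, -1.3)
outside_region = east_africa.contains(0.0, 51.5)
```

With the default values every valid longitude/latitude pair falls inside the box, so a filter built on the defaults would keep all rows.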
Coordinate ¶
RenameColumn ¶
Functions:¶
add_spatial_index ¶
add_spatial_index(gdf: Annotated[TrajectoryGDF | EventGDF | EventsWithDisplayNamesGDF, Field(description='The dataframe to add the spatial index to.')], groupers: Annotated[AllGrouper | UserDefinedGroupers, Field(description='A list of groupers which may contain SpatialGroupers. If SpatialGroupers are present, additional indexes will be added to the `gdf` by taking a spatial join of each region in the SpatialGrouper, and adding the joined region name. If no SpatialGroupers are present, this task will return the input `gdf` unchanged. This parameter is excluded from the generated RJSF because it should only be set programmatically in the `spec.yaml` file. Note also that the type of this parameter is `AllGrouper | UserDefinedGroupers` to allow passing a list of any type of Grouper from upstream tasks in the DAG; any elements of the list which are not SpatialGrouper will simply be ignored here.', exclude=True)]) -> AnyGeoDataFrame
Source code in ecoscope/platform/tasks/transformation/_indexing.py
add_temporal_index ¶
add_temporal_index(df: Annotated[AnyDataFrame, Field(description='The dataframe to add the temporal index to.')], time_col: Annotated[str, Field(description='The name of the column containing time data.')], groupers: Annotated[AllGrouper | UserDefinedGroupers, Field(description='A list of groupers which may contain TemporalGroupers. If TemporalGroupers are present, additional indexes will be added to the `df` by formatting the `time_col` according to the `index_name` attribute of each TemporalGrouper. If no TemporalGroupers are present, this task will return the input `df` unchanged. This parameter is excluded from the generated RJSF because it should only be set programmatically in the `spec.yaml` file. Note also that the type of this parameter is `AllGrouper | UserDefinedGroupers` to allow passing a list of any type of Grouper from upstream tasks in the DAG; any elements of the list which are not TemporalGroupers will simply be ignored here.', exclude=True)], cast_to_datetime: Annotated[bool, AdvancedField(default=True, description='Whether to attempt casting `time_col` to datetime.')] = True, format: Annotated[str, AdvancedField(default='mixed', description='If `cast_to_datetime=True`, the format to pass to `pd.to_datetime` when attempting to cast `time_col` to datetime. Defaults to "mixed".')] = 'mixed') -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_indexing.py
apply_classification ¶
apply_classification(df: Annotated[AnyDataFrame, Field(description='The dataframe to classify.', exclude=True)], input_column_name: Annotated[str, Field(description='The dataframe column to classify.')], output_column_name: Annotated[str | SkipJsonSchema[None], Field(description='The dataframe column that will contain the classification values.')] = None, label_options: Annotated[DefaultLabels | CustomLabels, AdvancedField(default=None, description='Optional specification or formatting of classification values.')] = DefaultLabels(), classification_options: Annotated[ClassificationArgs, Field(description='Classification scheme and its arguments.')] = SharedArgs()) -> AnyDataFrame
Classifies a dataframe column using the specified classification scheme.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | The input data. | required |
| `input_column_name` | `str` | The dataframe column to classify. | required |
| `output_column_name` | `str` | The dataframe column that will contain the classification. Defaults to " | `None` |
| `labels` | `list[str]` | Labels of bins; bin edges are used if `labels` is `None`. | required |
| `label_options` | `DefaultLabels \| CustomLabels` | Optional specification or formatting of classification values. | `DefaultLabels()` |
| `classification_options` | `Annotated[ClassificationArgs, Field(description='Classification scheme and its arguments.')]` | Classification scheme and its arguments. Applicable to equal_interval, natural_breaks, quantile, max_breaks and fisher_jenks: `k` (int), the number of classes required. Applicable only to natural_breaks: `initial` (int), the number of initial solutions generated with different centroids; the best of the initial results is returned. Applicable only to max_breaks: `mindiff` (float), the minimum difference between class breaks. Applicable only to std_mean: `multiples` (numpy.array), the multiples of the standard deviation to add to or subtract from the sample mean to define the bins; `anchor` (bool), anchor the upper bound of one class to the sample mean. For more information, see https://pysal.org/mapclassify/api.html | `SharedArgs()` |

Returns:
| Type | Description |
|---|---|
| `AnyDataFrame` | The input dataframe with a classification column appended. |
Source code in ecoscope/platform/tasks/transformation/_classification.py
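To make the `k` argument concrete, here is a hand-rolled sketch of what an equal-interval scheme does: split the value range into `k` equal-width classes and assign each value to the first class whose upper edge it does not exceed. It does not call mapclassify or the task itself; `equal_interval_bins` and `classify` are hypothetical helper names, and the closed-on-the-right convention is an assumption.

```python
def equal_interval_bins(values: list[float], k: int) -> list[float]:
    """Upper edges of k equal-width classes spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * (i + 1) for i in range(k)]


def classify(value: float, upper_edges: list[float]) -> int:
    """Index of the first class whose upper edge is >= value."""
    for i, edge in enumerate(upper_edges):
        if value <= edge:
            return i
    return len(upper_edges) - 1  # guard against floating-point overshoot


speeds = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
edges = equal_interval_bins(speeds, k=5)
classes = [classify(v, edges) for v in speeds]
```

Other schemes such as quantile or natural breaks differ only in how the edges are chosen; the assignment step stays the same.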
apply_color_map ¶
apply_color_map(df: Annotated[AnyDataFrame, Field(description='The dataframe to apply the color map to.', exclude=True)], input_column_name: Annotated[str, Field(description='The name of the column with categorical values.')], colormap: Annotated[str | SkipJsonSchema[dict[ColorValue, HexColor]] | SkipJsonSchema[list[HexColor]], Field(description='A named matplotlib colormap.')] = 'viridis', output_column_name: Annotated[str | SkipJsonSchema[None], Field(description='The dataframe column that will contain the color values.')] = None) -> AnyDataFrame
Adds a color column to the dataframe based on the categorical values in the specified column.
Args:
df (pd.DataFrame): The input dataframe.
input_column_name (str): The name of the column with categorical values.
colormap (str): Either a named mpl.colormap or a list of string hex values.
output_column_name (str): The dataframe column that will contain the color values.
Defaults to "
Returns:
pd.DataFrame: The dataframe with an additional color column.
Source code in ecoscope/platform/tasks/transformation/_classification.py
apply_reloc_coord_filter ¶
apply_reloc_coord_filter(df: AnyGeoDataFrame, bounding_box: Annotated[BoundingBox | SkipJsonSchema[None], AdvancedField(default=BoundingBox(), description='Filter events to inside these bounding coordinates.')] = None, filter_point_coords: Annotated[list[Coordinate] | SkipJsonSchema[None], AdvancedField(default=[], title='Filter Exact Point Coordinates', description='By adding a filter, the workflow will not include events recorded at the specified coordinates.')] = None, roi_gdf: Annotated[AnyGeoDataFrame | SkipJsonSchema[None], AdvancedField(default=None, description='The ROI geopandas dataframe, in EPSG: 4326, indexed by ROI name')] = None, roi_name: Annotated[str | SkipJsonSchema[None], AdvancedField(default=None, description='The ROI name')] = None, reset_index: Annotated[bool | SkipJsonSchema[None], AdvancedField(default=True, description='Reset index after filtering')] = True) -> AnyGeoDataFrame
Source code in ecoscope/platform/tasks/transformation/_filtering.py
assign_subject_colors ¶
assign_subject_colors(df: AnyDataFrame, subject_id_column: Annotated[str, Field(description="Column containing subject identifiers (e.g., 'groupby_col', 'subject__id')")] = 'subject__id', additional_column: Annotated[str, Field(description="Column containing subject additional data as JSON (e.g., 'subject__additional')")] = 'subject__additional', output_column: Annotated[str, Field(description='Name of the output column for assigned colors')] = 'subject_color', fallback_strategy: Annotated[Literal['default_color', 'palette'], Field(description="Strategy for subjects with missing or duplicate rgb values: 'default_color' keeps original rgb (even duplicates) and uses default_color for missing; 'palette' assigns palette colors to both duplicates and missing")] = 'default_color', default_color: Annotated[str, AdvancedField(description="Hex color for subjects without rgb (used when fallback_strategy='default_color')", default='#FFFF00')] = '#FFFF00', default_palette: Annotated[str, AdvancedField(description="Color palette for fallback colors (used when fallback_strategy='palette')", default='tab20')] = 'tab20') -> AnyDataFrame
Assign colors to subjects based on rgb field from subject__additional JSON.
Strategy:
1. Parse rgb from the subject__additional JSON field.
2. Identify subjects with unique vs. duplicate rgb values.
3. Based on fallback_strategy:
   - 'default_color': keep the original rgb (even duplicates) and use default_color for missing values.
   - 'palette': keep only unique rgb values; duplicates and missing values get palette colors.
4. Return the dataframe with the new color column.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyDataFrame` | Input dataframe with subject observations | required |
| `subject_id_column` | `Annotated[str, Field(description="Column containing subject identifiers (e.g., 'groupby_col', 'subject__id')")]` | Column name containing subject identifiers | `'subject__id'` |
| `additional_column` | `Annotated[str, Field(description="Column containing subject additional data as JSON (e.g., 'subject__additional')")]` | Column name containing JSON with rgb data | `'subject__additional'` |
| `output_column` | `Annotated[str, Field(description='Name of the output column for assigned colors')]` | Name for the output color column | `'subject_color'` |
| `fallback_strategy` | `Annotated[Literal['default_color', 'palette'], Field(description="Strategy for subjects with missing or duplicate rgb values: 'default_color' keeps original rgb (even duplicates) and uses default_color for missing; 'palette' assigns palette colors to both duplicates and missing")]` | Strategy for handling duplicates and missing rgb values | `'default_color'` |
| `default_color` | `Annotated[str, AdvancedField(description="Hex color for subjects without rgb (used when fallback_strategy='default_color')", default='#FFFF00')]` | Hex color string for missing rgb (when fallback_strategy='default_color') | `'#FFFF00'` |
| `default_palette` | `Annotated[str, AdvancedField(description="Color palette for fallback colors (used when fallback_strategy='palette')", default='tab20')]` | Matplotlib palette name for fallback colors (when fallback_strategy='palette') | `'tab20'` |

Returns:
| Type | Description |
|---|---|
| `AnyDataFrame` | DataFrame with added color column |
Source code in ecoscope/platform/tasks/transformation/_subjects.py
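The two fallback strategies described for this task can be sketched with plain dictionaries. This is an illustration only, not the task's implementation: the `'r,g,b'` string format for the rgb field, the helper names, and the three hard-coded hex values standing in for a matplotlib palette are all assumptions.

```python
import json
from collections import Counter

DEFAULT_COLOR = "#FFFF00"
PALETTE = ["#1f77b4", "#aec7e8", "#ff7f0e"]  # placeholder fallback colors


def rgb_to_hex(rgb: str) -> str:
    """Convert an 'r,g,b' string (assumed format) to '#rrggbb'."""
    r, g, b = (int(part) for part in rgb.split(","))
    return f"#{r:02x}{g:02x}{b:02x}"


def assign_colors(additional_by_subject: dict[str, str], fallback_strategy: str) -> dict[str, str]:
    # Step 1: parse rgb from each subject's JSON blob (None when absent).
    rgb = {sid: json.loads(raw).get("rgb") for sid, raw in additional_by_subject.items()}
    # Step 2: count how often each rgb value occurs to spot duplicates.
    counts = Counter(v for v in rgb.values() if v)
    palette = iter(PALETTE)
    colors = {}
    for sid, value in rgb.items():
        if fallback_strategy == "default_color":
            # Keep original rgb, even duplicates; default only for missing.
            colors[sid] = rgb_to_hex(value) if value else DEFAULT_COLOR
        elif value and counts[value] == 1:
            # 'palette': unique rgb values are kept as-is.
            colors[sid] = rgb_to_hex(value)
        else:
            # 'palette': duplicates and missing values get palette colors.
            colors[sid] = next(palette)
    return colors


subjects = {
    "a": '{"rgb": "255,0,0"}',
    "b": '{"rgb": "255,0,0"}',  # duplicate of "a"
    "c": '{"rgb": "0,128,0"}',
    "d": "{}",                  # no rgb at all
}
by_default = assign_colors(subjects, "default_color")
by_palette = assign_colors(subjects, "palette")
```

Under `'default_color'` subjects "a" and "b" share the same red, while under `'palette'` both are given distinct fallback colors.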
assign_value ¶
assign_value(df: AnyDataFrame, column_name: Annotated[str, Field(description='The column name to map.')], value: Annotated[str | int | float | bool | SkipJsonSchema[None], Field(description='The column value.')], noop_if_column_exists: Annotated[bool, Field(description='If set to true and column_name exists on df, do nothing', default=False)] = False) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_mapping.py
classify_is_night ¶
classify_is_night(relocations: Annotated[AnyDataFrame, Field(description='The dataframe to classify.', exclude=True)]) -> AnyDataFrame
Classifies whether segments in a trajectory dataframe occur at night.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `relocations` | `DataFrame` | The input data. | required |

Returns:
| Type | Description |
|---|---|
| `AnyDataFrame` | The input dataframe with a |
Source code in ecoscope/platform/tasks/transformation/_classification.py
classify_seasons ¶
classify_seasons(trajectory: AnyDataFrame, season_windows: AnyDataFrame) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_classification.py
convert_column_values_to_numeric ¶
convert_column_values_to_numeric(df: AnyDataFrame, columns: Annotated[list[str], Field(description='The columns to convert.')]) -> AnyDataFrame
Casts the values of the listed columns to numbers. Values that cannot be cast are converted to NaN.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyDataFrame` | The input DataFrame. | required |
| `columns` | `list[str]` | List of columns to cast. | required |

Returns:
| Name | Type | Description |
|---|---|---|
| `AnyDataFrame` | `AnyDataFrame` | The modified DataFrame. |
Source code in ecoscope/platform/tasks/transformation/_conversion.py
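The "cannot be cast becomes NaN" behavior matches what pandas offers through `pd.to_numeric` with `errors="coerce"`. The sketch below shows that pandas call directly; whether the task uses exactly this call internally is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"speed": ["1.5", "2", "n/a"], "name": ["a", "b", "c"]})
columns = ["speed"]

# errors="coerce" turns unparseable values into NaN instead of raising.
for col in columns:
    df[col] = pd.to_numeric(df[col], errors="coerce")
```

Columns that are not listed, such as `name` here, are left untouched.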
convert_column_values_to_string ¶
convert_column_values_to_string(df: AnyDataFrame, columns: Annotated[list[str], Field(description='The columns to convert.')]) -> AnyDataFrame
Casts the values of the listed columns to type string. None and NaN values are also converted to strings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyDataFrame` | The input DataFrame. | required |
| `columns` | `list[str]` | List of columns to cast to string. | required |

Returns:
| Name | Type | Description |
|---|---|---|
| `AnyDataFrame` | `AnyDataFrame` | The modified DataFrame. |
Source code in ecoscope/platform/tasks/transformation/_conversion.py
convert_crs ¶
convert_crs(df: AnyGeoDataFrame, crs: CrsAnnotation = 'EPSG:4326') -> AnyGeoDataFrame
Re-project a GeoDataFrame's geometries to the given CRS.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyGeoDataFrame` | Input GeoDataFrame. Must have CRS metadata set. | required |
| `crs` | `CrsAnnotation` | Target CRS authority code (e.g. `EPSG:4326`). | `'EPSG:4326'` |

Returns:
| Type | Description |
|---|---|
| `AnyGeoDataFrame` | GeoDataFrame with geometries re-projected to `crs`. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If the input GeoDataFrame has no CRS metadata. |
Source code in ecoscope/platform/tasks/transformation/_crs.py
convert_values_to_timezone ¶
convert_values_to_timezone(df: AnyDataFrame, timezone: Annotated[str | tzinfo | TimezoneInfo, Field()], columns: Annotated[list[str], Field(description='The columns to convert.')], auto_detect: Annotated[bool, Field(description='Auto-detect all timezone-aware datetime columns to convert, ignoring the columns list.', exclude=True)] = False) -> AnyDataFrame
Converts the listed columns in the df to the timezone provided.
NOTE: Timezone-naive timestamps are ignored.
Args:
df (AnyDataFrame): The input DataFrame.
timezone (str | datetime.tzinfo | TimezoneInfo): The timezone to convert to.
columns (list[str]): List of columns to convert.
auto_detect (bool): If True, auto-detect all timezone-aware datetime columns, ignoring the columns list.
Returns:
| Name | Type | Description |
|---|---|---|
| `AnyDataFrame` | `AnyDataFrame` | The modified DataFrame. |
Source code in ecoscope/platform/tasks/transformation/_conversion.py
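The note about timezone-naive timestamps can be illustrated with the standard library alone. In this sketch `to_timezone` is a hypothetical helper, and the fixed UTC+3 offset is just an example target; the task itself operates on dataframe columns rather than individual values.

```python
from datetime import datetime, timedelta, timezone

EAT = timezone(timedelta(hours=3), "EAT")  # fixed UTC+3 offset for illustration


def to_timezone(value: datetime, tz: timezone) -> datetime:
    # Naive timestamps carry no offset, so they are passed through untouched.
    if value.tzinfo is None:
        return value
    return value.astimezone(tz)


aware = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
naive = datetime(2024, 1, 1, 12, 0)
converted = [to_timezone(v, EAT) for v in (aware, naive)]
```

The aware value moves from 12:00 UTC to 15:00 at UTC+3, while the naive value is returned exactly as it was given.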
drop_nan_values_by_column ¶
drop_nan_values_by_column(df: AnyDataFrame, column_name: Annotated[str, Field(description='The column to check')]) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_filtering.py
drop_null_geometry ¶
drop_null_geometry(gdf: AnyGeoDataFrame) -> AnyGeoDataFrame
explode ¶
explode(df: AnyDataFrame, column_name: Annotated[str, Field(description='The column name to explode.')], ignore_index: Annotated[bool, Field(description='Whether to ignore the index.')]) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_exploding.py
extract_column_as_type ¶
extract_column_as_type(df: Annotated[AnyDataFrame, Field(description='The dataframe.', exclude=True)], column_name: Annotated[str, Field(description='The column name to extract the value from.')], output_type: Annotated[FieldType, Field(description='The output type of the extracted value.')], output_column_name: Annotated[str, Field(description="The output column name to store the extracted value. If it's a pandas series, then the output_column_name will be the column prefix.")]) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_extract.py
extract_spatial_grouper_feature_group_names ¶
extract_spatial_grouper_feature_group_names(groupers: AllGrouper | UserDefinedGroupers) -> list[FeatureGroupId]
If there are spatial groupers, extract and return feature group names
Source code in ecoscope/platform/tasks/transformation/_indexing.py
extract_value_from_json_column ¶
extract_value_from_json_column(df: Annotated[AnyDataFrame, Field(description='The dataframe.', exclude=True)], column_name: Annotated[str, Field(description='The json column name to extract the value from.')], field_name_options: Annotated[list[str], Field(description='A list of field name options to extract the value from. The first field name that is found will be used.')], output_type: Annotated[FieldType, Field(description='The output type of the extracted value.')], output_column_name: Annotated[str, Field(description="The output column name to store the extracted value. If it's a pandas series, then the output_column_name will be the column prefix.")]) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_extract.py
fill_na ¶
fill_na(df: AnyDataFrame, value: Annotated[str | int | float | bool | SkipJsonSchema[None], Field(description='The value to fill.')], columns: Annotated[list[str] | SkipJsonSchema[None], Field(description='Provided columns will have nan values filled.')] = None) -> AnyDataFrame
Fill NA values with the input value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyDataFrame` | The input DataFrame. | required |
| `value` | `str \| int \| float \| bool \| None` | The value to fill NaN with. | required |
| `columns` | `list[str]` | If provided, fill these columns only. | `None` |

Returns:
| Name | Type | Description |
|---|---|---|
| `AnyDataFrame` | `AnyDataFrame` | The updated DataFrame. |
Source code in ecoscope/platform/tasks/transformation/_mapping.py
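The effect of the optional `columns` argument can be sketched with pandas `fillna`. The helper name `fill_na_sketch` is hypothetical, and returning a copy rather than modifying the input in place is an assumption about the task.

```python
import pandas as pd

df = pd.DataFrame({"count": [1.0, None], "distance": [None, 2.5]})


def fill_na_sketch(df: pd.DataFrame, value, columns=None) -> pd.DataFrame:
    if columns is None:
        # No column list: fill every NaN in the frame.
        return df.fillna(value)
    out = df.copy()
    # Column list given: fill only the named columns.
    out[columns] = out[columns].fillna(value)
    return out


only_count = fill_na_sketch(df, 0, columns=["count"])
everything = fill_na_sketch(df, 0)
```

With `columns=["count"]` the gap in `distance` is left as NaN; without it both columns are filled.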
filter_by_geometry_type ¶
filter_by_geometry_type(df: AnyGeoDataFrame, geometry_types: Annotated[list[str], Field(description="Shapely geometry type names to keep (e.g. ['Point'], ['Polygon', 'MultiPolygon']).")]) -> AnyGeoDataFrame
Filter a GeoDataFrame to rows whose geometry type is in geometry_types.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyGeoDataFrame` | Input GeoDataFrame. | required |
| `geometry_types` | `Annotated[list[str], Field(description="Shapely geometry type names to keep (e.g. ['Point'], ['Polygon', 'MultiPolygon']).")]` | List of shapely geometry type names to keep. | required |

Returns:
| Type | Description |
|---|---|
| `AnyGeoDataFrame` | GeoDataFrame containing only rows whose geometry type is in `geometry_types`. |
Source code in ecoscope/platform/tasks/transformation/_filter_by_geometry_type.py
filter_df ¶
filter_df(df: Annotated[AnyDataFrame, Field(description='The dataframe.', exclude=True)], column_name: Annotated[str, Field(description='The column name to filter on.')], op: Annotated[ComparisonOperator, Field(description='The comparison operator')], value: Annotated[str, Field(description='The comparison operand')], reset_index: Annotated[bool, Field(description='If reset index, default is False')] = False) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_filter.py
lookup_string_var ¶
lookup_string_var(var: Annotated[str, Field(...)], value_map: Annotated[dict[str, str], Field(default={}, description='A dictionary of values.')], raise_if_not_found: Annotated[bool, Field(description='Whether or not to raise if var is not in value_map.')] = True) -> str
Looks up var in value_map and returns the string mapped by var.
If raise_if_not_found is true, raises KeyError if var is not in value_map.
If raise_if_not_found is false, var is passed through unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `var` | `str` | The input var. | required |
| `value_map` | `dict[str, str]` | The map to look up var in. | required |
| `raise_if_not_found` | `bool` | Whether or not to raise in the event var is not in value_map. | `True` |

Returns:
| Name | Type | Description |
|---|---|---|
| `str` | `str` | The mapped value, or var unchanged if it is not found and raise_if_not_found is false. |

Raises:
KeyError: If var is not found in value_map and raise_if_not_found is true.
Source code in ecoscope/platform/tasks/transformation/_mapping.py
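The documented lookup behavior is small enough to restate as a few lines of plain Python. This is a sketch of the described semantics, not the task's source; the function name and the sample map are hypothetical.

```python
def lookup_string_var_sketch(var: str, value_map: dict[str, str], raise_if_not_found: bool = True) -> str:
    if var in value_map:
        return value_map[var]
    if raise_if_not_found:
        raise KeyError(var)
    # Not found and not raising: pass the input through unchanged.
    return var


units = {"km": "kilometers", "m": "meters"}
found = lookup_string_var_sketch("km", units)
passthrough = lookup_string_var_sketch("mi", units, raise_if_not_found=False)

try:
    lookup_string_var_sketch("mi", units)
    raised = False
except KeyError:
    raised = True
```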
map_columns ¶
map_columns(df: AnyDataFrame, drop_columns: Annotated[list[str] | SkipJsonSchema[None], AdvancedField(default=[], description='List of columns to drop.')] = None, retain_columns: Annotated[list[str] | SkipJsonSchema[None], AdvancedField(default=[], description='List of columns to retain with the order specified by the list.\n Keep all the columns if the list is empty.')] = None, rename_columns: Annotated[list[RenameColumn] | SkipJsonSchema[dict[str, str]] | SkipJsonSchema[None], AdvancedField(default={}, description='Dictionary of columns to rename.')] = None, raise_if_not_found: Annotated[bool, Field(description='Whether or not to raise if var is not in value_map.')] = True) -> AnyDataFrame
Maps and transforms the columns of a DataFrame based on the provided parameters. The order of the operations is as follows: drop columns, retain/reorder columns, and rename columns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyDataFrame` | The input DataFrame to be transformed. | required |
| `drop_columns` | `list[str]` | List of columns to drop from the DataFrame. | `None` |
| `retain_columns` | `list[str]` | List of columns to retain. The order of columns will be preserved. | `None` |
| `rename_columns` | `dict[str, str]` | Dictionary of columns to rename. | `None` |
| `raise_if_not_found` | `bool` | Whether or not to raise in the event a column is not found. | `True` |

Returns:
| Name | Type | Description |
|---|---|---|
| `AnyDataFrame` | `AnyDataFrame` | The transformed DataFrame. |

Raises:
| Type | Description |
|---|---|
| `KeyError` | If any of the columns specified are not found in the DataFrame. |
Source code in ecoscope/platform/tasks/transformation/_mapping.py
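The stated order of operations (drop, then retain/reorder, then rename) can be walked through with the equivalent pandas calls. This is a sketch under the assumption that each step maps onto one pandas operation; the column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"id": [1], "speed_kmhr": [4.2], "junk": ["x"], "name": ["A"]})

# 1. Drop columns.
step1 = df.drop(columns=["junk"])
# 2. Retain columns, in the order given by the list.
step2 = step1[["name", "speed_kmhr", "id"]]
# 3. Rename columns last.
result = step2.rename(columns={"speed_kmhr": "Speed (km/h)"})
```

Because renaming happens last, the drop and retain lists refer to the original column names, not the renamed ones.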
map_values ¶
map_values(df: AnyDataFrame, column_name: Annotated[str, Field(description='The column name to map.')], value_map: Annotated[dict[str, str], Field(default={}, description='A dictionary of values to map.')], missing_values: Annotated[Literal['preserve', 'remove', 'replace'], Field(default='remove', description="How to handle values that aren't in value_map.")], replacement: Annotated[str | SkipJsonSchema[None], Field(default=None, description='The replacement for values not in value_map.')] = None) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_mapping.py
map_values_with_unit ¶
map_values_with_unit(df: AnyDataFrame, input_column_name: Annotated[str, Field(description='The column name to map.')], output_column_name: Annotated[str, Field(description='The new column name.')], original_unit: Annotated[Unit | SkipJsonSchema[None], Field(description='The original unit of measurement.')] = None, new_unit: Annotated[Unit | SkipJsonSchema[None], Field(description='The unit to convert to.')] = None, decimal_places: Annotated[int, AdvancedField(default=1, description='The number of decimal places to display.')] = 1) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_mapping.py
normalize_json_column ¶
normalize_json_column(df: AnyDataFrame, column: Annotated[str, Field(description='The column name.')], skip_if_not_exists: Annotated[bool, AdvancedField(description='Skip if the column does not exist.', default=True)] = True, sort_columns: Annotated[bool, AdvancedField(description='Sort new columns alphabetically.', default=True)] = True) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_normalize.py
normalize_numeric_column ¶
normalize_numeric_column(df: AnyDataFrame, column: Annotated[str, Field(description='The column to normalize, values must be numeric.')], output_column_name: Annotated[str | None, Field(description='If provided, normalized values will be added as a new column.')]) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_normalize.py
reorder_columns ¶
reorder_columns(df: AnyDataFrame, columns: Annotated[list[str], Field(description='Provided column names will be first in the dataframe.')]) -> AnyDataFrame
Reorder columns in the provided dataframe to the order of the provided column names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyDataFrame` | The input DataFrame. | required |
| `columns` | `list[str]` | Provided column names will be first in the dataframe. | required |

Returns:
| Name | Type | Description |
|---|---|---|
| `AnyDataFrame` | `AnyDataFrame` | The updated DataFrame. |
Source code in ecoscope/platform/tasks/transformation/_mapping.py
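A minimal sketch of "provided column names will be first", working on a plain list of column names. The `reorder` helper is hypothetical, and keeping the remaining columns in their original relative order is an assumption not stated in the description.

```python
def reorder(all_columns: list[str], first: list[str]) -> list[str]:
    # Columns not named in `first` follow, in their original order (assumed).
    rest = [c for c in all_columns if c not in first]
    return list(first) + rest


columns = ["geometry", "speed", "id", "name"]
new_order = reorder(columns, ["id", "name"])
```

The resulting list could then be used to index a dataframe, for example `df[new_order]` in pandas.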
resolve_spatial_feature_groups_for_spatial_groupers ¶
resolve_spatial_feature_groups_for_spatial_groupers(groupers: AllGrouper | UserDefinedGroupers, spatial_feature_groups: list[RegionsGDF | EmptyDataFrame] | RegionsGDF | EmptyDataFrame) -> AllGrouper | UserDefinedGroupers
Resolves feature groups for SpatialGroupers, if necessary
Source code in ecoscope/platform/tasks/transformation/_indexing.py
sort_values ¶
sort_values(df: AnyDataFrame, column_name: Annotated[str, Field(description='The column name to sort values by.')], ascending: Annotated[bool, Field(description='Sort ascending if true')] = True, na_position: Annotated[Literal['first', 'last'], AdvancedField(description='Where to place NaN values in the sort', default='last')] = 'last') -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_sorting.py
strip_prefix_from_column_names ¶
strip_prefix_from_column_names(df: AnyDataFrame, prefix: Annotated[str, Field(description='The prefix to remove.')]) -> AnyDataFrame
Strip the provided prefix from column names that have it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyDataFrame` | The input DataFrame. | required |
| `prefix` | `str` | The prefix to remove from column names in this dataframe. | required |

Returns:
| Name | Type | Description |
|---|---|---|
| `AnyDataFrame` | `AnyDataFrame` | The updated DataFrame. |
Source code in ecoscope/platform/tasks/transformation/_mapping.py
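Stripping a prefix only from the names that carry it corresponds to Python's `str.removeprefix` (available from Python 3.9). The sketch below applies it to a list of made-up column names; how the task applies the result back to the dataframe is not shown here.

```python
columns = ["extra__speed", "extra__heading", "geometry"]

# removeprefix leaves names without the prefix unchanged.
renamed = [c.removeprefix("extra__") for c in columns]
```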
title_case_columns_by_prefix ¶
title_case_columns_by_prefix(df: AnyDataFrame, prefix: Annotated[str, Field(description='Column names prefixed with this value will be converted to title case.')]) -> AnyDataFrame
Convert the column names beginning with the provided prefix to title case.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `AnyDataFrame` | The input DataFrame. | required |
| `prefix` | `str` | Column names prefixed with this value will be converted to title case. | required |

Returns:
| Name | Type | Description |
|---|---|---|
| `AnyDataFrame` | `AnyDataFrame` | The updated DataFrame. |
Source code in ecoscope/platform/tasks/transformation/_mapping.py
transpose ¶
transpose(df: AnyDataFrame, transposed_column_name: Annotated[str | SkipJsonSchema[None], Field(description='If provided, the transposed index will be a column with this name')] = None) -> AnyDataFrame
Source code in ecoscope/platform/tasks/transformation/_transpose.py
with_unit ¶
with_unit(value: Annotated[float, Field(description='The original value.')], original_unit: Annotated[Unit | SkipJsonSchema[None], Field(description='The original unit of measurement.')] = None, new_unit: Annotated[Unit | SkipJsonSchema[None], Field(description='The unit to convert to.')] = None) -> Annotated[Quantity, Field(description='The value with an optional unit.')]