ecoscope.analysis.classifier
============================

.. py:module:: ecoscope.analysis.classifier


Module Contents
---------------

.. py:type:: ColorValue
   :canonical: str | float


.. py:type:: HexColor
   :canonical: str


.. py:data:: classification_methods

.. py:function:: apply_classification(dataframe, input_column_name, output_column_name = None, labels = None, scheme = 'natural_breaks', label_prefix = '', label_suffix = '', label_ranges = False, label_decimals = 1, **kwargs)

   Classifies the data in a DataFrame column using the specified classification scheme.

   Args:
       dataframe (pd.DataFrame): The data.
       input_column_name (str): The dataframe column to classify.
       output_column_name (str): The dataframe column that will contain the classification.
           Defaults to "_classified"
       labels (list[str]): Labels of the bins. Bin edges are used if labels is None.
       scheme (str): Classification scheme to use
           [equal_interval, natural_breaks, quantile, std_mean, max_breaks, fisher_jenks]
       label_prefix (str): Prepends the provided string to each label.
       label_suffix (str): Appends the provided string to each label.
       label_ranges (bool): Applicable only when 'labels' is not set.
           If True, generated labels will be the range between bin edges,
           rather than the bin edges themselves.
       label_decimals (int): Applicable only when 'labels' is not set.
           Specifies the number of decimal places in the label.
       \*\*kwargs: Additional keyword arguments specific to the classification scheme,
           passed to mapclassify. See below.

   Applicable to equal_interval, natural_breaks, quantile, max_breaks & fisher_jenks:
       k (int): The number of classes required.

   Applicable only to natural_breaks:
       initial (int): The number of initial solutions generated with different centroids.
           The best of the initial results is returned.

   Applicable only to max_breaks:
       mindiff (float): The minimum difference between class breaks.

   Applicable only to std_mean:
       multiples (numpy.array): The multiples of the standard deviation to add/subtract
           from the sample mean to define the bins.
       anchor (bool): Anchor upper bound of one class to the sample mean.

   For more information, see https://pysal.org/mapclassify/api.html

   Returns:
       The input dataframe with a classification column appended.


.. py:function:: apply_color_map(dataframe, input_column_name, cmap, output_column_name = None)

   Creates a new column on the provided dataframe with the given cmap applied over the specified input column.

   Args:
       dataframe (pd.DataFrame): The data.
       input_column_name (str): The dataframe column whose values will inform the cmap values.
       cmap (str, list, dict): Either a named mpl.colormap, a list of string hex values,
           or a dict mapping values to hex color strings. When a dict is provided, each key
           is a data value and each value is a hex color string
           (e.g. {"stop": "#FF0000", "go": "#00FF00"}). Data values not present in the dict
           are set as fully transparent.
       output_column_name (str): The dataframe column that will contain the classification.
           Defaults to "_colormap"

   Returns:
       The input dataframe with a color map appended.


.. py:function:: classify_percentile(df, percentile_levels, input_column_name, output_column_name = 'percentile')

   Creates a new column on the provided dataframe with the percentile bin of the input_column.

   Uses much the same methodology as `get_percentile_area` but applies generally
   to a numeric dataframe column instead of a raster grid.

   Args:
       df (pd.DataFrame | gpd.GeoDataFrame): The data.
       percentile_levels (list[int]): List of k-th percentile scores.
       input_column_name (str): The column to apply classification to.
       output_column_name (str): The dataframe column that will contain the classification.
           Defaults to "percentile"

   Returns:
       The input dataframe with percentile classification appended.
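
The dict form of ``cmap`` accepted by ``apply_color_map`` can be illustrated with a minimal pure-Python sketch. This is not ecoscope's implementation. The helper names ``hex_to_rgba`` and ``map_colors`` are hypothetical, and the ``(r, g, b, a)`` tuple output is an assumption made for illustration. The sketch only demonstrates the documented rule that data values missing from the dict are set as fully transparent.

```python
# Illustrative sketch only: hex_to_rgba and map_colors are hypothetical helpers,
# not part of ecoscope. The RGBA tuple output format is an assumption.

TRANSPARENT = (0, 0, 0, 0)  # alpha of 0 means fully transparent


def hex_to_rgba(hex_color: str) -> tuple:
    """Convert '#RRGGBB' or '#RRGGBBAA' into an (r, g, b, a) tuple of 0-255 ints."""
    digits = hex_color.lstrip("#")
    if len(digits) == 6:
        digits += "FF"  # no alpha given, so treat the color as fully opaque
    if len(digits) != 8:
        raise ValueError(f"Unsupported hex color: {hex_color!r}")
    return tuple(int(digits[i:i + 2], 16) for i in range(0, 8, 2))


def map_colors(values, cmap: dict) -> list:
    """Look up each value in cmap; values without an entry become transparent."""
    return [hex_to_rgba(cmap[v]) if v in cmap else TRANSPARENT for v in values]


cmap = {"stop": "#FF0000", "go": "#00FF00"}
colors = map_colors(["stop", "go", "wait"], cmap)
```

Here ``"wait"`` has no entry in the dict, so it maps to ``TRANSPARENT`` instead of raising an error, mirroring the behaviour described for the dict form of ``cmap``.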