Aggregation Constraint
Overview
Aggregation Constraints in Verteego enforce conditions on aggregated values of variables across specified groups within your data. They keep totals, averages, or other statistical measures across the dimensions of a dataset within specified business rules or operational limits.
Applications
Vehicle Inventory Management: The sum of vehicles in each category must not exceed the total available from suppliers.
Rental Operations: Total rental days per month must stay within a predefined range of minimum and maximum days.
Parameters
keys: Optional. A list of column names that define the grouping for aggregation. If omitted, all selected rows are aggregated into a single value.
Example: Using keys=["car_model", "fuel_type"] aggregates values separately for each combination of car model and fuel type.
where: Optional. A dictionary of key-value pairs to filter rows before applying the constraint.
Example: where={"month": [1, 2, 3], "year": 2022} limits aggregation to the first quarter of 2022.
left_method: Specifies the aggregation method for the left side (e.g., mean, min, max, np.sum). See the Pandas DataFrame aggregation documentation for more methods.
left_column: Names of one or more columns to aggregate. Multiple columns result in the sum of aggregated values being constrained.
left_column_weight: Optional. A list of weights to apply to each aggregated column.
left_where: Optional. Filters rows for aggregation on the left side only.
Example: left_where={"bicycle_type": ["push_bike", "tandem"]} aggregates only rows concerning push bikes and tandem bikes.
operator: Specifies the type of constraint (e.g., equal, lesser, greater, between) between the aggregated results on the left and right sides.
right_method, right_column, right_column_weight, right_where: These parameters mirror those on the left side but apply to the right side of the equation.
relax: Indicates whether this constraint can be relaxed with a penalty. Defaults to false.
keep_duplicates: Determines whether to keep multiple rows that might refer to the same variables. Defaults to false.
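The way where, keys, the aggregation methods, and operator combine can be illustrated with plain pandas. This is only a sketch of the semantics described above, not Verteego's implementation: the helper functions filter_rows and aggregate, the sample data, and its column names (rental_days, max_days) are hypothetical, and reading the lesser operator as "less than or equal" is an assumption.

```python
import pandas as pd


def filter_rows(df, where):
    """Keep rows matching every entry of `where`; a list value means 'any of'."""
    if not where:
        return df
    mask = pd.Series(True, index=df.index)
    for column, allowed in where.items():
        values = allowed if isinstance(allowed, list) else [allowed]
        mask &= df[column].isin(values)
    return df[mask]


def aggregate(df, keys, column, method):
    """Aggregate `column` per group of `keys`, or over all rows when keys is empty."""
    if keys:
        return df.groupby(keys)[column].agg(method)
    return pd.Series({"all": df[column].agg(method)})


# Hypothetical sample data, for illustration only.
rentals = pd.DataFrame({
    "year": [2022, 2022, 2022, 2022, 2021],
    "month": [1, 2, 4, 3, 1],
    "car_model": ["A", "A", "A", "B", "A"],
    "rental_days": [10, 12, 30, 7, 99],
    "max_days": [15, 15, 15, 10, 15],
})

# where: restrict to the first quarter of 2022 before aggregating.
q1 = filter_rows(rentals, {"month": [1, 2, 3], "year": 2022})

# keys: one aggregated value per car model on each side.
left = aggregate(q1, ["car_model"], "rental_days", "sum")
right = aggregate(q1, ["car_model"], "max_days", "sum")

# operator "lesser", assumed here to mean left <= right for every group.
satisfied = left <= right
print(satisfied)
```

With an empty keys list, the same helpers collapse all selected rows into a single value, matching the behavior described for an omitted keys parameter.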
Examples
Comparing Rental Prices for Different Bicycle Types
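As a rough illustration of this kind of comparison, the pandas sketch below aggregates one subset of bicycle types on the left side and another on the right, in the spirit of left_where and right_where. It is not a Verteego configuration: the data, the rental_price column, the e_bike and cargo_bike types, and the choice of mean with a "less than or equal" comparison are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical price table, for illustration only.
prices = pd.DataFrame({
    "bicycle_type": ["push_bike", "tandem", "e_bike", "cargo_bike"],
    "rental_price": [10.0, 18.0, 25.0, 30.0],
})

# Left side: rows selected as left_where={"bicycle_type": ["push_bike", "tandem"]} would.
left_rows = prices[prices["bicycle_type"].isin(["push_bike", "tandem"])]

# Right side: a different subset, as a right_where filter would select.
right_rows = prices[prices["bicycle_type"].isin(["e_bike", "cargo_bike"])]

# Aggregate each side with the same method (mean) and compare.
left_value = left_rows["rental_price"].agg("mean")
right_value = right_rows["rental_price"].agg("mean")

# Assumed reading of a "lesser" constraint: left side must not exceed right side.
constraint_holds = left_value <= right_value
print(left_value, right_value, constraint_holds)
```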
Checking Hotel Occupancy Limits