Aggregation Constraint
Overview
Aggregation Constraints in Verteego enforce conditions on aggregated values of variables across specified groups in your data. They are used to control totals, averages, or other statistical measures along one or more dimensions of a dataset, so that these values comply with business rules or operational limits.
Applications
Vehicle Inventory Management: The sum of vehicles in each category must not exceed the total available from suppliers.
Rental Operations: Total rental days per month must stay within a predefined range of minimum and maximum days.
Parameters
keys: Optional. A list of column names that define the grouping for aggregation. If omitted, all selected rows are aggregated into a single value.
Example: Using `keys=["car_model", "fuel_type"]` aggregates values separately for each combination of car model and fuel type.
where: Optional. A dictionary of key-value pairs to filter rows before applying the constraint.
Example: `where={"month": [1, 2, 3], "year": 2022}` limits aggregation to the first quarter of 2022.
left_method: Specifies the aggregation method for the left side (e.g., `mean`, `min`, `max`, `np.sum`). See the Pandas DataFrame aggregation documentation for more methods.
left_column: Name of one or more columns to aggregate. When multiple columns are given, the constraint applies to the sum of their aggregated values.
left_column_weight: Optional. A list of weights to apply to the aggregated columns.
left_where: Optional. Filters rows for aggregation on the left side only.
Example: `left_where={"bicycle_type": ["push_bike", "tandem"]}` aggregates only rows concerning push bikes and tandem bikes.
operator: Specifies the type of constraint (e.g., `equal`, `lesser`, `greater`, `between`) between the aggregated results on the left and right sides.
right_method, right_column, right_column_weight, right_where: These parameters mirror those on the left side but apply to the right side of the equation.
relax: Indicates whether this constraint can be relaxed with a penalty. Defaults to `false`.
keep_duplicates: Determines whether to keep multiple rows that might refer to the same variables. Defaults to `false`.
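The grouping and filtering parameters can be combined in a single constraint. The following is a minimal sketch rather than a confirmed configuration: the constraint name and the columns rental_days and max_rental_days are hypothetical, and the nesting of where is assumed to follow the same layout as left_where in the Examples section.

```yaml
# Hypothetical constraint: the name and column names are illustrative assumptions.
rental_days_first_quarter:
  constraint_type: aggregation
  # One aggregated value per (car_model, fuel_type) combination
  keys:
    - car_model
    - fuel_type
  # Only rows from the first quarter of 2022 are considered
  where:
    month:
      - 1
      - 2
      - 3
    year: 2022
  left_column: rental_days
  left_method: mean
  operator: lesser
  right_column: max_rental_days
  right_method: max
  # Allow the constraint to be relaxed with a penalty
  relax: true
```

Under these assumptions, the mean of rental_days in each group is compared against the largest max_rental_days in the same group. This page does not state whether lesser is a strict comparison.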
Examples
Comparing Rental Prices for Different Bicycle Types
```yaml
constraint_electric_versus_manual_bicycles:
  constraint_type: aggregation
  keys:
    - bicycle_type
  left_column: rental_price
  left_method: max
  left_where:
    bicycle_type:
      - "push_bike"
  operator: lesser
  right_column: rental_price
  right_column_weight:
    - 0.9
  right_method: min
  right_where:
    bicycle_type:
      - "e_bike"
```
Checking Hotel Occupancy Limits
```yaml
hotel_occupancy_checker:
  constraint_type: aggregation
  keys:
    - hotel_category
  left_column: hotel_occupancy
  left_method: mean
  operator: between
  right_column:
    - min_avg_occupancy_per_category
    - max_avg_occupancy_per_category
  right_method: min
```
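Weighting Multiple Columns
According to the parameter descriptions, listing several columns in left_column constrains the sum of their aggregated values, and left_column_weight supplies weights for those columns. The sketch below illustrates that combination. The constraint name and the columns weekday_rental_price, weekend_rental_price, and price_cap are hypothetical, and it assumes weights are matched to columns by position, which this page does not confirm.

```yaml
# Hypothetical constraint: names and weights are illustrative assumptions.
blended_price_cap:
  constraint_type: aggregation
  keys:
    - bicycle_type
  left_column:
    - weekday_rental_price
    - weekend_rental_price
  # Assumed to pair with left_column entries by position
  left_column_weight:
    - 0.7
    - 0.3
  left_method: mean
  operator: lesser
  right_column: price_cap
  right_method: min
```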