Gluon Attainment Forecast Model
Please note: Specific program names, internal tool references, and other potentially sensitive details mentioned in the original project document have been generalized or masked in this summary for confidentiality.
Context and Business Problem
Effective forecasting is crucial for optimizing logistics networks. The S&OP team is responsible for generating the total attainable volume forecast, a key input for short-to-mid-term planning horizons. This forecast, produced weekly for a multi-week outlook, traditionally utilized the standard planning methodology, which employs a top-down approach: network-level inputs are broken down into station-level forecasts using metrics such as regional demand share and attainment, typically relying on simple multi-week averaging models.
However, the standard logic required adaptation to meet specific business needs, such as aligning with package-level, carrier-share-aware asks from transportation partners rather than just fulfillment center ship plans. The existing system lacked an intrinsic mechanism to account for carrier share dynamics. Furthermore, the simple averaging models used for key metrics like volume share and attainment were identified as candidates for improvement to boost station-level forecast accuracy. While an enhanced volume share model was developed, a more sophisticated solution was needed for attainment forecasting.
Attainment, in this context, represents the percentage of volume a 1P carrier wins relative to the total demand directed towards its jurisdiction, considering factors like capacity constraints. Station-level attainment varies significantly due to capacity, network connections, delivery speed promises, and critically, cost considerations managed by transportation procurement partners. Transportation procurement actively manages carrier allocation (e.g., between the internal network and third-party carriers) to optimize for speed, cost, and service. Levers such as cost offsets are used to influence volume allocation and maintain a healthy balance. These adjustments affect both overall network volume performance and station-level attainment. Since the primary guidance signal is aggregated at the network level, the station-specific impact manifests in the attainment percentage. Stations react differently to these levers based on their unique topology, geographic service areas, and third-party carrier presence. This complex interplay highlighted the need for an advanced attainment forecast model capable of incorporating these dynamic factors.
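As a toy illustration of the definition above (the function name and figures are hypothetical, not from the project), attainment is simply the ratio of volume won by the 1P carrier to the total demand directed to its jurisdiction:

```python
def attainment(first_party_volume: float, total_demand: float) -> float:
    """Share of demand directed to a station's jurisdiction that the
    1P carrier actually wins (illustrative definition only)."""
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    return first_party_volume / total_demand

# e.g., a station winning 8,500 of 10,000 directed packages
print(f"{attainment(8_500, 10_000):.1%}")  # 85.0%
```

Cost offsets and capacity constraints shift this ratio week over week, which is exactly the station-level signal the model is asked to forecast.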
The Custom Attainment Forecast Model
To address the limitations of existing methods, the custom attainment forecast model was developed. The primary goal was to create a more sophisticated, feature-rich model capable of understanding and predicting the impact of factors like cost offsets on station-level attainment. Unlike the standard univariate time-series models, this new model needed to ingest 'known covariates' – specifically, historical and planned cost offset data provided by transportation partners.
Several modeling approaches were considered, including regression and tree-based methods. Ultimately, an automated machine learning (AutoML) library was selected due to its strong performance, scalability, relatively quick development time, user-friendly API, and available support. This choice allowed the team to leverage a powerful open-source library and focus effort on feature engineering and data quality rather than building complex models from scratch, which ultimately produced larger accuracy gains.
The new model was designed to meet several key requirements:
- Time-series based: Ensuring forecasts rely on recent actuals.
- Standardized Input: Using official actual attainment data from the standard planning system.
- Feature Ingestion: Incorporating known covariates (cost offsets) and categorical features (like region).
- Stability & Explainability: Providing consistent and understandable outputs.
- Responsiveness: Adapting quickly to recent data trends.
- Efficiency: Enabling timely weekly forecast generation.
- Flexibility & Control: Allowing planner overrides and providing quantile-based outputs for adjustments.
- Performance Tracking: Mechanisms to monitor accuracy and variance.
- Maintainability: Ongoing support and adaptability for updates.
The model has undergone continuous enhancements: adapting to flexible data cut-offs, supporting new negative offset ranges, accommodating changes in offset application (e.g., geographic differentiation), incorporating advanced neural network models, and extending the forecast horizon.
Model Design and Architecture
The custom model operates within a cloud environment using cloud storage for data and a machine learning platform for computation. The overall architecture follows a structured data flow. Inputs are sourced from cloud storage, processed within the machine learning environment, and the resulting forecasts and logs are written back to cloud storage.
Key inputs stored in cloud storage buckets include static files, like region mappings that add geographic context, and dynamic files updated regularly. The core dynamic inputs are the actual attainment data (pulled from operational performance dashboards), a historical cost offset ledger (mapping past offsets to attainment data), the forward-looking cost offset plan provided by transportation partners, station-level statistics tables, and lists of network-wide outlier dates. These varied inputs provide the raw data needed for feature creation and model training, allowing the model to learn historical relationships and project future impacts based on planned operational levers like cost offsets.
Data Preprocessing and Training
Within the machine learning environment, the process begins with significant data preprocessing. A dedicated function combines the various input files (actual attainment, offsets, region mapping, etc.) and augments the dataset with explicit time-series features (like week number, day of week), geographic information, and relevant offset data columns.
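The feature-augmentation step described above can be sketched as follows. This is a minimal stdlib-only illustration, not the project's actual function; the column names, station codes, and region mapping are hypothetical:

```python
from datetime import date

# Hypothetical station -> region mapping (stands in for the static
# region-mapping file pulled from cloud storage).
region_map = {"DXX1": "EAST", "DYY2": "WEST"}

def add_features(row: dict) -> dict:
    """Augment one attainment record with calendar and geographic features."""
    d = date.fromisoformat(row["date"])
    return {
        **row,
        "region": region_map.get(row["station"], "UNKNOWN"),
        "week_number": d.isocalendar()[1],  # ISO week of year
        "day_of_week": d.weekday(),         # 0 = Monday
    }

rows = [{"station": "DXX1", "date": "2024-03-04",
         "attainment": 0.91, "offset": -0.02}]
features = [add_features(r) for r in rows]
print(features[0]["region"], features[0]["week_number"], features[0]["day_of_week"])
```

In the real pipeline the offset columns come from the historical ledger and the forward-looking plan, so the same record would also carry past and planned cost offsets as known covariates.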
Following feature creation, a crucial data cleaning and outlier handling step occurs. This involves removing stations with insufficient historical data (e.g., newly launched or closed stations) to ensure consistent time-series shapes required by the model. Outlier handling uses multiple methods. Network-wide outliers, often occurring around holidays or specific event dates, are managed using a maintained list of known outlier dates. Station-specific outliers, potentially caused by local operational disruptions, are identified statistically using historical means and standard deviations calculated over a long period (e.g., 70+ weeks from the station stats tables). Identified outliers are not simply removed but are imputed using values derived from preceding weeks for the same day-of-week, handled by a specific outlier imputation function. This ensures data continuity and prevents the model from learning from erroneous points. Metadata from this preprocessing (like imputed values or removed stations) is logged for auditing purposes.
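A simplified version of the statistical outlier handling might look like the sketch below. Assumptions to note: the threshold `k`, the lookback of 4 weeks, and the function name are illustrative, and the series is taken to be weekly (one observation per week for a given day-of-week), so "preceding weeks for the same day-of-week" reduces to the preceding entries of the series:

```python
from statistics import mean, stdev

def impute_outliers(series, k=3.0, lookback=4):
    """Flag points beyond mean +/- k*std (computed over the full history,
    standing in for the 70+ week station stats) and replace each with the
    mean of the preceding `lookback` weekly observations. Returns the
    cleaned series plus a metadata log for auditing."""
    mu, sigma = mean(series), stdev(series)
    cleaned, log = list(series), []
    for i, v in enumerate(series):
        if sigma > 0 and abs(v - mu) > k * sigma:
            window = cleaned[max(0, i - lookback):i]
            if window:
                cleaned[i] = mean(window)
                log.append({"index": i, "original": v, "imputed": cleaned[i]})
    return cleaned, log

# A local disruption in week 5 of an otherwise stable station:
weekly = [0.90, 0.91, 0.89, 0.92, 0.30, 0.90, 0.91]
cleaned, log = impute_outliers(weekly, k=2.0)
print(cleaned[4], log)
```

As in the real pipeline, the imputation metadata is returned rather than discarded, so the values that were replaced remain traceable.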
Once the data is cleaned and prepared, the model training process involves two distinct stages, often executed sequentially. First, a back-test is performed to evaluate performance on the most recent week. The input dataset is temporally split (e.g., using a "W-2 Cutoff") to mimic the data available at the previous week's planning cycle. The model is retrained on this subset, and its prediction for the target week is compared against actuals and other benchmarks (like the previous forecast and the baseline model) using error metrics. This provides a reliable measure of the model's effectiveness under realistic conditions.
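The temporal split behind the back-test can be sketched as below. The naive carry-forward "model" here is only a stand-in for the retrained AutoML model, and the data and cutoff choice are illustrative:

```python
def temporal_split(weekly_rows, holdout_weeks=1):
    """Hold out the most recent week(s) to mimic the data available at
    the previous planning cycle (a 'W-2 cutoff' style split).
    weekly_rows: list of (week, value) sorted ascending by week."""
    return weekly_rows[:-holdout_weeks], weekly_rows[-holdout_weeks:]

def mape(actuals, forecasts):
    """Mean absolute percentage error, as a fraction."""
    return sum(abs((a - f) / a) for a, f in zip(actuals, forecasts)) / len(actuals)

history = list(enumerate([0.90, 0.92, 0.91, 0.89, 0.93], start=1))
train, test = temporal_split(history)
# Naive stand-in for the retrained model: carry the last training value forward.
forecast = [train[-1][1]] * len(test)
score = mape([v for _, v in test], forecast)
print(f"back-test MAPE: {score:.2%}")
```

In practice the same MAPE computation is applied to the custom model, the previous forecast, and the baseline, so the weekly back-test doubles as a benchmark comparison.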
Second, the final forecast is generated using the complete, up-to-date dataset (the "Full-dataset"). The model is trained on all available historical data and predicts attainment for the required future horizon. Outputs from both stages, including the generated forecasts (potentially from different model types like an ensemble 'default' and a 'DirectTabular' variant) and detailed logs (input metadata, backtest performance, final model weights/validation scores), are stored back into designated cloud storage buckets for transparency, downstream use, and analysis.
Initially, an ensemble model (the AutoML library's default) was the primary output, which performs well under stable offset conditions. However, observing that this ensemble could overweight pure time-series models less sensitive to offsets, a second model, often a gradient boosting machine (GBM) referred to as 'DirectTabular', was introduced. This second model type demonstrates better responsiveness to significant week-over-week changes in offset plans, which became more common. Providing both allows for selecting the most appropriate forecast based on current conditions.
Program Management and Validation
The successful implementation of the custom model extends beyond the algorithm itself, involving rigorous program management. The model is run weekly with the latest data and offset plans. A critical focus is maintaining input data quality ("Garbage-in, Garbage-out"). Managing the offset information, often tracked semi-manually via collaboration documents and subject to offline updates or delays, requires diligence. Similarly, ensuring the accuracy of the actual attainment data feed from data dashboards is vital, with validation steps like the weekly back-test proving crucial for catching data pipeline issues. Detailed logs and notes are maintained for transparency and tracking.
Output validation is equally important. Each week, subject matter experts review the outputs from both model types, comparing them against recent actuals and considering contextual factors like overall demand trends or known operational events. Based on whether offset conditions are stable or volatile, and considering recent performance nuances (e.g., elevated attainment due to unexpectedly light demand), a recommendation is made regarding which model and which quantile (e.g., mean/P50, P40) to use for the official plan. This human-in-the-loop approach allows for nuanced adjustments that a fully automated system might miss.
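The human-in-the-loop recommendation step could be caricatured by a rule of thumb like the one below. To be clear, this selection rule, its inputs, and the numbers are entirely hypothetical; the actual decision is expert judgment, not a fixed formula:

```python
def recommend(quantile_forecasts, offsets_volatile, demand_ran_light):
    """Toy selection heuristic: prefer the offset-responsive model when
    offset plans are volatile, and a lower quantile when recent demand
    ran light (which tends to inflate observed attainment)."""
    model = "DirectTabular" if offsets_volatile else "default"
    quantile = "P40" if demand_ran_light else "P50"
    return model, quantile, quantile_forecasts[model][quantile]

forecasts = {
    "default":       {"P40": 0.88, "P50": 0.90},
    "DirectTabular": {"P40": 0.86, "P50": 0.89},
}
print(recommend(forecasts, offsets_volatile=True, demand_ran_light=True))
```

The value of the quantile-based output is precisely that such adjustments are a lookup rather than a manual re-forecast.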
The chosen forecast is then formatted via scripts for planner consumption, facilitating the application of potentially complex, week-specific overrides across numerous stations, reducing manual effort and error risk. Notably, the custom forecast is provided alongside the default forecast from the standard methodology, rather than replacing it directly in the system. This preserves the baseline forecast as a consistent benchmark for ongoing performance evaluation. The pipeline is intentionally not fully automated to mitigate risks associated with potential input data issues and allow for expert judgment.
Supporting Analytics
A key challenge with machine learning models is their "black box" nature, which can hinder trust and make troubleshooting difficult. To address this, supplemental tools were developed to enhance explainability and support deep dives. A dedicated analysis tool provides dense, station-level visualizations, plotting many weeks of historical attainment against cost offsets, overlaid with the current custom forecast (including multiple quantiles). This allows planners to visually validate if the forecast aligns with historical patterns under similar offset conditions and assess trends like year-over-year changes. It also integrates performance calculations, comparing the custom model against the baseline.
Results and Impact
Since its rollout, the custom attainment forecast model has delivered measurable improvements in forecast accuracy. Compared to the baseline model from the standard methodology, the custom model showed significant improvement in average forecast error metrics (specifically, Mean Absolute Percentage Error or MAPE), reducing W-3 MAPE by 48 basis points (e.g., 3.17% vs 3.65% for the baseline) and W-1 MAPE by 20 basis points (e.g., 3.18% vs 3.38% for the baseline). This enhanced accuracy is primarily attributed to the model's ability to correctly capture the station-level impact of cost offset changes. Beyond accuracy gains, the model and associated tooling have streamlined the planning process, reducing the manual effort required for applying overrides through the flexible quantile adjustment system. The model represents a significant step forward in providing more accurate, responsive, and insightful attainment forecasts for network planning.
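The reported gains can be sanity-checked directly from the figures above: a basis point is 0.01 percentage points, so the improvement is 100 times the difference between baseline and model MAPE (both expressed in percent):

```python
def improvement_bps(baseline_mape_pct: float, model_mape_pct: float) -> int:
    """Improvement in basis points between two MAPE values given in percent."""
    return round((baseline_mape_pct - model_mape_pct) * 100)

print(improvement_bps(3.65, 3.17))  # W-3 horizon: 48
print(improvement_bps(3.38, 3.18))  # W-1 horizon: 20
```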