June 17, 2025
Coursera, IBM
Fall 2023
- Data Analysis & Feature Engineering: Conducted extensive exploratory data analysis (EDA) to uncover trends, outliers, and correlations in customer usage, service bundling, and revenue across countries. Identified key variables driving subscription behavior and revenue performance.
- Model Development & Evaluation: Designed and compared multiple machine learning models—including Random Forest, Support Vector Machine, and Gaussian Process regressors—to predict future revenue. Iteratively tuned hyperparameters and evaluated performance using standard metrics.
- API Development & ML Ops: Built a modular FastAPI microservice to wrap trained models, supporting endpoints for training, forecasting, performance evaluation, and monitoring. Followed test-driven development (TDD) to ensure robustness.
- Containerization & Deployment: Containerized the entire pipeline (data, model, API, and tests) using Docker for reproducibility and portability. Designed the architecture for scalable deployment in production environments.
- Business Impact Analysis: Created post-production monitoring scripts to evaluate model drift and ensure alignment with key business KPIs like forecast accuracy and service usage patterns.
The capstone of the IBM AI Enterprise Workflow specialization is based on the business problem of a fictional company named AAVAIL. The case study encompasses all stages of the AI enterprise workflow. At the end of the design thinking process, the end user statement was:
"Our customers will be able to subscribe and pay for AAVAIL services by combining any services they desire and being charged only for those services."
The capstone was composed of three parts:
1. Data Investigation: The process involves understanding the business scenario, identifying the required data and creating a script to extract relevant data from various sources. Then, the data is analyzed to investigate the relationship between the relevant data, the target and the business metric, and finally, the findings are communicated using visualizations.
2. Model Iteration: To address the business opportunity, several modeling approaches are proposed and compared. The selected approach is then iteratively refined and retrained on all the data, and the findings are summarized in a report.
3. Model Production: To put the model in production, a draft version of an API with train, predict, and logfile endpoints is built; the API, model, and unit tests are bundled using Docker; and the API is iteratively improved using test-driven development. A post-production analysis script is then created to investigate the relationship between model performance and the business metric, and the findings are summarized in a final report.
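The post-production analysis described in part 3 can be pictured with a small sketch. It is illustrative only: it assumes mean absolute percentage error (MAPE) as the performance measure and a hypothetical tolerance threshold, and the function names `mape` and `drift_report` are made up for this example rather than taken from the project.

```python
from statistics import mean


def mape(actual, predicted):
    """Mean absolute percentage error, skipping days with zero actual revenue."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("need at least one non-zero actual value")
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in pairs)


def drift_report(baseline, recent, tolerance=5.0):
    """Compare recent forecast error with the error seen at validation time.

    baseline and recent are (actual, predicted) tuples of equal-length lists.
    tolerance is a hypothetical threshold in percentage points.
    """
    baseline_error = mape(*baseline)
    recent_error = mape(*recent)
    return {
        "baseline_mape": baseline_error,
        "recent_mape": recent_error,
        # Flag drift when the error has grown by more than the tolerance.
        "drift": recent_error - baseline_error > tolerance,
    }


# Made-up revenue figures, purely for illustration.
validation = ([100.0, 200.0], [110.0, 190.0])
production = ([100.0, 200.0], [130.0, 150.0])
report = drift_report(validation, production)
```

A check like this ties a model metric to a business-facing number (forecast accuracy), so a drop in accuracy can trigger retraining before it affects revenue planning.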
1. Data Investigation: Results are in the EDA notebook, which shows, for example, the revenue time series per country and the variable correlations.
2. Model Iteration: Several models are explored, including random forest, support vector machine, and Gaussian process regressors.
3. Model Production: The model is wrapped in a FastAPI service with five endpoints:
- POST /api/v1/model/train/: Train a series of models.
- GET /api/v1/model/train/: Run performance metrics on a trained model.
- GET /api/v1/model/forecast_date/: Forecast revenue 30 days past a chosen date given a trained model.
- GET /api/v1/model/forecast_range/: Forecast revenue 30 days past each date in a chosen date range, given a trained model.
- GET /api/v1/model/monitor/: Run monitoring performance metrics given current training data.
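As a rough illustration of how a client might address the two forecast endpoints, the sketch below builds request URLs with the standard library. The paths come from the list above; the base URL, the query parameter names (`country`, `date`, `start_date`, `end_date`), the helper names, and the example values are all assumptions made for this example, since the actual request schema is not documented here.

```python
from datetime import date
from urllib.parse import urlencode, urljoin

# Paths as listed above. Everything else (base URL, parameter names) is assumed.
FORECAST_DATE_PATH = "/api/v1/model/forecast_date/"
FORECAST_RANGE_PATH = "/api/v1/model/forecast_range/"


def forecast_date_url(base_url, country, day):
    """Build a URL for a single-date forecast request (hypothetical parameters)."""
    query = urlencode({"country": country, "date": day.isoformat()})
    return urljoin(base_url, FORECAST_DATE_PATH) + "?" + query


def forecast_range_url(base_url, country, start, end):
    """Build a URL for a date-range forecast request (hypothetical parameters)."""
    if end < start:
        raise ValueError("end must not precede start")
    query = urlencode(
        {
            "country": country,
            "start_date": start.isoformat(),
            "end_date": end.isoformat(),
        }
    )
    return urljoin(base_url, FORECAST_RANGE_PATH) + "?" + query


# Illustrative values only; the host, country, and dates are made up.
single = forecast_date_url("http://localhost:8000", "United Kingdom", date(2019, 8, 1))
ranged = forecast_range_url(
    "http://localhost:8000", "all", date(2019, 8, 1), date(2019, 8, 7)
)
```

Validating the date order on the client side keeps an obviously malformed range from reaching the service; the service itself would still need its own validation.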
Delivered a production-ready forecasting model with five fully functional FastAPI endpoints, enabling real-time revenue prediction, performance monitoring, and business metric analysis across global markets.