This project uses machine learning and bootstrapping to identify the best of three candidate regions for the expansion of the fictional energy company OilyGiant, with the goal of maximizing profit while minimizing risk. Using a linear regression model and a dataset of 100,000 data points, Region 2 emerged as the best choice, with an average potential profit exceeding $4 million, a 95% confidence interval indicating positive returns, and only a 1.8% risk of loss. These findings give OilyGiant a data-driven basis for allocating resources effectively and maximizing profitability.
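As a rough illustration of how bootstrapping can yield an average profit, a 95% confidence interval, and a risk of loss, here is a minimal sketch. The function name `bootstrap_profit`, its parameters, and the sample numbers are hypothetical and not taken from the project; the project's actual resampling procedure and profit formula may differ.

```python
import numpy as np


def bootstrap_profit(profits, n_resamples=1000, sample_size=None, seed=0):
    """Resample profits with replacement and summarize the total profit.

    `profits` is assumed to hold one profit figure per well (hypothetical input).
    """
    rng = np.random.default_rng(seed)
    profits = np.asarray(profits, dtype=float)
    if sample_size is None:
        sample_size = profits.size

    # Each row of `idx` picks one bootstrap resample (with replacement).
    idx = rng.integers(0, profits.size, size=(n_resamples, sample_size))
    totals = profits[idx].sum(axis=1)

    # Percentile-based 95% interval and the share of resamples that lose money.
    lower, upper = np.percentile(totals, [2.5, 97.5])
    return {
        "mean": float(totals.mean()),
        "ci": (float(lower), float(upper)),
        "risk_of_loss": float((totals < 0).mean()),
    }


# Made-up per-well profits, for illustration only.
example = bootstrap_profit([1.2, -0.4, 0.8, 2.1, -0.1, 0.5])
print(example)
```

The risk of loss here is simply the fraction of resampled totals below zero, which is one common way to read such a figure from a bootstrap distribution.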
Python and sklearn | Machine Learning and Cross-Validation | Data Collection and Labelling | Business Metrics: Calculating Revenue, Operating Profit, Margin, and Return on Investment | Statistical Methods: Bootstrapping and Confidence Intervals | Data Sources
- This project uses pandas, numpy, train_test_split, StandardScaler, shuffle, LinearRegression, accuracy_score, mean_squared_error, and matplotlib.pyplot. It requires Python 3.11.
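
To show how several of the listed tools might fit together, here is a minimal sketch that splits data, scales features, fits a linear regression, and reports RMSE on the validation split. The helper `fit_and_validate`, the 75/25 split, and the synthetic data are assumptions made for illustration; they are not the project's actual code or dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def fit_and_validate(features, target, test_size=0.25, seed=0):
    """Train a scaled linear regression and score it on a held-out split."""
    X_train, X_valid, y_train, y_valid = train_test_split(
        features, target, test_size=test_size, random_state=seed
    )

    # Fit the scaler on the training split only, to avoid leaking validation data.
    scaler = StandardScaler().fit(X_train)
    model = LinearRegression().fit(scaler.transform(X_train), y_train)

    predictions = model.predict(scaler.transform(X_valid))
    rmse = float(np.sqrt(mean_squared_error(y_valid, predictions)))
    return model, scaler, predictions, rmse


# Synthetic stand-in data: three features and an exactly linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 3.0

model, scaler, preds, rmse = fit_and_validate(X, y)
print(f"validation RMSE: {rmse:.6f}")
```

Because the synthetic target is noise-free, the RMSE is essentially zero; real data would produce a larger error.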