I then drop all other events, keeping only the wasted label. We will discuss this at the end of this blog. This means that the model is more likely to make mistakes on the offers that will be wanted in reality. It would be very helpful to increase my model accuracy to above 85%. Information related to Starbucks: it is an American coffee company that was started in Seattle, Washington, in 1971. Decision trees often require more tuning and are more sensitive to issues like imbalanced datasets. Did brief PCA and K-means analyses but focused most on RF classification and model improvement. Starbucks purchases Peet's: 1984. Starbucks expands beyond Seattle: 1987. If an offer is really hard, level 20, a customer is much less likely to work towards it. Here are the five business questions I would like to address by the end of the analysis. They are the people who skipped the offer viewed event. Of course, when a dataset is highly imbalanced, the accuracy score is not a good indicator of actual performance; a precision score, F1 score, or confusion matrix is better. Not all users receive the same offer, and that is the challenge to solve with this dataset. Third Attempt: I made another attempt at doing the same but with amount_invalid removed from the dataframe.
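To illustrate the point about accuracy on an imbalanced dataset, here is a minimal sketch in plain Python. The labels are made up for illustration (1 stands for a wasted offer, the minority class), and the helper names `confusion_counts` and `scores` are hypothetical, not taken from the original notebook.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical data: 10 wasted offers (1) out of 100 records.
y_true = [1] * 10 + [0] * 90
# A "model" that always predicts the majority class.
always_majority = [0] * 100

result = scores(y_true, always_majority)
print(result)
```

The always-majority predictor scores high on accuracy while never finding a single wasted offer, which is why precision, F1, or the confusion matrix is the more honest yardstick here.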
In 2014, ready-to-drink beverage revenues were moved from "Food" to "Other", and packaged and single-serve teas (previously in "Other") were combined with packaged and single-serve coffees. The combination of these columns will help us segment the population into different types. In addition, that column was a dictionary object. The whole analysis is provided in the notebook. Every dataset tells a story! Or do they use the offer without noticing it? The PCA and K-means analyses are similar. Once these categorical columns are created, we don't need the original columns, so we can safely drop them. From the "Average offer received by gender" plot, we see that the average number of offers received per person is nearly the same across genders. Brazilian Trade Ministry data showed coffee exports fell 45% in February, and broker HedgePoint cut its projection for Brazil's 2023/24 arabica coffee production to 42.3 million bags from 45.4 million. Income also doesn't play as big a role, which might indicate that people of both higher and lower income use this type of offer. A link to part 2 of this blog can be found here. The Reward Program is available on mobile devices as the Starbucks app, and has seen impressive membership and growth since 2008, with multiple iterations on its original form.
The main question that I wanted to investigate, who are the people that wasted the offers, has been answered by the previous data engineering and EDA. Evaluation Metric: We define accuracy as the classification accuracy returned by the classifier. To better understand Type I and Type II errors, here is another article that I wrote earlier with more details. This was the trickiest part of the project because I needed to figure out how to abstract the second response to the offer. The data dictionary is as follows:
age: (numeric) missing value encoded as 118
reward: (numeric) money awarded for the amount spent
channels: (list) web, email, mobile, social
difficulty: (numeric) money required to be spent to receive a reward
duration: (numeric) time for the offer to be open, in days
offer_type: (string) BOGO, discount, informational
event: (string) offer received, offer viewed, transaction, offer completed
value: (dictionary) different values depending on event type
offer id: (string/hash) not associated with any transaction
amount: (numeric) money spent in transaction
reward: (numeric) money gained from offer completed
time: (numeric) hours after the start of the test
Let's look at the next question. To a smaller extent, higher age and income are associated with the M gender, and lower age and income with the F and O genders. These come in handy when we want to analyze the three offers separately.
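Two cleaning steps implied by the data dictionary, treating an age of 118 as missing and lifting the keys of the `value` dictionary into their own columns, can be sketched in plain Python. This is only an illustration under assumptions: the sample rows are invented, the function names are hypothetical, and the original analysis most likely did this with a dataframe library instead.

```python
MISSING_AGE = 118  # per the data dictionary, 118 encodes a missing age


def clean_profile(row):
    """Return a copy of a profile row with the 118 placeholder replaced by None."""
    cleaned = dict(row)
    if cleaned.get("age") == MISSING_AGE:
        cleaned["age"] = None
    return cleaned


def flatten_event(row):
    """Lift the keys of the `value` dictionary into top-level columns."""
    value = row.get("value", {})
    flat = {k: v for k, v in row.items() if k != "value"}
    # Key names follow the data dictionary; absent keys become None.
    flat["offer_id"] = value.get("offer id")
    flat["amount"] = value.get("amount")
    flat["reward"] = value.get("reward")
    return flat


# Hypothetical sample rows, invented for illustration.
profile_row = {"age": 118, "gender": None, "income": None}
event_row = {"event": "transaction", "time": 6, "value": {"amount": 12.5}}

print(clean_profile(profile_row))
print(flatten_event(event_row))
```

Keeping `None` rather than dropping the rows at this stage leaves the decision about how to treat the missing-demographics group open for later.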
Unbeknown to many, Starbucks has invested significantly in big data and analytics capabilities in order to determine the potential success of its stores and products, and grow sales. As we can see, in general, female customers earn more than male customers. We will get rid of this because the population of 118-year-olds is not insignificant in our dataset. For BOGO and discount offers, we want to identify people who used them without knowing it, so that we are not giving money away for no gain. We combine and reshape the datasets to gain insights into the data and to make it useful for the analyses we want to do afterwards. Answer: For both offers, men have a significantly lower chance of completing them. The reason is that we don't have too many features in the dataset. Submission for the Udacity Capstone challenge. However, for information-type offers, we need to take into account the offer validity. Using Polynomial Features: To see if the model improves, I implemented a polynomial features pipeline with StandardScaler(). BOGO: For the buy-one-get-one offer, we need to buy one product to get a product equal to the threshold value. You can analyze all relevant customer data and develop focused customer retention programs. Content: the mobile app sends out offers and/or informational material to its customers, such as discounts (%), BOGO (buy one, get one free), and informational offers. It will be interesting to see how customers react to informational offers and whether the advertisement or the informational offer also helps the performance of BOGO and discount offers.
In this case, using SMOTE or upsampling can cause the problem of overfitting our dataset. At the time of writing, the CEO of Starbucks is Kevin Johnson, and the company has approximately 23,768 locations globally. Gender does influence how much a person spends at Starbucks. I wanted to see if I could find out who these users are and whether we could avoid or minimize this. However, it is worth noting that the BOGO offer has a much greater chance of being viewed by customers, although BOGO and discount offers were distributed evenly. There are three types of offers: BOGO (buy one get one), discount, and informational. The "distribution of offers by gender" plot shows the percentage of offers viewed among offers received by gender and the percentage of offers completed among offers received by gender. The reason is that the business costs associated with False Positives and False Negatives might be different. There are two ways to approach this.
This is the primary distinction represented by PC0. It also appears that there is not just one or two significant factors. portfolio.json contains offer IDs and metadata about each offer (duration, type, etc.). Performed an exploratory data analysis on the datasets. The completion rate is 78% among those who viewed the offer. PC4: primarily represents age and income. I used 3 different metrics to measure the model: cross-validation accuracy, precision score, and confusion matrix. Age also seems to be similarly distributed, and membership tenure doesn't seem to be too different either. The goal of this project is to combine transaction, demographic, and offer data to determine which demographic groups respond best to which offer type. Information: For the informational type we get a significant drift from what we had with BOGO and discount offers. An in-depth look at Starbucks sales data! This dataset contains about 300,000+ simulated transactions. DecisionTreeClassifier trained on 9829 samples. In this capstone project, I was free to analyze the data in my own way. What are the main drivers of an effective offer?
The transcript.json data has the transaction details of the 17,000 unique people. To repeat, the business question I wanted to address was to investigate the phenomenon in which users used our offers without viewing them. From the portfolio.json file, I found out that there are 10 offers of 3 different types: BOGO, Discount, Informational. In this project, the given dataset contains simulated data that mimics customer behavior on the Starbucks rewards mobile app. The channel column was tricky because each cell was a list of objects. As a whole, 2017 and 2018 can be regarded as successful years. We evaluate the accuracy based on correct classification. To improve the model, I downsampled the majority label and balanced the dataset. Looking at the laggard features, I notice that mobile ranks highest among all the channels, which is interesting, and we should not discard this information. So classification accuracy should improve with more data available. Answer: The peak of offers completed came slightly before the peak of offers viewed in the first 5 days of the experiment. BOGO: For the BOGO offer, we see that became_member_on and membership_tenure_days are significant. I want to know how different combinations impact each offer.
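The downsampling step mentioned above can be sketched with the standard library alone. This is a minimal illustration under assumptions: the rows and the `wasted` label column are invented, `downsample` is a hypothetical helper, and the original work may have used a library routine instead.

```python
import random
from collections import defaultdict


def downsample(rows, label_key, seed=0):
    """Randomly drop rows so every label has as many rows as the rarest label."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)  # avoid leaving the rows grouped by label
    return balanced


# Hypothetical data: 20 wasted offers (1) and 80 others (0).
rows = [{"id": i, "wasted": 1 if i < 20 else 0} for i in range(100)]
balanced = downsample(rows, "wasted")
print(len(balanced), sum(r["wasted"] for r in balanced))
```

Downsampling throws away majority-class rows, so it trades data volume for balance; that cost is one reason more data would be expected to help the model.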
Top states by number of Starbucks stores: California, 3,055 stores (about 19% of the total, one store for every 12,934 people); Texas, 1,329 stores (about 8% of the total, one store for every 21,818 people); Florida, 829 stores (5%). With over 35 thousand Starbucks stores worldwide in 2022, the company has established itself as one of the world's leading coffeehouse chains. K-means performance improves as the number of clusters increases. In summary, I have walked you through how I processed the data to merge the 3 datasets so that I could do the data analysis. Most of the offers, as we see, were delivered via email and the mobile app. It warned us that some offers were being used without the user knowing it, because users do not opt in to the offers; the offers were simply given. Another reason is linked to the first: it is about the scope.
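The phenomenon described above, offers being completed without the user ever viewing them, can be flagged with a simple pass over the event log. The sketch below is an assumption-laden illustration, not the original notebook's logic: the `person` and `offer_id` column names and the sample events are hypothetical, and it ignores the case where the same offer is sent to the same person more than once.

```python
def find_wasted(events):
    """Return the set of (person, offer_id) pairs completed without a prior view."""
    first_view = {}
    for e in events:
        if e["event"] == "offer viewed":
            key = (e["person"], e["offer_id"])
            first_view[key] = min(e["time"], first_view.get(key, e["time"]))
    wasted = set()
    for e in events:
        if e["event"] == "offer completed":
            key = (e["person"], e["offer_id"])
            viewed_at = first_view.get(key)
            # Wasted if never viewed, or only viewed after completion.
            if viewed_at is None or viewed_at > e["time"]:
                wasted.add(key)
    return wasted


# Hypothetical events; time is hours after the start of the test.
events = [
    {"person": "a", "offer_id": "o1", "event": "offer received", "time": 0},
    {"person": "a", "offer_id": "o1", "event": "offer viewed", "time": 6},
    {"person": "a", "offer_id": "o1", "event": "offer completed", "time": 12},
    {"person": "b", "offer_id": "o1", "event": "offer received", "time": 0},
    {"person": "b", "offer_id": "o1", "event": "offer completed", "time": 10},
    {"person": "c", "offer_id": "o2", "event": "offer received", "time": 0},
    {"person": "c", "offer_id": "o2", "event": "offer completed", "time": 10},
    {"person": "c", "offer_id": "o2", "event": "offer viewed", "time": 20},
]

wasted = find_wasted(events)
print(sorted(wasted))
```

Person `a` viewed the offer before completing it, so only `b` (never viewed) and `c` (viewed after completion) are flagged.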
In addition, it would be helpful if I could build a machine learning model to predict when this is likely to happen. Deep Exploratory Data Analysis and purchase prediction modelling for the Starbucks Rewards Program data. The company also logged 5% global comparable-store sales growth. Some users might not receive any offers during certain weeks. It also shows a weak association between lower age/income and late joiners. I left-merged this dataset with the profile and portfolio datasets to get the features that I need. In the confusion matrix, False Positive decreased to 11%, with 15% False Negative. Income is also as significant as age. Most of the respondents are either male or female, and comparatively few identify as other genders. Mobile users are more likely to respond to offers. The assumption is that this may slightly improve the models. This is a slight improvement on the previous attempts. 2017 seems to be the year when people of both genders participated heavily in the campaign. Through our unwavering commitment to excellence and our guiding principles, we bring the unique Starbucks Experience to life for every customer through every cup. I found a data set on Starbucks coffee, and got really excited. Age, for instance, has a very high score too. Nonetheless, from the standpoint of providing business value to Starbucks, the question is always either: how do we increase sales, or how do we save money?
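The left merge of the transcript with the profile and portfolio data mentioned above can be illustrated without any dataframe library. This is a sketch under assumptions: the join keys `person` and `offer_id`, the sample rows, and the helper `left_merge` are all hypothetical, and the original analysis presumably used a dataframe merge instead.

```python
def left_merge(left_rows, right_rows, key):
    """Left join: keep every left row and add matching right columns when found."""
    lookup = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = lookup.get(row.get(key), {})
        # Left-hand values win if a column name appears on both sides.
        merged.append({**match, **row})
    return merged


# Hypothetical rows standing in for transcript, profile and portfolio.
transcript = [
    {"person": "p1", "offer_id": "o1", "event": "offer completed"},
    {"person": "p2", "offer_id": None, "event": "transaction"},
]
profile = [{"person": "p1", "gender": "F", "income": 72000}]
portfolio = [{"offer_id": "o1", "offer_type": "bogo", "difficulty": 10}]

merged = left_merge(left_merge(transcript, profile, "person"), portfolio, "offer_id")
for row in merged:
    print(row)
```

Unlike a dataframe merge, unmatched rows here simply lack the right-hand columns rather than carrying explicit missing values; the point of the sketch is only that no transcript row is lost.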
Company Overview: The Starbucks Company started as a small retail company supplying coffee to its consumers in Seattle, Washington, in 1971. The discount offer type also has a greater chance of being used without being seen, compared to BOGO. Answer: The discount offer is more popular because not only does it have a slightly higher number of offers completed in absolute terms, it also has a higher overall completed/received rate (~7%). However, for each type of offer, the offer duration, difficulty, or promotional channels may vary. The columns are:
offer_type (string): type of offer, i.e. BOGO, discount, informational
difficulty (int): minimum required spend to complete an offer
reward (int): reward given for completing an offer
duration (int): time for offer to be open, in days
became_member_on (int): date when customer created an app account
gender (str): gender of the customer (note some entries contain O for other rather than M or F)
event (str): record description (i.e. transaction, offer received, offer viewed, etc.)
Informational: This type of offer has no discount or minimum amount to spend. This dataset is a simplified version of the real Starbucks app because the underlying simulator only has one product, whereas Starbucks sells dozens of products. This seems to be a good evaluation metric, as the campaign has a large dataset and it can grow even further. There are three main questions I attempted to answer.
While men tend to make more purchases, women tend to make more expensive purchases. Through this, Starbucks can see what specific people are ordering and adjust offerings accordingly. Promote the offer via at least 3 channels to increase exposure. profile.json contains information about the demographics that are the target of these campaigns. To observe the purchase decision of people based on different promotional offers. That's why we have the same number of null values in the gender and income columns, and the corresponding age column has 118 as the age. Q5: Which type of offer is more likely to be used WITHOUT being viewed, if there is one? The dataset contains simulated data that mimics customers' behavior after they received Starbucks offers. DecisionTreeClassifier trained on 10179 samples. Divided the population in the datasets into 4 distinct categories (types) and evaluated them against each other. Here we can notice that women in this dataset have higher incomes than men do. Prime cost (cost of goods sold + labor cost) is generally the most reliable data initially tied to restaurant profitability, as it can represent more than 60% of every sale in expenses. Also, the dataset needs lots of cleaning, mainly because we have a lot of categorical variables.
The question of how to save money is not about not spending at all, but about not spending money on ineffective things. Preprocessed the data to ensure it was appropriate for the predictive algorithms. Meanwhile, those people who achieved it are likely to reach that amount of spending regardless of the offer. More loyal customers, people who joined 5-6 years ago, also have a significantly lower chance of using both offers. The reason is that demographics do not make a difference, but the design of the offer does. For future studies, there is still a lot that can be done. Once every few days, Starbucks sends out an offer to users of the mobile app. Therefore, I stick with the confusion matrix. Perhaps more data is required to get a better model.
Being able to answer those questions means I could clearly identify the group of users who show such behavior and make some educated guesses about why. To answer the first question: What is the spending pattern based on offer type and demographics? In our data analysis, we answered the three questions that we set out to explore with the Starbucks transactions dataset. Discount: In this offer, a user needs to spend a certain amount to get a discount. There are 3 different types of offers: Buy One Get One Free (BOGO), Discount, and Informational, meaning solely advertisement. Getting BOGO and Discount offers is also not a very difficult task. Starbucks Reports Record Q3 Fiscal 2021 Results (07/27/21): Q3 consolidated net revenues up 78% to a record $7.5 billion; Q3 comparable store sales up 73% globally; U.S. up 83% with 10% two-year growth; Q3 GAAP EPS $0.97; record non-GAAP EPS of $1.01. I think the informational model can and must be improved by getting more data. PC1 to PC4 also account for the variance in the data, whereas PC5 is negligible. For model choice, I was deciding between decision trees and logistic regression. From the transaction data, let's try to find out how gender, age, and income relate to the average transaction amount.
After balancing the dataset, the cross-validation accuracy of the best model increased to 74%, and the precision score remained at 75%. I wanted to analyse the data based on calorie and caffeine content. Q4 Consolidated Net Revenues Up 31% to a Record $8.1 Billion. Q4: Which group of people is more likely to use the offer or make a purchase WITHOUT viewing the offer, if there is such a group? After I played around with the data a bit, I also decided to focus only on the BOGO and discount offers for this analysis, for 2 main reasons. So my new dataset had the following columns: Also, I changed the null gender to Unknown to make it a new feature. I will follow the CRISP-DM process. I concluded that we can't draw too many differences simply by looking at these graphs, though they were interesting, and it seems that Starbucks took special care to keep the distributions similar across the groups.