A. Problem 1 - Wholesale Customers Analysis (Wholesale Customers Data.csv)
Problem Statement:
A wholesale distributor operating in different regions of Portugal has information on annual spending on several items in its stores across different regions and channels. The data consists of 440 large retailers’ annual spending on 6 different varieties of products in 3 different regions (Lisbon, Oporto, Other) and across 2 sales channels (Hotel, Retail).
1.1 Methods of descriptive statistics are used to summarize the data.
1.2 There are 6 different varieties of items that are considered. All the varieties across Region and Channel are described.
1.3 Extreme Behaviours are measured on the basis of a descriptive measure of variability.
1.4 Outliers are checked and treated using different outlier treatments.
1.5 Recommendations for business after thorough business analysis.
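Items 1.3 and 1.4 can be illustrated with a minimal sketch. It assumes the coefficient of variation as the measure of variability and IQR-based capping as one possible outlier treatment; neither choice is fixed by the problem statement. The spending figures are hypothetical and are not taken from Wholesale Customers Data.csv.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (a unitless spread)."""
    return statistics.stdev(values) / statistics.mean(values)

def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def cap_outliers(values, k=1.5):
    """Pull values outside the fences back to the nearest fence."""
    lower, upper = iqr_bounds(values, k)
    return [min(max(v, lower), upper) for v in values]

# Hypothetical annual spending on one product variety (assumption).
spending = [120, 150, 160, 170, 180, 200, 210, 950]

print("CV:", round(coefficient_of_variation(spending), 3))
print("Fences:", iqr_bounds(spending))
print("Capped:", cap_outliers(spending))
```

A larger coefficient of variation marks a variety whose spending is less consistent across retailers. Capping keeps every row, whereas dropping outliers would shrink the data set.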
B. Problem 2 - Survey-1 Dataset (Survey-1.csv)
The Student News Service at Clear Mountain State University (CMSU) has decided to gather data about the undergraduate students that attend CMSU. CMSU creates and distributes a survey of 14 questions and receives responses from 62 undergraduates (stored in the Survey data set).
2.1. For this data, the following contingency tables are created (keeping Gender as the row variable):
2.1.1. Gender and Major
2.1.2. Gender and Grad Intention
2.1.3. Gender and Employment
2.1.4. Gender and Computer
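A contingency table counts how many respondents fall into each combination of two categorical variables. The sketch below builds one with Gender as the row variable using only the standard library. The records and the category labels are hypothetical and are not taken from Survey-1.csv; on the real file, `pandas.crosstab` would be a common alternative.

```python
from collections import Counter

def contingency_table(records, row_key, col_key):
    """Count records for every (row value, column value) pair."""
    counts = Counter((rec[row_key], rec[col_key]) for rec in records)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    # Counter returns 0 for pairs that never occur.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

# Hypothetical survey responses (assumption, for illustration only).
records = [
    {"Gender": "Female", "Computer": "Laptop"},
    {"Gender": "Female", "Computer": "Desktop"},
    {"Gender": "Male", "Computer": "Laptop"},
    {"Gender": "Male", "Computer": "Laptop"},
]

table = contingency_table(records, "Gender", "Computer")
for gender, row in table.items():
    print(gender, row)
```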
2.2. Assume that the sample is representative of the population of CMSU. Based on the data, the following questions are answered:
2.2.1. What is the probability that a randomly selected CMSU student will be male?
2.2.2. What is the probability that a randomly selected CMSU student will be female?
2.3. Assume that the sample is representative of the population of CMSU. Based on the data, answer the following questions:
2.3.1. Find the conditional probability of different majors among the male students in CMSU.
2.3.2. Find the conditional probability of different majors among the female students of CMSU.
2.4. Assume that the sample is representative of the population of CMSU. Based on the data, answer the following questions:
2.4.1. Find the probability that a randomly chosen student is a male and intends to graduate.
2.4.2. Find the probability that a randomly selected student is a female and does NOT have a laptop.
2.5. Assume that the sample is representative of the population of CMSU. Based on the data, answer the following questions:
2.5.1. Find the probability that a randomly chosen student is a male or has full-time employment.
2.5.2. Find the conditional probability that given a female student is randomly chosen, she is majoring in international business or management.
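Questions 2.2 to 2.5 all reduce to arithmetic on contingency-table counts: a marginal probability is a row or column total over the grand total, a joint probability is one cell over the grand total, P(A or B) = P(A) + P(B) - P(A and B), and P(B | A) = P(A and B) / P(A). The sketch below works through these with exact fractions. The counts are hypothetical and do not come from Survey-1.csv.

```python
from fractions import Fraction

# Hypothetical Gender x Employment counts (assumption, for illustration).
counts = {
    ("Male", "Full-Time"): 3,
    ("Male", "Other"): 7,
    ("Female", "Full-Time"): 2,
    ("Female", "Other"): 8,
}
total = sum(counts.values())

def prob(predicate):
    """Probability of the event described by predicate(gender, employment)."""
    favourable = sum(n for (g, e), n in counts.items() if predicate(g, e))
    return Fraction(favourable, total)

p_male = prob(lambda g, e: g == "Male")                      # marginal
p_full = prob(lambda g, e: e == "Full-Time")                 # marginal
p_male_and_full = prob(lambda g, e: g == "Male" and e == "Full-Time")  # joint
p_male_or_full = p_male + p_full - p_male_and_full           # addition rule
p_full_given_male = p_male_and_full / p_male                 # conditional

print(p_male, p_full, p_male_and_full, p_male_or_full, p_full_given_male)
```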
2.6. Construct a contingency table of Gender and Intent to Graduate at 2 levels (Yes/No). Undecided students are excluded here, so the table is a 2x2 table. Do you think the graduate intention and being female are independent events?
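Two events A and B are independent when P(A and B) = P(A) x P(B). A minimal sketch of that check on a 2x2 table follows; it reports the gap between the two sides, since sample counts rarely match exactly. The tables are hypothetical, and the function name `independence_gap` is introduced here only for illustration.

```python
from fractions import Fraction

def independence_gap(table, row, col):
    """Return P(row and col) - P(row) * P(col); zero means independent."""
    total = sum(sum(cells.values()) for cells in table.values())
    p_row = Fraction(sum(table[row].values()), total)
    p_col = Fraction(sum(cells[col] for cells in table.values()), total)
    p_joint = Fraction(table[row][col], total)
    return p_joint - p_row * p_col

# Hypothetical Gender x Grad Intention counts (assumption).
dependent = {"Female": {"Yes": 4, "No": 6}, "Male": {"Yes": 6, "No": 4}}
independent = {"Female": {"Yes": 2, "No": 4}, "Male": {"Yes": 3, "No": 6}}

print(independence_gap(dependent, "Female", "Yes"))    # non-zero gap
print(independence_gap(independent, "Female", "Yes"))  # zero gap
```

A small non-zero gap in a sample of this size does not by itself prove dependence. A formal test such as the chi-square test of independence is the usual next step.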
2.7. Note that there are four numerical (continuous) variables in the data set, GPA, Salary, Spending, and Text Messages.
Answer the following questions based on the data:
2.7.1. If a student is chosen randomly, what is the probability that his/her GPA is less than 3?
2.7.2. Find the conditional probability that a randomly selected male earns 50 or more. Find the conditional probability that a randomly selected female earns 50 or more.
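For the numerical variables, one simple approach is the empirical probability: the share of observations that satisfy the condition, computed within the relevant subgroup for a conditional probability. The sketch below assumes this approach. The student records are hypothetical and are not taken from Survey-1.csv.

```python
def empirical_probability(values, condition):
    """Fraction of the observations for which condition(value) is true."""
    values = list(values)
    if not values:
        raise ValueError("no observations to estimate from")
    return sum(1 for v in values if condition(v)) / len(values)

# Hypothetical records (assumption, for illustration only).
students = [
    {"Gender": "Male", "GPA": 2.8, "Salary": 55},
    {"Gender": "Male", "GPA": 3.2, "Salary": 45},
    {"Gender": "Female", "GPA": 3.5, "Salary": 60},
    {"Gender": "Female", "GPA": 2.9, "Salary": 50},
]

# P(GPA < 3) over all students.
p_gpa_below_3 = empirical_probability(
    (s["GPA"] for s in students), lambda g: g < 3
)

# P(Salary >= 50 | Male): restrict to males first, then take the share.
male_salaries = [s["Salary"] for s in students if s["Gender"] == "Male"]
p_salary_50_given_male = empirical_probability(male_salaries, lambda x: x >= 50)

print(p_gpa_below_3, p_salary_50_given_male)
```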
2.8. Note that there are four numerical (continuous) variables in the data set, GPA, Salary, Spending, and Text Messages. For each of them comment whether they follow a normal distribution. Write a note summarizing your conclusions.
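One informal way to comment on normality is to compare the mean with the median and to look at skewness and excess kurtosis, both of which are close to zero for normally distributed data. The sketch below assumes this moment-based check; it is not the only option, and a formal test such as Shapiro-Wilk (`scipy.stats.shapiro`) or a Q-Q plot could be used instead. The values are hypothetical.

```python
import statistics

def shape_summary(values):
    """Return mean, median, skewness and excess kurtosis of the values."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4 (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }

# Hypothetical GPA-like values (assumption, for illustration only).
summary = shape_summary([1, 2, 3, 4, 5])
print(summary)
```

A clearly positive skewness, for example, would suggest a long right tail, as is common for spending or text-message counts.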
C. Problem 3 - A & B Shingles (A & B shingles.csv)
An important quality characteristic used by the manufacturers of ABC asphalt shingles is the amount of moisture the shingles contain when they are packaged. Customers may feel that they have purchased a product lacking in quality if they find moisture and wet shingles inside the packaging. In some cases, excessive moisture can cause the granules attached to the shingles for texture and coloring purposes to fall off the shingles resulting in appearance problems. To monitor the amount of moisture present, the company conducts moisture tests. A shingle is weighed and then dried. The shingle is then reweighed, and based on the amount of moisture taken out of the product, the pounds of moisture per 100 square feet are calculated. The company would like to show that the mean moisture content is less than 0.35 pounds per 100 square feet.
The file (A & B shingles.csv) includes 36 measurements (in pounds per 100 square feet) for A shingles and 31 for B shingles.
3.1 Do you think there is evidence that the mean moisture content in both types of shingles is within the permissible limit? State your conclusions clearly, showing all steps.
3.2 Do you think that the population means for shingles A and B are equal? Form the hypothesis and conduct the test of the hypothesis. What assumption do you need to check before the test for equality of means is performed?
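For 3.1, the company's claim suggests a left-tailed one-sample test with H0: mean >= 0.35 against H1: mean < 0.35, run separately for A and B. For 3.2, the hypotheses are H0: mean_A = mean_B against H1: mean_A != mean_B. The sketch below computes only the test statistics and degrees of freedom. It assumes Welch's unequal-variance form for the two-sample case, which is one choice among several; a pooled test would instead require checking equal variances. The measurements are hypothetical and are not taken from A & B shingles.csv.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.fmean(sample) - mu0) / se, n - 1

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical moisture readings in pounds per 100 square feet (assumption).
shingles_a = [0.30, 0.32, 0.34]
shingles_b = [0.28, 0.30, 0.32]

t_a, df_a = one_sample_t(shingles_a, 0.35)
t_ab, df_ab = welch_t(shingles_a, shingles_b)
print(t_a, df_a)
print(t_ab, df_ab)
```

Each statistic is then compared with the critical value of the t distribution at the chosen significance level, or converted to a p-value, for example with `scipy.stats.ttest_1samp` and `scipy.stats.ttest_ind`.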