0:11

Welcome back.

In this video we're going to look at the different types of analyses you can

perform once you have identified the business problem or opportunity,

developed a hypothesis and collected relevant data.

Continued growth in processing capacity has opened

the door to a broad range of advanced algorithms and modeling techniques

that organizations can use to produce valuable insights from data.

0:48

>> Thanks, Dan.

I'm Lorie Wijntjes, managing director in our data and

analytics practice, with almost 30 years of experience as a statistician.

At PwC, I have worked on a wide variety of business problems involving predictive

analytics, data management, statistical sampling, and survey design.

Some have been fairly straightforward, and many have been complex,

across a wide range of industries including healthcare,

financial services, and retail and consumer.

In this video, I will give you a high-level overview of the different types of

analysis that you can perform on data.

As part of the course you will find supplemental reading

that covers each of these analysis types and how they are used.

Now, it's important to keep in mind that the analysis you choose to perform will

depend on a couple of things.

First, the problem you are trying to solve, and second,

the data you can use to solve that problem.

I'm going to start by talking about cluster analysis.

1:47

Cluster analysis groups a set of objects so

that objects in the same group, or

cluster, are more similar to one another than to those in other clusters.

Cluster analysis is often used in market research

when working with data from focus groups and surveys.

A cluster analysis can be used to segment a population of consumers

into market groups to better understand the relationships between different

groups of consumers.

This analysis can help answer questions such as, who are my target customers?

How are they differentiated on behavioral, psychographic and

demographic characteristics?

Are there groups with similar attributes, so that products,

services, and price offerings can be customized for each segment?
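
As a rough illustration of how customers can be grouped by similarity, here is a minimal k-means sketch in Python. K-means is only one of several clustering methods, and it is an assumption here rather than the method used in any particular engagement. The customer figures, the two starting centroids, and the function name kmeans are all hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (age, monthly spend in dollars)
customers = [(22, 40), (25, 55), (27, 50), (58, 210), (61, 190), (64, 220)]

# Hypothetical starting centroids: one younger and one older customer.
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

In practice the variables would usually be put on a common scale first, so that a variable with large values, such as spend, does not dominate the distance calculation.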

Now, let's move on to decision tree analysis,

a decision support tool that uses a tree-like graph of decisions and

their possible consequences,

including chance event outcomes, resource costs, and utility.

Decision tree analysis is often used to assist healthcare practitioners

who are weighing different treatments, along with each one's associated costs and

probability of a successful outcome.

For example, healthcare providers can use this analysis to assess options and

deliver more cost-effective treatments that minimize the risk of hospital

readmission.
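
To show how costs and probabilities combine in a decision tree, here is a small Python sketch that computes the expected cost of each branch. The treatment names, costs, and readmission probabilities are made up for illustration, and expected_cost is a hypothetical helper.

```python
# Each treatment is a chance node: a list of (probability, total cost) outcomes.
# All figures are hypothetical. A readmission is assumed to add 15,000 to the cost.
READMISSION_COST = 15_000

treatments = {
    "treatment_a": [(0.9, 8_000), (0.1, 8_000 + READMISSION_COST)],  # 10% readmission
    "treatment_b": [(0.7, 6_000), (0.3, 6_000 + READMISSION_COST)],  # 30% readmission
}

def expected_cost(outcomes):
    """Probability-weighted average cost across the outcomes of one chance node."""
    return sum(probability * cost for probability, cost in outcomes)

expected = {name: expected_cost(outcomes) for name, outcomes in treatments.items()}

# The decision node picks the branch with the lowest expected cost.
best = min(expected, key=expected.get)

print(expected)
print("Lowest expected cost:", best)
```

Here the treatment with the lower upfront cost turns out to be the more expensive choice once the chance of readmission is taken into account.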

To analyze large numbers of dependent and

independent variables, we might use factor analysis.

This type of analysis can help detect

what aspects of the independent variables are related to the dependent variables.

When we receive data sets that are fairly wide,

meaning that they have more variables than observations or records,

we need a way to identify the core set of variables or

drivers that will help us gain meaningful insight.

Factor analysis can help identify that reduced subset of variables. The variables

it retains represent much the same relationships as those not included, but

perhaps in a stronger way.

Machine learning is a type of artificial intelligence that provides computers with

the ability to learn without being explicitly programmed to do so.

For forward-thinking retailers, the possibilities for

advanced machine learning are limitless.

Take for example a company trying to predict

what customers will be buying next spring.

Machine learning algorithms can determine availability of materials

from outside vendors, incorporate various supply chain scenarios,

and recommend the quantity, price, shelf placement, and marketing

channel that would best reach the target consumer in a particular geographic area.

4:31

Regression analysis is a statistical process for estimating relationships

between a dependent variable and one or more independent variables.

Variables are the individual pieces of information recorded for each observation.

This type of analysis helps you understand how the value of

a dependent variable changes when any one of the independent variables changes.

For example, a large insurance company wants to identify the characteristics,

including age group, income, gender, and educational level,

of customers who tend to make the most automobile claims.

This type of analysis can be used to assess risk, and

also assist with determining pricing for various automobile insurance products.
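
As a simple illustration, here is a Python sketch of a regression with one independent variable, fitted by ordinary least squares. The ages and claim rates are invented, and fit_line is a hypothetical helper. A real insurance model would use many more variables and records.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: driver age and claims per 100 policies in that age group.
ages = [20, 30, 40, 50, 60]
claims = [14, 11, 10, 9, 6]

slope, intercept = fit_line(ages, claims)

# The slope is the estimated change in claims for each additional year of age.
print(f"claims = {intercept:.2f} + ({slope:.2f} * age)")
predicted_at_45 = intercept + slope * 45
```

The sign and size of the slope are what the analyst reads: in this made-up data, the claim rate falls as age rises.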

Multivariate analysis is the observation and

analysis of more than one statistical outcome variable at a time.

This often includes, as a first step, correlation analysis, which can help you

understand and visualize relationships between pairs of variables.
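
To make that first step concrete, here is a short Python sketch that computes the Pearson correlation coefficient for one pair of variables. The patient figures are invented for illustration, and pearson is a hypothetical helper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical patients: age in years and systolic blood pressure.
age = [30, 40, 50, 60, 70]
blood_pressure = [118, 122, 130, 135, 145]

r = pearson(age, blood_pressure)
print(f"correlation between age and blood pressure: {r:.2f}")
```

A value close to 1 or -1 indicates a strong linear relationship between the pair, and a value near 0 indicates little or none.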

Multivariate regression is a technique that estimates a single regression model

with more than one outcome variable.

When there is more than one predictor variable in a multivariate regression

model, the model is a multivariate multiple regression.

To understand the effectiveness of a particular medical

treatment, one may also need to account for confounding variables

such as age, weight, gender, or other medications the patient may be receiving.

There may be multiple ways to assess the outcome, and thus

more than one dependent variable as well as multiple independent variables.

6:44

Traditionally, product penetration was driven by the bank's relationship managers

and its branches.

Segmentation analysis could help the bank gain market share by identifying key

customer segments and developing product recommendations for

those that are more likely to use mobile banking.

Sentiment analysis is a process of identifying and

categorizing opinions expressed in a piece of text to determine whether the writer's

attitude towards a topic or issue is positive, negative, or neutral.
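
One very simple way to do this is to count positive and negative words. The Python sketch below assumes a tiny hand-made word list, which is hypothetical and far smaller than anything used in practice. It is meant only to show the idea of scoring text as positive, negative, or neutral.

```python
# Hypothetical word lists, for illustration only.
POSITIVE = {"great", "love", "excellent", "good", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "slow"}

def sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

reviews = [
    "I love the new app, it is great!",
    "The service was slow and the support was terrible.",
    "I opened an account yesterday.",
]
for review in reviews:
    print(sentiment(review), "-", review)
```

A word count like this misses negation and sarcasm, which is one reason real sentiment analysis tends to rely on trained models.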

8:29

Time series analysis can be used to design a methodology to identify the factors

affecting airline passenger demand on routes by leveraging macroeconomic,

demographic, and other external data, at a local, state, and national level.

Such models can be developed to produce a route-level forecast of total demand for

air travel,

helping to optimize route and capacity planning and identify new routes for

market entry.

Time series analysis comprises methods for

analyzing data that are collected over time to extract meaningful statistics.

Stock prices, sales volumes, interest rates, and

quality measurements are all typical examples.

Because of the sequential nature of the data, special statistical techniques

that account for how the data change over time are required.
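
As a small example of a technique that respects the order of the data, here is a moving average in Python. The monthly passenger counts are invented, and moving_average is a hypothetical helper. A real demand forecast would also bring in the external factors mentioned earlier.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values, in time order."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

# Hypothetical monthly passengers on one route, in thousands.
passengers = [120, 132, 126, 150, 144, 162]

smoothed = moving_average(passengers, window=3)
print(smoothed)

# A naive forecast for next month: the most recent smoothed value.
naive_forecast = smoothed[-1]
```

Smoothing removes some of the month-to-month noise so the underlying trend is easier to see.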

Now, let's answer one last assessment question for this segment.