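In this course, we will learn the basic concepts of data mining and its fundamental methods and applications, then dive into a subfield of data mining, pattern discovery, and learn its in-depth concepts, methods, and applications. We will also introduce methods for pattern-based classification and some interesting applications of pattern discovery. The course gives you the opportunity to learn skills and practice applying scalable pattern discovery methods to massive transactional data, discuss pattern evaluation measures, and study methods for mining diverse kinds of patterns, sequential patterns, and subgraph patterns.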


A course from the University of Illinois at Urbana-Champaign

Pattern Discovery in Data Mining

119 ratings


From this lesson

Module 2

Module 2 covers two lessons: Lessons 3 and 4. In Lesson 3, we discuss pattern evaluation and learn what kinds of interestingness measures should be used in pattern analysis. We show that the support-confidence framework is inadequate for pattern evaluation, and that even the popularly used lift and chi-square measures may not be good in certain situations. We introduce the concept of null-invariance and a new null-invariant measure for pattern evaluation. In Lesson 4, we examine the issues in mining a diverse spectrum of patterns. We learn the concepts of and mining methods for multiple-level associations, multi-dimensional associations, quantitative associations, negative correlations, compressed patterns, and redundancy-aware patterns.

- Jiawei Han, Abel Bliss Professor

Department of Computer Science

Now we come to another interesting issue:

mining multi-dimensional association rules.

Sometimes the items or the things

we want to mine actually sit on multiple dimensions.

The single-dimensional rule we studied was,

for example, if X buys milk,

then X buys bread.

Milk and bread are both from the same dimension,

called product, but sometimes we may get multi-dimensional association rules.

For example, take this inter-dimensional association rule:

if X's age is 18 to 25 and

X's occupation is student, then X is likely going to buy a Coke.

Here age, occupation,

and items to buy are three different dimensions.

That kind of rule we call an inter-dimensional association rule.
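
To make the inter-dimensional rule above concrete, here is a minimal Python sketch that computes its support and confidence on a handful of made-up customer records. The records, the dimension names, and the helper names `matches` and `rule_stats` are all assumptions for illustration; none of them come from the lecture. For simplicity each record holds a single purchased item.

```python
# Hypothetical toy records: one row per customer, one value per dimension.
records = [
    {"age": "18-25", "occupation": "student", "buys": "coke"},
    {"age": "18-25", "occupation": "student", "buys": "coke"},
    {"age": "18-25", "occupation": "student", "buys": "milk"},
    {"age": "26-40", "occupation": "professor", "buys": "coke"},
    {"age": "26-40", "occupation": "professor", "buys": "bread"},
]


def matches(record, predicates):
    """Return True if the record satisfies every (dimension, value) predicate."""
    return all(record.get(dim) == val for dim, val in predicates.items())


def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    n_ante = sum(matches(r, antecedent) for r in records)
    n_both = sum(
        matches(r, antecedent) and matches(r, consequent) for r in records
    )
    support = n_both / len(records)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# age(X, "18-25") and occupation(X, "student") => buys(X, "coke")
support, confidence = rule_stats(
    records,
    antecedent={"age": "18-25", "occupation": "student"},
    consequent={"buys": "coke"},
)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Each predicate here draws on a different dimension, which is what makes the rule inter-dimensional.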

Then sometimes we also see hybrid cases,

like if X's age is 18 to 25 and

X buys popcorn, then X is likely going to buy Coke.

In that kind of rule you have age as

one dimension, but the two buys predicates are in the same dimension.

Okay. So if you see these kinds of cases,

these three different cases may need slightly different kinds of algorithms.
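
One way to tell the three cases apart is to look at how often each dimension (predicate) appears in a rule. The sketch below assumes a rule is given simply as a list of the dimension names of its predicates, antecedent and consequent together; the function name `rule_kind` and the returned labels are illustrative choices, not terminology fixed by the lecture.

```python
from collections import Counter


def rule_kind(dimensions):
    """Classify a rule by how its predicate dimensions repeat.

    dimensions: list with one dimension name per predicate in the rule.
    """
    if not dimensions:
        raise ValueError("a rule needs at least one predicate")
    counts = Counter(dimensions)
    if len(counts) == 1:
        # Every predicate uses the same dimension, e.g. buys => buys.
        return "single-dimensional"
    if all(c == 1 for c in counts.values()):
        # Every dimension appears exactly once, e.g. age, occupation => buys.
        return "inter-dimensional"
    # Several dimensions, and at least one of them repeats.
    return "hybrid"


print(rule_kind(["buys", "buys"]))               # milk => bread
print(rule_kind(["age", "occupation", "buys"]))  # age, occupation => coke
print(rule_kind(["age", "buys", "buys"]))        # age, popcorn => coke
```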

Another thing is that attributes can be categorical or numerical.

For example, a categorical attribute could be product, like

beer and diapers, or profession, like student or professor.

So, for these categorical attributes,

for multi-dimensional, inter-dimensional association rules,

we can construct a data cube,

or just use a data cube structure, to mine such associations efficiently.
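
As a rough illustration of the data cube idea, the following Python sketch counts every cell of every cuboid (every non-empty subset of dimensions) over a tiny made-up table, then reads a rule's confidence straight from the stored counts. The rows, the dimension names, and `build_cube` are assumptions for illustration; a real system would not materialize the full cube this naively.

```python
from collections import Counter
from itertools import combinations

DIMENSIONS = ("age", "occupation", "buys")  # hypothetical dimension names

# Hypothetical rows, one value per dimension, in DIMENSIONS order.
rows = [
    ("18-25", "student", "coke"),
    ("18-25", "student", "coke"),
    ("18-25", "student", "popcorn"),
    ("26-40", "professor", "coke"),
]


def build_cube(rows, dimensions):
    """Count every cell of every cuboid (each non-empty subset of dimensions)."""
    cube = {}
    for k in range(1, len(dimensions) + 1):
        for idxs in combinations(range(len(dimensions)), k):
            dims = tuple(dimensions[i] for i in idxs)
            cube[dims] = Counter(tuple(row[i] for i in idxs) for row in rows)
    return cube


cube = build_cube(rows, DIMENSIONS)

# Counts needed for: age = 18-25 and occupation = student => buys = coke
n_ante = cube[("age", "occupation")][("18-25", "student")]
n_rule = cube[("age", "occupation", "buys")][("18-25", "student", "coke")]
confidence = n_rule / n_ante
print(n_ante, n_rule, confidence)
```

Once the counts are aggregated, the support of any combination of predicate values is a dictionary lookup rather than a new scan of the data.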

Then, another case could be

quantitative attributes, that is, attributes that are numerical.

For example, age, [inaudible], or some other things.

For this case we quite often use discretization methods, clustering methods,

or some kind of gradient approaches. We are going to discuss this part

under mining quantitative association rules. Thank you.
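
As a small illustration of the discretization idea mentioned above, this Python sketch maps a numeric age onto interval labels so that it can be treated like a categorical attribute. The cut points, the labels, and the function name `discretize` are assumptions chosen to line up with the "18 to 25" example; the lecture does not specify them.

```python
from bisect import bisect_right

# Hypothetical cut points: each value is the lower bound of the next interval.
AGE_CUTS = [18, 26, 41]
AGE_LABELS = ["<18", "18-25", "26-40", "41+"]


def discretize(value, cut_points, labels):
    """Map a numeric value to the label of the interval it falls in."""
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(cut_points, value)]


ages = [17, 18, 25, 26, 40, 41, 63]
binned = [discretize(a, AGE_CUTS, AGE_LABELS) for a in ages]
print(binned)
```

After this step, a value such as "18-25" can be used as an ordinary categorical predicate in an association rule.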