This lecture is about word association mining and analysis. In this lecture, we're going to talk about how to mine associations of words from text. This is an example of knowledge about natural language that we can mine from text data. Here's the outline: we're going to first talk about what a word association is, then explain why discovering such relations is useful, and finally discuss some general ideas about how to mine word associations.

In general, there are two basic word relations. One is called a paradigmatic relation; the other is a syntagmatic relation. A and B have a paradigmatic relation if they can be substituted for each other. That means two words that have a paradigmatic relation belong to the same semantic class or syntactic class, and we can in general replace one with the other without affecting the understanding of the sentence; we would still have a valid sentence. For example, cat and dog have a paradigmatic relation because they are in the same class of animals, and in general, if you replace cat with dog in a sentence, it will still be a valid sentence that you can make sense of. Similarly, Monday and Tuesday have a paradigmatic relation.

The second kind of relation is called a syntagmatic relation. In this case, the two words that have this relation can be combined with each other. So A and B have a syntagmatic relation if they can be combined with each other in a sentence, which means the two words are semantically related. For example, cat and sit are related because a cat can sit somewhere. Similarly, car and drive are related semantically, and they can be combined with each other to convey meaning. However, in general, we cannot replace cat with sit, or car with drive, in a sentence and still get a valid sentence; if we do that, the sentence becomes meaningless. So this is different from a paradigmatic relation.

These two relations are in fact so fundamental that they can be generalized to capture basic relations between units in arbitrary sequences, and they can certainly be generalized to describe relations between any items in a language. So A and B don't have to be words; they can be phrases, for example, and they can even be more complex structures than just a noun phrase. If you think about the general problem of sequence mining, we can think of the units as elements in the sequence data. Paradigmatic relations then apply to units that tend to occur in similar locations in a sentence, or in a sequence of data elements in general; that is, they occur in similar locations relative to their neighbors in the sequence. A syntagmatic relation, on the other hand, relates co-occurring elements that tend to show up in the same sequence. So these two relations are complementary, they are the most basic relations between words, and we're interested in discovering them automatically from text data.
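To make these two definitions concrete before we go further, here is a minimal Python sketch on a made-up two-sentence corpus. The corpus, the padding tokens, and the helper names are illustrative assumptions of mine, not part of the lecture: paradigmatic candidates are units that fill the same slot between neighbors, while syntagmatic candidates are units that show up together in the same sequence.

```python
from collections import defaultdict

def neighbor_contexts(sentences):
    """Map each word to the set of (left, right) neighbor pairs it fills."""
    slots = defaultdict(set)
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        for i in range(1, len(padded) - 1):
            slots[padded[i]].add((padded[i - 1], padded[i + 1]))
    return slots

def cooccurring_words(sentences):
    """Map each word to the set of other words sharing a sentence with it."""
    co = defaultdict(set)
    for sent in sentences:
        for w in sent:
            co[w].update(x for x in sent if x != w)
    return co

sentences = [
    "my cat eats fish".split(),
    "my dog eats meat".split(),
]
slots = neighbor_contexts(sentences)
print(slots["cat"] & slots["dog"])  # shared slot ('my', 'eats'): paradigmatic candidates
co = cooccurring_words(sentences)
print("eats" in co["cat"])          # True: cat and eats combine in the same sequence
```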
Discovering such word relations has many applications. First, such relations can be directly useful for improving the accuracy of many NLP tasks, because they are part of our knowledge about a language. If you know that two words are synonyms, for example, that can help a lot of tasks. Grammar learning can also be done using such techniques: if we can learn paradigmatic relations, then we form classes of words, syntactic classes for example; and if we learn syntagmatic relations, then we know the rules for putting together a larger expression from component expressions. So we learn the structure, and what can go with what else.

Word relations can also be very useful for many applications in text retrieval and mining. For example, in search and text retrieval, we can use word associations to modify a query; this can introduce additional related words into a query and make the query more effective, and it's often called query expansion. Or we can use related words to suggest related queries to the user to explore the information space. Another application is to use word associations to automatically construct a topic map for browsing: we can have words as nodes and associations as edges, and a user could navigate from one word to another to find information in the information space. Finally, such word associations can also be used to compare and summarize opinions. For example, we might be interested in understanding positive and negative opinions about the iPhone 6. To do that, we can look at what words are most strongly associated with a feature word like battery in positive versus negative reviews. Such syntagmatic relations would help us show the detailed opinions about the product.

So how can we discover such associations automatically? Here are some intuitions about how to do that. Let's first look at the paradigmatic relation. Here we can essentially take advantage of similar contexts. You see some simple sentences about cat and dog, and they generally occur in similar contexts; that, after all, is the definition of a paradigmatic relation. On the right side, I have explicitly extracted the contexts of cat and dog from this small sample of text data: I've taken away cat and dog from these sentences, so that you can see just the context.

Of course, we can look at the context from different perspectives. For example, we can look at what words occur to the left, which we can call the left context: what words occur before we see cat or dog? You can see that in this case, dog and cat clearly have similar left contexts: you generally say "his cat" or "my cat," and you also say "my dog" and "his dog." That makes them similar in the left context. Similarly, if you look at the words that occur after cat and dog, which we can call the right context, they are also very similar in this case. Of course, this is an extreme case, where you only see "eats"; in general, many other words can follow cat and dog. You can also look at the general context, which might include all the words in the sentence, or in the sentences around the word. Even in the general context, you see similarity between the two words.

So this suggests that we can discover paradigmatic relations by looking at the similarity of the contexts of words. For example, consider the following questions: how similar are the context of cat and the context of dog? In contrast, how similar are the context of cat and the context of computer? Intuitively, we would imagine that the context of cat and the context of dog are more similar than the context of cat and the context of computer. That means in the first case the similarity value would be high, whereas in the second case the similarity between the contexts of cat and computer would be low, because the two words do not have a paradigmatic relation; imagine what words occur after computer in general, and they would be very different from what words occur after cat.
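As a rough sketch of this intuition, assuming a simple bag-of-words notion of context and a hypothetical toy corpus (real methods would typically add term weighting), we can represent each word by the counts of the words appearing near it and compare words with cosine similarity:

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences, window=2):
    """Represent each word by counts of the words near it (bag of words)."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    vectors[w][sent[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v[k] for k, c in u.items() if k in v)
    norm = math.sqrt(sum(c * c for c in u.values())) \
         * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

sentences = [s.split() for s in [
    "my cat eats fish", "his cat eats meat",
    "my dog eats meat", "his dog eats bones",
    "my computer runs programs",
]]
vec = context_vectors(sentences)
print(cosine(vec["cat"], vec["dog"]))       # relatively high: similar contexts
print(cosine(vec["cat"], vec["computer"]))  # relatively low: dissimilar contexts
```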
So this is the basic idea of discovering paradigmatic relations. What about syntagmatic relations? Here we're going to exploit correlated occurrences, again based on the definition of a syntagmatic relation. You see the same sample of text, but here we're interested in knowing what other words are correlated with the verb eats, and what words can go with eats. If you look at the right side of this slide, you see I've taken away the two words around eats: the word to its left and the word to its right in each sentence. Then we ask the question: what words tend to occur to the left of eats, and what words tend to occur to the right of eats? Thinking about this question helps us discover syntagmatic relations, because syntagmatic relations essentially capture such correlations. So the important question to ask for a syntagmatic relation is: whenever eats occurs, what other words also tend to occur? That is, are there words that tend to co-occur with eats, meaning that whenever you see eats you tend to see those words, and if you don't see eats, you probably don't see those words often either? This intuition can help us discover syntagmatic relations.

Again, consider an example: how helpful is the occurrence of eats for predicting the occurrence of meat? Knowing whether eats occurs in a sentence would generally help us predict whether meat also occurs; indeed, if we see eats occur in a sentence, that should increase the chance that meat also occurs. In contrast, consider the question at the bottom: how helpful is the occurrence of eats for predicting the occurrence of text? Because eats and text are not really related, knowing whether eats occurs in a sentence doesn't really help us predict whether text also occurs in the sentence. This is in contrast to the question about eats and meat, and it helps explain the intuition behind methods for discovering syntagmatic relations: mainly, we need to capture the correlation between the occurrences of two words.

To summarize, the general ideas for discovering word associations are the following. For paradigmatic relations, we represent each word by its context and then compute context similarity; we assume that words with high context similarity have a paradigmatic relation. For syntagmatic relations, we count how many times two words occur together in a context, which can be a sentence, a paragraph, or even a document, and we compare their co-occurrences with their individual occurrences. We assume that words with high co-occurrence but relatively low individual occurrences have a syntagmatic relation, because they tend to occur together and they don't usually occur alone. Note that paradigmatic relations and syntagmatic relations are actually closely related, in that paradigmatically related words tend to have syntagmatic relations with the same word; they tend to be associated with the same word, and that suggests we can also jointly discover the two relations.
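One simple way to turn this comparison into a score is pointwise mutual information; this is a sketch under my own choice of measure, since the lecture doesn't commit to a particular one here. It estimates, from sentence-level presence counts, how much more often two words co-occur than they would if they occurred independently:

```python
import math
from collections import Counter
from itertools import combinations

def pmi_scores(sentences):
    """Score word pairs by pointwise mutual information:
    PMI(a, b) = log[ p(a, b) / (p(a) * p(b)) ],
    with probabilities estimated from sentence-level presence counts."""
    n = len(sentences)
    occur, cooccur = Counter(), Counter()
    for sent in sentences:
        words = set(sent)
        occur.update(words)
        cooccur.update(combinations(sorted(words), 2))
    return {
        (a, b): math.log((nab / n) / ((occur[a] / n) * (occur[b] / n)))
        for (a, b), nab in cooccur.items()
    }

sentences = [s.split() for s in [
    "my cat eats fish", "his cat eats meat",
    "my dog eats meat", "this text is long",
]]
scores = pmi_scores(sentences)
print(scores.get(("eats", "meat")))  # positive: eats and meat are correlated
print(scores.get(("eats", "text")))  # None: they never co-occur in this sample
```

Words with high PMI co-occur more often than chance while not being too frequent individually, which matches the intuition that syntagmatically related words tend to occur together but not alone.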
So these general ideas can be implemented in many different ways. The course won't cover all of them, but we will cover at least some of the methods that are effective for discovering these relations.