[MUSIC] We first introduced STING, the first grid based method which is statistic information in the grid structure. This method called statistic information grid approach. Essentially you can think the gray structure looks like this. You got multiple layers. That means the spacial area's defined into rectangular cells at the different levels of revolution. Then these cells actually form a tree structure. That means each cell at a higher layer, each cell actually store a summary of the lower level set of cells. That means the cell at a higher level contains a number of small cells of the next lower level. To that extent, it forms a hierarchical tree structure. What the cell will store, statistical information of each cell that is calculated and stored beforehand and then can be used to answer queries. That means you can calculate the statistical information that is corresponding set of cells at the lower layer. Then the parameter of the higher level cells, the calculation contains a bunch of statistical information including number of points in the cell which is a count. Then their mean, their standard deviation, the minimum value, the maximum value and also the type of distribution. For example whether these points form a normal distribution, a uniform distribution or other kinds of distributions. So if we store the cell in a hierarchical way, to process a region query including finding clusters, we can start at the root of this grid structure. And proceed to the next lower level using this STING indexing structure. That means we can calculate the likelihood a cell is relevant to the query at some confidence level using the statistical information stored in the cell. Once we are confident, we need to go down, only children of the likely relevant cells are recursively explored. So it is quite localized, it could be quite efficient. We can repeat this process until the bottom layer is reached. The advantage is this structure is query-independent. And the process can be easily paralyzed and we can do incremental update as well. Also the efficiency is very good. That means the complexity is quite low because if the K is number of grid cells at the lowest level. Then the complexity is big O linear to K, where K is much less than N, the number of data points. However, it's disadvantages, because you formed a grid structure, you may lose a lot of details. The probabilistic nature of the higher level may imply a loss of accuracy in the lower level for query processing. [MUSIC]