Most organizations have huge amounts of data stored in many forms in various locations. Finding relevant data quickly and connecting disparate data sources can be challenging and time-consuming. Watson Knowledge Catalog unites all information assets into a single metadata-rich catalog, based on Watsonās understanding of relationships between assets and how theyāre being used and socialized among users in existing projects. Letās have a look at the overview of different tool categories that weāve previously discussed. Watson Knowledge Catalog corresponds to the Data Asset Management, Code Asset Management, Data Management, and Data Integration and Transformation. Watson Knowledge Catalog is a data catalog that is integrated with an enterprise data governance platform. It also merges the analytics capabilities of Watson Studio. The data catalog assists data scientists to easily find, prepare, understand, and use the data as needed. Watson Knowledge Catalog protects data from misuse and enables the sharing of assets with automated, dynamic masking of sensitive data elements. Data-profile visualizations, built-in charts and statistics help users to understand data assets. Seamless integration with Watson Studio helps data citizens to drive production of their data in a suite of powerful data science, AI, machine-learning and deep-learning tools. Joining with Watson Studio directs the building, training, and deploying of models. Users can interactively discover, cleanse, and prepare data with a built-in data refinery. Possible connections to more than 30 IBM and third-party data sources help to catalog and use your data in the original locations. IBM Watson Knowledge Catalog has various deployment choices on IBM Cloud⢠and can be run anywhere with IBM Cloud Pak⢠for Data. The latter is a fully-integrated data and AI platform built on Red HatĀ® OpenShiftĀ® Container base. It can be deployed easily into any public or private cloud or other enterprise platforms. A catalog contains metadata about the contents of assets and how to access them. And a set of collaborators who need to use the assets for data analysis. The metadata is stored in an encrypted IBM Cloud object storage instance. Any data that you want to store in the Cloud, you can upload to the cloud object storage of your choice, and then specify that object storage when you create the catalog. This split between where the data's metadata is stored and the actual location of the data is important. It means that you can keep your data where ever it is. You don't need to move it into the catalog because the catalog only contains metadata. You can have the data in unpremises data repositories in other IBM cloud services like Cloudant or Db2 on Cloud and in non-IBM cloud services like Amazon or Azure, in streaming data services or even dark data sources like PDFs. Included in the metadata is how to access the data asset. In other words, the location and credentials. That means that anyone who is a member of the catalog and has sufficient permissions can get to the data without knowing the credentials or having to create their own connection to the data. Since the new catalog is empty, let's take a look at an existing catalog. On the Browse Assets tab you can see "recommendations", "highly rated assets", and "recently created assets", as well as a list of all the assets. You can type a search term to find assets, and you can filter by asset type, such as Data Asset or Notebook. Or filter by tags that were assigned to the asset when it was added to the catalog. When you view an asset, you get a preview of the data and other information like a description, ratings, tags, where the source is located, and any classifications. On the Access tab, those with permission can add members to view this particular asset. And the Review tab shows reviews and lets you contribute a review. When assets are added to a catalog with Data Policies enabled, Watson Knowledge Catalog automatically profiles and classifies the content of the asset based on the values in those columns. The Profile tab contains more detailed information about the inferred classifications. You can see the other possibilities for classifying each column and the confidence scores for those other possibilities. On the Lineage tab, you'll see the various events that Watson Knowledge Catalog has captured that occurred in the lifecycle of this data asset, allowing you to trace what's happened to the asset since it was created. On the Access Control tab, you can see the current list of catalog members. you can also add members which is pretty similar to adding collaborators in a project. Most catalog members will likely have the editor role. The viewer role is intentionally restricted and only a select few users will have the admin role. Watson Knowledge Catalog includes capabilities to automatically mask sensitive data according to your organization's governance policies. For example, you can see in the diagram that the first name, last name, and gender data in the data set have been masked. Youāve learned how IBM Watson Knowledge Catalog can help organizations deal with their numerous data and other assets. In the next video weāll look at Data Refinery, a powerful tool for analyzing and preparing data.