Use visual diagnostic tools from Yellowbrick to steer your machine learning workflow
Vectorize text data using TF-IDF
Cluster documents using embedding techniques and appropriate metrics
Welcome to this project-based course on Analyzing Text Data with Yellowbrick. Tasks such as assessing document similarity, topic modelling, and other text mining endeavors are predicated on the notion of "closeness" or "similarity" between documents. In this course, we define various distance metrics (e.g. Euclidean, Hamming, Cosine, and Manhattan) and examine their merits and shortcomings as they relate to document similarity. We will apply these metrics to documents within a specific corpus and visualize the results. By the end of this course, you will be able to confidently use visual diagnostic tools from Yellowbrick to steer your machine learning workflow, vectorize text data using TF-IDF, and cluster documents using embedding techniques and appropriate metrics.
This course runs on Rhyme, Coursera's hands-on project platform. On Rhyme, you work through projects in your browser, with instant access to pre-configured cloud desktops containing all of the software and data you need. Because everything is already set up, you can focus on learning. For this project, you’ll get a cloud desktop with Python, Jupyter, Yellowbrick, and scikit-learn pre-installed.
Notes:
- You will be able to access the cloud desktop 5 times. However, you will be able to access the instruction videos as many times as you want.
- This course works best for learners who are based in North America. We’re currently working on providing the same experience in other regions.
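To make the distance metrics named in the description concrete, here is a minimal sketch in plain Python. The function names and the two toy vectors are illustrative assumptions, not material from the course, which presumably relies on library implementations rather than hand-written ones.

```python
import math

# Hypothetical helper names; toy vectors stand in for document vectors.

def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    """Sum of absolute coordinate differences ("taxicab" distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))

def cosine_distance(u, v):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def hamming(u, v):
    """Fraction of positions at which the two vectors differ."""
    return sum(a != b for a, b in zip(u, v)) / len(u)

# doc_b is doc_a with every term count doubled, e.g. the same text repeated.
doc_a = [1.0, 0.0, 2.0]
doc_b = [2.0, 0.0, 4.0]

print("euclidean:", euclidean(doc_a, doc_b))
print("manhattan:", manhattan(doc_a, doc_b))
print("cosine:   ", cosine_distance(doc_a, doc_b))
print("hamming:  ", hamming(doc_a, doc_b))
```

Because `doc_b` points in the same direction as `doc_a`, the cosine distance is zero up to floating-point rounding, while the Euclidean and Manhattan distances are not. This insensitivity to document length is one reason cosine distance is often considered for text.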
Introduction and Loading the Corpus
Vectorizing the Documents
Clustering Similar Documents with Squared Euclidean Distance and Euclidean Distance
Manhattan (aka “Taxicab” or “City Block”) Distance
Bray-Curtis Dissimilarity and Canberra Distance
What Metrics Not to Use
Omitting Class Labels - Using KMeans Clustering
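As a rough illustration of the "Vectorizing the Documents" task, the sketch below computes TF-IDF vectors by hand for a made-up three-document corpus and compares documents with cosine distance. The corpus, the function names, and the unsmoothed textbook formula (term frequency times log(N / document frequency)) are all assumptions for illustration. The course environment uses scikit-learn, whose `TfidfVectorizer` by default applies a smoothed idf and L2 normalization, so its numbers will differ from these.

```python
import math
from collections import Counter

# Made-up corpus for illustration only.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stocks fell as markets closed",
]

def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using tf * log(N / df), no smoothing."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # Term frequency is the count divided by the document length.
        vectors.append([counts[tok] / len(doc) * idf[tok] for tok in vocab])
    return vocab, vectors

def cosine_distance(u, v):
    """One minus cosine similarity; treats a zero vector as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

vocab, vecs = tfidf_vectors(corpus)
print("vocabulary size:", len(vocab))
print("cat vs dog doc:   ", cosine_distance(vecs[0], vecs[1]))
print("cat vs stocks doc:", cosine_distance(vecs[0], vecs[2]))
```

The first two documents share several terms and so end up closer to each other than either is to the third, which shares no vocabulary with them. Distances like these are what a clustering step then groups on.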
What will I get if I purchase a Guided Project?
By purchasing a Guided Project, you'll get everything you need to complete the Guided Project including access to a cloud desktop workspace through your web browser that contains the files and software you need to get started, plus step-by-step video instruction from a subject matter expert.
Are Guided Projects available on desktop and mobile?
Because your workspace contains a cloud desktop that is sized for a laptop or desktop computer, Guided Projects are not available on your mobile device.
Who are the instructors for Guided Projects?
Guided Project instructors are subject matter experts who have experience in the skill, tool or domain of their project and are passionate about sharing their knowledge to impact millions of learners around the world.
Can I download the work from my Guided Project after I complete it?
You can download and keep any of your created files from the Guided Project. To do so, you can use the “File Browser” feature while you are accessing your cloud desktop.
Is financial aid available for Guided Projects?
Financial aid is not available for Guided Projects.
Can I audit a Guided Project and watch the video portion for free?
Auditing is not available for Guided Projects.
How much experience do I need to do this Guided Project?
At the top of the page, you can click the experience level for this Guided Project to view any knowledge prerequisites. For every level of Guided Project, your instructor will walk you through each step.
Can I complete this Guided Project right through my web browser, instead of installing special software?
Yes, everything you need to complete your Guided Project is provided in a cloud desktop that runs in your browser.
What is the learning experience like with Guided Projects?