Hierarchical Clustering using Euclidean Distance

Offered by
Coursera Project Network
In this guided project, you will:

Understand the importance and use of hierarchical clustering based on skew profiles.

Locate and process the viral cDNA genome files to calculate the skew profiles.

Understand the theory behind using the Pythagorean equation to calculate the Euclidean distance, and apply it in Python to build a linkage matrix.

Understand how errors occur, how to avoid them, and resolve their sources.
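
As a rough illustration of the second and third objectives, the sketch below computes a cumulative skew profile and the Euclidean distance between two profiles. It rests on assumptions the project description does not confirm: the skew is taken to be GC skew (+1 for each G, -1 for each C), the function names cumulative_gc_skew and euclidean_distance are hypothetical, and the two short sequences are made-up toy strings rather than viral genomes.

```python
import math


def cumulative_gc_skew(sequence):
    """Running total along the sequence: +1 for G, -1 for C (assumed definition)."""
    profile = []
    total = 0
    for base in sequence.upper():
        if base == "G":
            total += 1
        elif base == "C":
            total -= 1
        # A, T and any other symbol leave the running total unchanged.
        profile.append(total)
    return profile


def euclidean_distance(p, q):
    """Pythagorean equation extended to n dimensions: sqrt(sum((p_i - q_i)^2))."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy sequences for illustration only; not real genome data.
profile_a = cumulative_gc_skew("GGCATGCC")
profile_b = cumulative_gc_skew("GCGCTTGG")
distance = euclidean_distance(profile_a, profile_b)
```

Real genomes differ in length, so their profiles would have to be brought to a common number of points (for example by resampling) before a distance like this can be computed; how the project handles that step is not stated here.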

About 75 minutes required for the project and 45 minutes for the other materials (reading and assignment).
Intermediate
No download needed
Split-screen video
English
Desktop only

By the end of this project, you will have created a Python program in a Jupyter interface that analyzes a group of viruses and plots a dendrogram based on the similarities among them. The dendrogram depends on the cumulative skew profile, which in turn depends on the nucleotide composition. You will use complete genome sequences for many viruses, including Corona, SARS, HIV, Zika, Dengue, enterovirus, and West Nile viruses.
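
One way the linkage matrix and dendrogram described above might be built is sketched below with SciPy's scipy.cluster.hierarchy module. The profile values and the labels virus_A to virus_D are hypothetical placeholders, not results for any real virus, and the choice of average linkage is an assumption; the project may use a different method.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical toy profiles, one row per genome; NOT real viral data.
labels = ["virus_A", "virus_B", "virus_C", "virus_D"]
profiles = np.array([
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.0, 2.0, 4.0],
    [0.0, -1.0, -2.0, -3.0],
    [0.0, -1.0, -3.0, -3.0],
])

# Each row of the linkage matrix records one merge:
# [cluster index 1, cluster index 2, merge distance, size of new cluster].
linkage_matrix = linkage(profiles, method="average", metric="euclidean")

# no_plot=True returns the tree layout without drawing it;
# drop that argument (with matplotlib installed) to render the figure.
tree = dendrogram(linkage_matrix, labels=labels, no_plot=True)
leaf_order = tree["ivl"]  # leaf labels in left-to-right order
```

With n genomes the linkage matrix has n - 1 rows, and the last row always describes the merge that joins everything into a single cluster.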

Skills you will develop

  • Python Programming
  • Genomics
  • Plotting

Learn step by step

In a video that plays in a split screen alongside your workspace, your instructor will walk you through each of these steps:

  1. Task 1: Getting Started with Hierarchical Clustering

  2. Task 2: Locate and Process The Data Files

  3. Task 3: Understand The Result Dataset

  4. Task 4: Hierarchical Clustering - Metric

  5. Task 5: Hierarchical Clustering - Ordering & Methods

  6. Task 6: Dendrogram Plotting

  7. Task 7: Dendrogram - Analysis

  8. Task 8: Errors to Avoid

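How guided projects work

Your workspace is a cloud desktop right in your browser; no download is required.

In a split-screen video, your instructor guides you step by step.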

Frequently asked questions

More questions? Visit the Learner Help Center.