3 points by rent0n 3391 days ago | link | parent

Does anyone know how the PCA scatter plot with histograms was generated? In other words, how did he add the histograms to the scatter plot?

2 points by astrobiased 3385 days ago | link

Here's the code I used to generate the PCA plot.

    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.decomposition import PCA as sklearnPCA

    # Project the data onto its first two principal components
    sklearn_pca = sklearnPCA(n_components=2)
    tmp = np.array(df)  # df is a Pandas DataFrame
    proj = sklearn_pca.fit_transform(tmp)

    # Scatter plot in the center, histograms along the margins
    g = sns.JointGrid(proj[:,0], proj[:,1], space=0, size=8)
    g.plot_marginals(sns.distplot, kde=False, color=".7", bins=30)
    g.plot_joint(plt.scatter, color=".5", edgecolor="none", alpha=1)
    g.set_axis_labels(xlabel='PC1', ylabel='PC2')


1 point by isms 3391 days ago | link

Check out `jointplot` in the `seaborn` Python library:


Example with hexbins:

Example with kernel density estimate:

