t-SNE (t-Distributed Stochastic Neighbor Embedding) is a popular technique for dimensionality reduction, introduced by van der Maaten and Hinton in 2008. It is particularly well suited to the visualization of high-dimensional genomic or proteomic datasets (e.g. gene expression or mass spectrometry data).
The most widely used method for dimensionality reduction in the genomics/proteomics literature is Principal Component Analysis (PCA). However, PCA might not be the best choice, as it is a linear, parametric method. Low-dimensional maps resulting from PCA have been used as input to clustering algorithms, but PCA was not primarily developed for clustering, or even for dimension reduction. Lior Pachter's post explains very well what PCA is: http://liorpachter.wordpress.com/2014/05/26/what-is-principal-component-analysis/
Recently, t-SNE (a non-linear and non-parametric method) has been gaining popularity in the genomics and proteomics fields.
t-SNE is often used to embed high-dimensional data into low dimensions for visualisation. Fonville et al. (2013) have shown that t-SNE outperforms PCA when used in the visual analysis of high-dimensional molecular data. You can easily use the t-SNE method in R with the “tsne” R package, or read this blog post to start.
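As a rough starting point, here is a minimal sketch of what a call to the “tsne” package might look like. It assumes the package is installed from CRAN, and it uses R's built-in iris data purely as a stand-in for a samples-by-features expression matrix. The perplexity and max_iter values, the seed, and the variable names X, groups and Y are illustrative choices, not recommendations from the papers cited above.

```r
# install.packages("tsne")  # one-time install from CRAN (assumed available)
library(tsne)

# t-SNE is stochastic, so fix the seed to get a reproducible map.
set.seed(42)

# Stand-in for an expression matrix: rows are samples, columns are features.
X <- as.matrix(iris[, 1:4])
groups <- iris$Species

# Embed into 2 dimensions. perplexity and max_iter are illustrative values;
# both usually need tuning for real data.
Y <- tsne(X, k = 2, perplexity = 30, max_iter = 500)

# Plot the embedding, coloured by the known group labels.
plot(Y, col = as.integer(groups), pch = 19,
     xlab = "t-SNE 1", ylab = "t-SNE 2")
legend("topright", legend = levels(groups),
       col = seq_along(levels(groups)), pch = 19)
```

Y should come back as a matrix with one row per sample and two columns. Because the result depends on the random start and on perplexity, it is worth rerunning with a few different settings before reading much into the map.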
How to use t-SNE effectively:
I found a fantastic blog post that I think everyone should read when using t-SNE: