Region Hovedstaden - a part of Copenhagen University Hospital
Published

SinaPlot: An Enhanced Chart for Simple and Truthful Representation of Single Observations Over Multiple Classes

Publication: Contribution to journal, Journal article, Research, peer-reviewed

Bibtex

@article{ecea3d88ad7f464cbf33055827fb1c84,
title = "SinaPlot: An Enhanced Chart for Simple and Truthful Representation of Single Observations Over Multiple Classes",
abstract = "Recent developments in data-driven science have led researchers to integrate data from several sources, over diverse experimental procedures, or databases. This alone poses a major challenge in truthfully visualizing data, especially when the number of data points varies between classes. To aid the representation of datasets with differing sample size, we have developed a new type of plot overcoming limitations of current standard visualization charts. SinaPlot is inspired by the strip chart and the violin plot and operates by letting the normalized density of points restrict the jitter along the x-axis. The plot displays the same contour as a violin plot but resembles a simple strip chart for a small number of data points. By normalizing jitter over all classes, the plot provides a fair representation for comparison between classes with a varying number of samples. In this way, the plot conveys information of both the number of data points, the density distribution, outliers and data spread in a very simple, comprehensible, and condensed format. The package for producing the plots is available for R through the CRAN network using base graphics package and as geom for ggplot through ggforce. We also provide access to a web-server accepting excel sheets to produce the plots (http://servers.binf.ku.dk:8890/sinaplot/).",
keywords = "Big data, Bioinformatics, Visualization",
author = "Nikos Sidiropoulos and Sohi, {Sina Hadi} and Pedersen, {Thomas Lin} and Porse, {Bo Torben} and Ole Winther and Nicolas Rapin and Bagger, {Frederik Otzen}",
year = "2018",
month = "5",
day = "17",
doi = "10.1080/10618600.2017.1366914",
language = "English",
pages = "1--4",
journal = "Journal of Computational and Graphical Statistics",
issn = "1061-8600",
publisher = "Taylor \& Francis Inc",

}

RIS

TY - JOUR

T1 - SinaPlot

T2 - An Enhanced Chart for Simple and Truthful Representation of Single Observations Over Multiple Classes

AU - Sidiropoulos, Nikos

AU - Sohi, Sina Hadi

AU - Pedersen, Thomas Lin

AU - Porse, Bo Torben

AU - Winther, Ole

AU - Rapin, Nicolas

AU - Bagger, Frederik Otzen

PY - 2018/5/17

Y1 - 2018/5/17

AB - Recent developments in data-driven science have led researchers to integrate data from several sources, over diverse experimental procedures, or databases. This alone poses a major challenge in truthfully visualizing data, especially when the number of data points varies between classes. To aid the representation of datasets with differing sample size, we have developed a new type of plot overcoming limitations of current standard visualization charts. SinaPlot is inspired by the strip chart and the violin plot and operates by letting the normalized density of points restrict the jitter along the x-axis. The plot displays the same contour as a violin plot but resembles a simple strip chart for a small number of data points. By normalizing jitter over all classes, the plot provides a fair representation for comparison between classes with a varying number of samples. In this way, the plot conveys information of both the number of data points, the density distribution, outliers and data spread in a very simple, comprehensible, and condensed format. The package for producing the plots is available for R through the CRAN network using base graphics package and as geom for ggplot through ggforce. We also provide access to a web-server accepting excel sheets to produce the plots (http://servers.binf.ku.dk:8890/sinaplot/).

KW - Big data

KW - Bioinformatics

KW - Visualization

UR - http://www.scopus.com/inward/record.url?scp=85047130452&partnerID=8YFLogxK

U2 - 10.1080/10618600.2017.1366914

DO - 10.1080/10618600.2017.1366914

M3 - Journal article

SP - 1

EP - 4

JO - Journal of Computational and Graphical Statistics

JF - Journal of Computational and Graphical Statistics

SN - 1061-8600

ER -

ID: 55136895
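
The abstract describes the core idea as letting the normalized density of points restrict the jitter along the x-axis, with the normalization taken over all classes. A minimal Python sketch of that idea follows. It is not the authors' R implementation: the function names, the Gaussian kernel, the fixed bandwidth, and the choice to weight each class's density by its number of points are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Tuple


def weighted_density(values: List[float], at: float, bandwidth: float) -> float:
    """Sum of Gaussian kernels centred on `values`, evaluated at `at`.

    Assumption: the sum is not divided by len(values), so a class with
    more points yields a larger value (a count-weighted density).
    """
    scale = bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((at - v) / bandwidth) ** 2) for v in values) / scale


def sina_positions(
    classes: Dict[str, List[float]],
    max_width: float = 0.4,
    bandwidth: float = 0.5,
    seed: int = 0,
) -> Dict[str, List[Tuple[float, float]]]:
    """Return (x, y) plotting positions for every observation in every class.

    Class number i is centred at x = i. Each point is jittered uniformly
    within a half-width proportional to the local density at its y value,
    scaled by the highest density found in any class.
    """
    rng = random.Random(seed)

    # Local density at each observation, computed per class.
    densities = {
        name: [weighted_density(ys, y, bandwidth) for y in ys]
        for name, ys in classes.items()
    }

    # One global peak, so jitter widths are comparable between classes.
    peak = max(d for ds in densities.values() for d in ds)

    positions: Dict[str, List[Tuple[float, float]]] = {}
    for index, (name, ys) in enumerate(classes.items()):
        points = []
        for y, d in zip(ys, densities[name]):
            half_width = max_width * d / peak
            x = index + rng.uniform(-half_width, half_width)
            points.append((x, y))
        positions[name] = points
    return positions


if __name__ == "__main__":
    example = {"many": [random.gauss(0.0, 1.0) for _ in range(200)],
               "few": [-0.5, 0.1, 0.8]}
    for name, points in sina_positions(example).items():
        xs = [x for x, _ in points]
        print(name, "x range:", min(xs), max(xs))
```

Under these assumptions, a class with only a few observations gets almost no horizontal spread and so reads like a strip chart, while a large class fills out a violin-like contour. The y values are never altered, only the x offsets.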