

{"id":3010,"date":"2014-09-10T09:59:17","date_gmt":"2014-09-10T14:59:17","guid":{"rendered":"http:\/\/rud.is\/b\/?p=3010"},"modified":"2018-03-07T16:44:24","modified_gmt":"2018-03-07T21:44:24","slug":"r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/","title":{"rendered":"R version of &#8220;An exploratory technique for visualizing the distributions of 100 variables:&#8221;"},"content":{"rendered":"<p>Rick Wicklin (@[RickWicklin](https:\/\/twitter.com\/RickWicklin)) made a<br \/>\nrecent post to the SAS blog on<br \/>\n[An exploratory technique for visualizing the distributions of 100<br \/>\nvariables](http:\/\/blogs.sas.com\/content\/iml\/). It&#8217;s a very succinct tutorial on both the power of<br \/>\nboxplots and how to make them in SAS (of course). I&#8217;m not one to let R<br \/>\nbe &#8220;out-boxed&#8221;, so I threw together a quick re-creation of his example,<br \/>\nmostly as a tutorial for any nascent R folks who come across it. 
(As an<br \/>\naside, I catch Rick&#8217;s and other cool, non-R stuff via the [Stats<br \/>\nBlogs](http:\/\/www.statsblogs.com\/) blog aggregator.)<\/p>\n<p>The R implementation (syntax notwithstanding) is extremely similar.<br \/>\nFirst, we&#8217;ll need some packages to assist with data reshaping and pretty<br \/>\nplotting:<\/p>\n<pre lang=\"rsplus\">\r\nlibrary(reshape2)\r\nlibrary(ggplot2)\r\n<\/pre>\n<p>Then, we set up a list so we can pick from the same four distributions<br \/>\nand set the random seed to make this example reproducible:<\/p>\n<pre lang=\"rsplus\">\r\ndists <- c(rnorm, rexp, rlnorm, runif)\r\n\r\nset.seed(1492)\r\n<\/pre>\n<p>Now, we generate a data frame of the `100` variables with `1,000`<br \/>\nobservations each, normalized to the range `0`-`1`:<\/p>\n<pre lang=\"rsplus\">\r\nmany_vars <- data.frame(sapply(1:100, function(x) {\r\n  \r\n  # generate 1,000 random samples\r\n  tmp <- sample(dists, 1)[[1]](1000)\r\n  \r\n  # normalize them to be between 0 &#038; 1\r\n  (tmp - min(tmp)) \/ (max(tmp) - min(tmp))\r\n  \r\n}))\r\n<\/pre>\n<p>The `sapply` iterates over the numbers `1` through `100`, passing each<br \/>\nnumber into a function. Each iteration samples one element from the<br \/>\n`dists` list (its elements are actual R functions), calls that function<br \/>\nto generate `1,000` samples, and then normalizes the result to<br \/>\nvalues between `0` & `1`. By default, R will generate column names that<br \/>\nbegin with `X`:<\/p>\n<pre lang=\"rsplus\">\r\nstr(many_vars[1:5]) # show the structure of the first 5 cols\r\n\r\n## 'data.frame':    1000 obs. 
of  5 variables:\r\n##  $ X1: num  0.1768 0.4173 0.5111 0.0319 0.0644 ...\r\n##  $ X2: num  0.217 0.275 0.596 0.785 0.825 ...\r\n##  $ X3: num  0.458 0.637 0.115 0.468 0.469 ...\r\n##  $ X4: num  0.5186 0.0358 0.5927 0.1138 0.1514 ...\r\n##  $ X5: num  0.2855 0.0786 0.2193 0.433 0.9634 ...\r\n<\/pre>\n<p>We're going to plot the boxplots, sorted by the third quartile (just<br \/>\nlike in Rick's example), so we'll calculate their rank and then use<br \/>\nthose ranks (shortly) to order a factor variable:<\/p>\n<pre lang=\"rsplus\">\r\nranks <- names(sort(rank(sapply(colnames(many_vars), function(x) {\r\n  as.numeric(quantile(many_vars[,x], 0.75))\r\n}))))\r\n<\/pre>\n<p>There's a lot going on in there. We pass the column names from the<br \/>\n`many_vars` data frame to a function that will return the quantile we<br \/>\nwant. Since `sapply` preserves the names we passed in as well as the<br \/>\nvalues, we extract them (via `names`) after we rank and sort the named<br \/>\nvector, giving us a character vector in the order we'll need:<\/p>\n<pre lang=\"rsplus\">\r\nstr(ranks)\r\n\r\n##  chr [1:100] \"X29\" \"X8\" \"X92\" \"X43\" \"X11\" \"X52\" \"X34\" ...\r\n<\/pre>\n<p>Just like in the SAS post, we'll need to reshape the data into [long<br \/>\nformat from wide<br \/>\nformat](http:\/\/www.cookbook-r.com\/Manipulating_data\/Converting_data_between_wide_and_long_format\/),<br \/>\nwhich we can do with `melt`:<\/p>\n<pre lang=\"rsplus\">\r\nmany_vars_m <- melt(as.matrix(many_vars))\r\n\r\nstr(many_vars_m)\r\n\r\n## 'data.frame':    100000 obs. of  3 variables:\r\n##  $ Var1 : int  1 2 3 4 5 6 7 8 9 10 ...\r\n##  $ Var2 : Factor w\/ 100 levels \"X1\",\"X2\",\"X3\",..: 1 1 1 1 1 1 1 1 1 1 ...\r\n##  $ value: num  0.1768 0.4173 0.5111 0.0319 0.0644 ...\r\n<\/pre>\n<p>And, now we'll use our ordered column names to ensure that our boxplots<br \/>\nwill be presented in the right order (it would be in alpha order if<br \/>\nnot). 
Factor variables in R are space-efficient and allow for handy<br \/>\nmanipulations like this (amongst other things). By default,<br \/>\n`many_vars_m$Var2` was in alpha order and this call just re-orders that<br \/>\nfactor.<\/p>\n<pre lang=\"rsplus\">\r\nmany_vars_m$Var2 <- factor(many_vars_m$Var2, ranks)\r\n\r\nstr(many_vars_m)\r\n\r\n## 'data.frame':    100000 obs. of  3 variables:\r\n##  $ Var1 : int  1 2 3 4 5 6 7 8 9 10 ...\r\n##  $ Var2 : Factor w\/ 100 levels \"X29\",\"X8\",\"X92\",..: 24 24 24 24 24 24 24 24 24 24 ...\r\n##  $ value: num  0.1768 0.4173 0.5111 0.0319 0.0644 ...\r\n<\/pre>\n<p>Lastly, we plot all our hard work (click\/touch for larger version):<\/p>\n<pre lang=\"rsplus\">\r\ngg <- ggplot(many_vars_m, aes(x=Var2, y=value))\r\ngg <- gg + geom_boxplot(fill=\"#BDD7E7\", notch=TRUE, outlier.size=1)\r\ngg <- gg + labs(x=\"\")\r\ngg <- gg + theme_bw()\r\ngg <- gg + theme(panel.grid=element_blank())\r\ngg <- gg + theme(axis.text.x=element_text(angle=-45, hjust=0.001, size=5))\r\ngg\r\n<\/pre>\n<p><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"3013\" data-permalink=\"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/unnamed-chunk-9\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?fit=1056%2C480&amp;ssl=1\" data-orig-size=\"1056,480\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" 
data-image-title=\"unnamed-chunk-9\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?fit=510%2C231&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?resize=510%2C231&#038;ssl=1\" alt=\"unnamed-chunk-9\" width=\"510\" height=\"231\" class=\"aligncenter size-large wp-image-3013\" srcset=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?resize=530%2C240&amp;ssl=1 530w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?resize=150%2C68&amp;ssl=1 150w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?resize=300%2C136&amp;ssl=1 300w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?resize=535%2C243&amp;ssl=1 535w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?w=1056&amp;ssl=1 1056w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/a><\/p>\n<p>Here's the program in its entirety:<\/p>\n<pre lang=\"rsplus\">\r\nlibrary(reshape2)\r\nlibrary(ggplot2)\r\n\r\ndists <- c(rnorm, rexp, rlnorm, runif)\r\n\r\nset.seed(1492)\r\nmany_vars <- data.frame(sapply(1:100, function(x) {\r\n  tmp <- sample(dists, 1)[[1]](1000)\r\n  (tmp - min(tmp)) \/ (max(tmp) - min(tmp))\r\n}))\r\n\r\nranks <- names(sort(rank(sapply(colnames(many_vars), function(x) {\r\n  as.numeric(quantile(many_vars[,x], 0.75))\r\n}))))\r\n\r\nmany_vars_m <- melt(as.matrix(many_vars))\r\n\r\nmany_vars_m$Var2 <- factor(many_vars_m$Var2, ranks)\r\n\r\ngg <- ggplot(many_vars_m, aes(x=Var2, y=value))\r\ngg <- gg + geom_boxplot(fill=\"#BDD7E7\", notch=TRUE, outlier.size=1)\r\ngg <- gg + labs(x=\"\")\r\ngg <- gg + theme_bw()\r\ngg <- gg + theme(panel.grid=element_blank())\r\ngg <- gg + theme(axis.text.x=element_text(angle=-45, hjust=0.001, size=5))\r\ngg\r\n<\/pre>\n<p>I tweaked the boxplot, using a<br 
\/>\n[notch](https:\/\/sites.google.com\/site\/davidsstatistics\/home\/notched-box-plots)<br \/>\nand making the outliers take up fewer pixels.<\/p>\n<p>I'm definitely in agreement with Rick that this is an excellent way to<br \/>\ncompare many distributions.<\/p>\n<p>Bonus points for the commenter who shows code to color the bars by which distribution generated them!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Rick Wicklin (@[RickWicklin](https:\/\/twitter.com\/RickWicklin)) made a recent post to the SAS blog on [An exploratory technique for visualizing the distributions of 100 variables](http:\/\/blogs.sas.com\/content\/iml\/). It&#8217;s a very succinct tutorial on both the power of boxplots and how to make them in SAS (of course). I&#8217;m not one to let R be &#8220;out-boxed&#8221;, so I threw together a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[24,677,678,673,674,91],"tags":[810],"class_list":["post-3010","post","type-post","status-publish","format-standard","hentry","category-charts-graphs","category-data-analysis-2","category-data-visualization","category-datavis-2","category-dataviz","category-r","tag-post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>R version of &quot;An exploratory technique for visualizing the distributions of 100 variables:&quot; - 
rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"R version of &quot;An exploratory technique for visualizing the distributions of 100 variables:&quot; - rud.is\" \/>\n<meta property=\"og:description\" content=\"Rick Wicklin (@[RickWicklin](https:\/\/twitter.com\/RickWicklin)) made a recent post to the SAS blog on [An exploratory technique for visualizing the distributions of 100 variables](http:\/\/blogs.sas.com\/content\/iml\/). It&#8217;s a very succinct tutorial on both the power of boxplots and how to make them in SAS (of course). I&#8217;m not one to let R be &#8220;out-boxed&#8221;, so I threw together a [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2014-09-10T14:59:17+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-07T21:44:24+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9-530x240.png\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"R version of &#8220;An exploratory technique for visualizing the distributions of 100 variables:&#8221;\",\"datePublished\":\"2014-09-10T14:59:17+00:00\",\"dateModified\":\"2018-03-07T21:44:24+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/\"},\"wordCount\":511,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2014\\\/09\\\/unnamed-chunk-9-530x240.png\",\"keywords\":[\"post\"],\"articleSection\":[\"Charts &amp; Graphs\",\"Data Analysis\",\"Data 
Visualization\",\"DataVis\",\"DataViz\",\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/\",\"name\":\"R version of \\\"An exploratory technique for visualizing the distributions of 100 variables:\\\" - rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2014\\\/09\\\/unnamed-chunk-9-530x240.png\",\"datePublished\":\"2014-09-10T14:59:17+00:00\",\"dateModified\":\"2018-03-07T21:44:24+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-dis
tributions-of-100-variables\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2014\\\/09\\\/unnamed-chunk-9.png?fit=1056%2C480&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2014\\\/09\\\/unnamed-chunk-9.png?fit=1056%2C480&ssl=1\",\"width\":1056,\"height\":480},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2014\\\/09\\\/10\\\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"R version of &#8220;An exploratory technique for visualizing the distributions of 100 variables:&#8221;\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. 
All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"R version of \"An exploratory technique for visualizing the distributions of 100 variables:\" - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/","og_locale":"en_US","og_type":"article","og_title":"R version of \"An exploratory technique for visualizing the distributions of 100 variables:\" - rud.is","og_description":"Rick Wicklin (@[RickWicklin](https:\/\/twitter.com\/RickWicklin)) made a recent post to the SAS blog on [An exploratory technique for visualizing the distributions of 100 variables](http:\/\/blogs.sas.com\/content\/iml\/). It&#8217;s a very succinct tutorial on both the power of boxplots and how to make them in SAS (of course). I&#8217;m not one to let R be &#8220;out-boxed&#8221;, so I threw together a [&hellip;]","og_url":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/","og_site_name":"rud.is","article_published_time":"2014-09-10T14:59:17+00:00","article_modified_time":"2018-03-07T21:44:24+00:00","og_image":[{"url":"https:\/\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9-530x240.png","type":"","width":"","height":""}],"author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"R version of &#8220;An exploratory technique for visualizing the distributions of 100 variables:&#8221;","datePublished":"2014-09-10T14:59:17+00:00","dateModified":"2018-03-07T21:44:24+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/"},"wordCount":511,"commentCount":1,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"image":{"@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/#primaryimage"},"thumbnailUrl":"https:\/\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9-530x240.png","keywords":["post"],"articleSection":["Charts &amp; Graphs","Data Analysis","Data Visualization","DataVis","DataViz","R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/","url":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/","name":"R version of \"An exploratory technique for visualizing the distributions of 100 variables:\" - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"primaryImageOfPage":{"@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/#primaryimage"},"image":{"@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/#primaryimage"},"thumbnailUrl":"https:\/\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9-530x240.png","datePublished":"2014-09-10T14:59:17+00:00","dateModified":"2018-03-07T21:44:24+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/#primaryimage","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?fit=1056%2C480&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2014\/09\/unnamed-chunk-9.png?fit=1056%2C480&ssl=1","width":1056,"height":480},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2014\/09\/10\/r-version-of-an-exploratory-technique-for-visualizing-the-distributions-of-100-variables\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"R version of &#8220;An exploratory technique for visualizing the distributions of 100 variables:&#8221;"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. 
All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. 
#rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p23idr-My","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":12412,"url":"https:\/\/rud.is\/b\/2019\/07\/15\/quick-hit-a-different-diminutive-look-at-distributions-with-ggeconodist\/","url_meta":{"origin":3010,"position":0},"title":"Quick Hit: A Different (Diminutive) Look At Distributions With {ggeconodist}","author":"hrbrmstr","date":"2019-07-15","format":false,"excerpt":"Despite being a full-on denizen of all things digital I receive a fair number of dead-tree print magazines as there's nothing quite like seeing an amazing, large, full-color print data-driven visualization up close and personal. I also like supporting data journalism through the subscriptions since without cash we will only\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/07\/gm-1.png?fit=600%2C700&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/07\/gm-1.png?fit=600%2C700&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/07\/gm-1.png?fit=600%2C700&ssl=1&resize=525%2C300 1.5x"},"classes":[]},{"id":3364,"url":"https:\/\/rud.is\/b\/2015\/03\/30\/3364\/","url_meta":{"origin":3010,"position":1},"title":"A look at airline crashes in R with googlesheets, dplyr &#038; ggplot2","author":"hrbrmstr","date":"2015-03-30","format":false,"excerpt":"Over on The DO Loop, @RickWicklin does a nice job [visualizing the causes of airline crashes](http:\/\/blogs.sas.com\/content\/iml\/2015\/03\/30\/visualizing-airline-crashes\/) in SAS using a mosaic plot. 
More often than not, I find mosaic plots can be a bit difficult to grok, but Rick's use was spot on and I believe it shows the data\u2026","rel":"","context":"In &quot;Charts &amp; Graphs&quot;","block_context":{"text":"Charts &amp; Graphs","link":"https:\/\/rud.is\/b\/category\/charts-graphs\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":6115,"url":"https:\/\/rud.is\/b\/2017\/07\/25\/r%e2%81%b6-general-attys-distributions\/","url_meta":{"origin":3010,"position":2},"title":"R\u2076 \u2014 General (Attys) Distributions","author":"hrbrmstr","date":"2017-07-25","format":false,"excerpt":"Matt @stiles is a spiffy data journalist at the @latimes and he posted an interesting chart on U.S. Attorneys General longevity (given that the current US AG is on thin ice): Only Watergate and the Civil War have prompted shorter tenures as AG (if Sessions were to leave now). A\u2026","rel":"","context":"In &quot;Data Visualization&quot;","block_context":{"text":"Data Visualization","link":"https:\/\/rud.is\/b\/category\/data-visualization\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/07\/plot_zoom_png-2.png?fit=1200%2C1076&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/07\/plot_zoom_png-2.png?fit=1200%2C1076&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/07\/plot_zoom_png-2.png?fit=1200%2C1076&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/07\/plot_zoom_png-2.png?fit=1200%2C1076&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/07\/plot_zoom_png-2.png?fit=1200%2C1076&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":2288,"url":"https:\/\/rud.is\/b\/2013\/03\/10\/visualizing-risky-words-part-3\/","url_meta":{"origin":3010,"position":3},"title":"Visualizing Risky Words \u2014 Part 
3","author":"hrbrmstr","date":"2013-03-10","format":false,"excerpt":"The DST changeover in the US has made today a fairly strange one, especially when combined with a very busy non-computing day yesterday. That strangeness manifest as a need to take the D3 heatmap idea mentioned in the [previous post](http:\/\/rud.is\/b\/2013\/03\/09\/visualizing-risky-words-part-2\/) and actually (mostly) implement it. Folks just coming to this\u2026","rel":"","context":"In &quot;d3&quot;","block_context":{"text":"d3","link":"https:\/\/rud.is\/b\/category\/d3\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":7637,"url":"https:\/\/rud.is\/b\/2017\/12\/20\/r%e2%81%b6-series-random-sampling-from-apache-drill-tables-with-r-sergeant\/","url_meta":{"origin":3010,"position":4},"title":"R\u2076 Series \u2014 Random Sampling From Apache Drill Tables With R &#038; sergeant","author":"hrbrmstr","date":"2017-12-20","format":false,"excerpt":"(For first-timers, R\u2076 tagged posts are short & sweet with minimal expository; R\u2076 feed) At work-work I mostly deal with medium-to-large-ish data. I often want to poke at new or existing data sets w\/o working across billions of rows. I also use Apache Drill for much of my exploratory work.\u2026","rel":"","context":"In &quot;Apache Drill&quot;","block_context":{"text":"Apache Drill","link":"https:\/\/rud.is\/b\/category\/apache-drill\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3424,"url":"https:\/\/rud.is\/b\/2015\/05\/18\/scraping-jquery-datatable-programmatic-json-with-r\/","url_meta":{"origin":3010,"position":5},"title":"Scraping jQuery DataTable Programmatic JSON with R","author":"hrbrmstr","date":"2015-05-18","format":false,"excerpt":"School of Data had a recent post how to copy \"every item\" from a multi-page list. 
While their post did provide a neat hack, their \"words of warning\" are definitely missing some items and the overall methodology can be improved upon with some basic R scripting. First, the technique they\u2026","rel":"","context":"In &quot;Data Analysis&quot;","block_context":{"text":"Data Analysis","link":"https:\/\/rud.is\/b\/category\/data-analysis-2\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/3010","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=3010"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/3010\/revisions"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=3010"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=3010"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=3010"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}