

{"id":6078,"date":"2017-06-13T15:48:27","date_gmt":"2017-06-13T20:48:27","guid":{"rendered":"https:\/\/rud.is\/b\/?p=6078"},"modified":"2018-03-10T08:01:08","modified_gmt":"2018-03-10T13:01:08","slug":"keeping-users-safe-while-collecting-data","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/","title":{"rendered":"Keeping Users Safe While Collecting Data"},"content":{"rendered":"<p>I caught a mention of this <a href=\"https:\/\/petewarden.com\/2017\/06\/12\/can-you-help-me-gather-open-speech-data\/\">project by Pete Warden<\/a> on Four Short Links today. If his name sounds familiar, he&#8217;s the creator of the <a href=\"http:\/\/www.datasciencetoolkit.org\/\">DSTK<\/a> and an O&#8217;Reilly author, and he now works at Google. A decidedly clever and decent chap.<\/p>\n<p>The project goal is noble: crowdsource a repository of open speech data that researchers can use to make a better world. Said sourcing is done by asking folks to record themselves saying &#8220;Yes&#8221;, &#8220;No&#8221; and other short words.<\/p>\n<p>As I meandered over the blog post I looked in horror at the URL for the application that did the recording: <code>https:\/\/open-speech-commands.appspot.com\/<\/code>.<\/p>\n<p>Why would the goal of the project combined with that URL give pause? 
Read on!<\/p>\n<h3>You&#8217;ve Got Scams!<\/h3>\n<p><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"6079\" data-permalink=\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/cursor_and___development_scamtracker_-_master_-_rstudio\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=2066%2C910&amp;ssl=1\" data-orig-size=\"2066,910\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Cursor_and___Development_scamtracker_-_master_-_RStudio\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=300%2C132&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=510%2C225&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?resize=510%2C225&#038;ssl=1\" alt=\"\" width=\"510\" height=\"225\" class=\"aligncenter size-full wp-image-6079\" \/><\/a><\/p>\n<p>Picking up the phone and saying something as simple as &#8216;Yes&#8217; has been <a 
href=\"https:\/\/www.forbes.com\/forbes\/welcome\/?toURL=https:\/\/www.forbes.com\/sites\/kellyphillipserb\/2017\/01\/30\/answering-one-simple-question-could-make-you-a-victim-in-latest-scam\/&amp;refURL=https:\/\/rud.is\/b&amp;referrer=https:\/\/rud.is\/b#e81f84a26e03\">a major scam<\/a> this year. By recording your voice, attackers can replay it at phone prompts. Because it&#8217;s <em>your voice<\/em>, the recording is harder to refute as evidence and can foil recognition systems that look for your actual voice.<\/p>\n<p>As the chart above shows, the Better Business Bureau has logged over 5,000 of these scams this year (based on a search for &#8216;phishing&#8217; and &#8216;yes&#8217;). You can play with the data (a bit; the package needs work) in R with <a href=\"https:\/\/github.com\/hrbrmstr\/scamtracker\"><code>scamtracker<\/code><\/a>.<\/p>\n<p>Now, these are &#8220;analog&#8221; attacks (i.e., a human spends time socially engineering a human). Keep this in mind as you read the next section.<\/p>\n<h3>Integrity Challenges in 2017<\/h3>\n<p>I &#8220;trust&#8221; Pete&#8217;s intentions, but I sure don&#8217;t trust <code>open-speech-commands.appspot.com<\/code> (and you shouldn&#8217;t either). Why? Go visit <span class=\"removed_link\" title=\"https:\/\/totally-harmless-app.appspot.com\">https:\/\/totally-harmless-app.appspot.com<\/span>. It&#8217;s a Google App Engine app I made for this post. 
Anyone can make an appspot app, and the <code>https<\/code> is meaningless as far as integrity &amp; authenticity go, since I&#8217;m running on Google&#8217;s infrastructure but I&#8217;m not Google.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"6080\" data-permalink=\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/cursor\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor.png?fit=456%2C335&amp;ssl=1\" data-orig-size=\"456,335\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Cursor\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor.png?fit=300%2C220&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor.png?fit=456%2C335&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor.png?resize=456%2C335&#038;ssl=1\" alt=\"\" width=\"456\" height=\"335\" class=\"aligncenter size-full wp-image-6080\" \/><\/a><\/p>\n<p>You can&#8217;t really trust most SSL\/TLS sessions as far as site integrity goes anyway. Let&#8217;s Encrypt put the final nail in the coffin with their Certs Gone Wild! initiative. 
With super-recent browser updates you can <a href=\"https:\/\/rud.is\/b\/2017\/04\/17\/when-homoglyphs-attack-generating-phishing-domain-names-with-r\/\">almost trust your eyes<\/a> again when it comes to URLs, but you should be very wary of entering your info (especially uploading voice, prints or eye\/face images) into any input box on any site if you aren&#8217;t 100% sure it&#8217;s a legit site that you trust.<\/p>\n<h3>Tracking the Trackers<\/h3>\n<p>If you don&#8217;t know that <a href=\"http:\/\/observer.com\/2016\/07\/the-truth-about-data-mining-how-online-trackers-gather-your-info-and-what-they-see\/\">you&#8217;re being tracked 100% of the time on the internet<\/a>, then you really need to read up on the modern internet.<\/p>\n<p>In many cases your IP address can directly identify you. In most cases your device &amp; browser profile, which most commercial sites log, can directly identify you. So just visiting a web site means it&#8217;s highly likely that the site can tell both that you are not a dog and that you are, in fact, <em>you<\/em>.<\/p>\n<h3>Still Waiting for the &#8220;So, What?&#8221;<\/h3>\n<p>Many states and municipalities have engaged in awareness campaigns to warn citizens about the &#8220;Say &#8216;Yes&#8217;&#8221; scam. Asking someone to record themselves saying &#8216;Yes&#8217; into a random web site pretty much negates that advice.<\/p>\n<p>Folks like me regularly warn about trust on the internet. I could have cloned the functionality of the original site to <code>open-speech-commmands.appspot.com<\/code>. Did you even catch the 3rd &#8216;m&#8217; there? Even without that, it&#8217;s an <code>appspot.com<\/code> domain. Anyone can set one up.<\/p>\n<p>Even if the site doesn&#8217;t ask for your name or other info and just asks for your &#8216;Yes&#8217;, it can know who <em>you<\/em> are. 
In fact, when you&#8217;re enabling the microphone to do the recording, it could even take a picture of you if it wanted to (and you&#8217;d likely not know or not object since it&#8217;s for SCIENCE!).<\/p>\n<p>So, in the worst-case scenario, a malicious entity could be asking you for your &#8216;Yes&#8217;, tying it right to you and then carrying out the same follow-on attacks used in the analog version of the scam.<\/p>\n<p>But let&#8217;s go so far as to assume this is a legit site with good intentions. Do you really know what&#8217;s being logged when you commit your voice info? If the data were mishandled, it would be just as easy to tie the voice files back to you (assuming a certain level of data logging).<\/p>\n<p>The &#8220;so what&#8221; is not really a warning to users but a message to researchers: You need to <a href=\"https:\/\/en.wikipedia.org\/wiki\/Threat_model\">threat model<\/a> your experiments and research initiatives, especially when innocent end users are potentially being put at risk. Data is the new gold, diamonds and other precious bits that attackers are after. You may think you&#8217;re not putting folks at risk and aren&#8217;t even a hacker target, but how you design data gathering can reinforce good or bad behaviour on the part of users. It can strengthen sound security messages or tear them down. 
And, you and your data may be more of a target than you really know.<\/p>\n<p>Reach out to interdisciplinary colleagues to help threat model your data collection, storage and dissemination methods to ensure you aren&#8217;t putting yourself or others at risk.<\/p>\n<h3>FIN<\/h3>\n<p>Pete did the right thing:<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Open_Speech_Recording.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"6081\" data-permalink=\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/open_speech_recording\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Open_Speech_Recording.png?fit=1424%2C348&amp;ssl=1\" data-orig-size=\"1424,348\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Open_Speech_Recording\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Open_Speech_Recording.png?fit=300%2C73&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Open_Speech_Recording.png?fit=510%2C125&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Open_Speech_Recording.png?resize=510%2C125&#038;ssl=1\" alt=\"\" width=\"510\" height=\"125\" class=\"aligncenter size-full wp-image-6081\" \/><\/a><\/p>\n<p>and, I&#8217;m sure the site will be on a &#8220;proper&#8221; domain soon. 
When it is, I&#8217;ll be one of the first in line to help make a much-needed open data set for research purposes.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I caught a mention of this project by Pete Warden on Four Short Links today. If his name sounds familiar, he&#8217;s the creator of the DSTK, an O&#8217;Reilly author, and now works at Google. A decidedly clever and decent chap. The project goal is noble: crowdsource and make a repository of open speech data for [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":6079,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[664,754,3,91,646],"tags":[810],"class_list":["post-6078","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-appsec","category-data-science","category-information-security","category-r","category-security-awareness","tag-post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Keeping Users Safe While Collecting Data - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Keeping Users Safe While Collecting 
Data - rud.is\" \/>\n<meta property=\"og:description\" content=\"I caught a mention of this project by Pete Warden on Four Short Links today. If his name sounds familiar, he&#8217;s the creator of the DSTK, an O&#8217;Reilly author, and now works at Google. A decidedly clever and decent chap. The project goal is noble: crowdsource and make a repository of open speech data for [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2017-06-13T20:48:27+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-10T13:01:08+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=2066%2C910&ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"2066\" \/>\n\t<meta property=\"og:image:height\" content=\"910\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"Keeping Users Safe While Collecting Data\",\"datePublished\":\"2017-06-13T20:48:27+00:00\",\"dateModified\":\"2018-03-10T13:01:08+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/\"},\"wordCount\":895,\"commentCount\":2,\"publisher\":{\"@id\":\"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=2066%2C910&ssl=1\",\"keywords\":[\"post\"],\"articleSection\":[\"AppSec\",\"data science\",\"Information Security\",\"R\",\"Security Awareness\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/\",\"url\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/\",\"name\":\"Keeping Users Safe While Collecting Data - 
rud.is\",\"isPartOf\":{\"@id\":\"https:\/\/rud.is\/b\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=2066%2C910&ssl=1\",\"datePublished\":\"2017-06-13T20:48:27+00:00\",\"dateModified\":\"2018-03-10T13:01:08+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/#primaryimage\",\"url\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=2066%2C910&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=2066%2C910&ssl=1\",\"width\":2066,\"height\":910},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/rud.is\/b\/2017\/06\/13\/keeping-users-safe-while-collecting-data\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/rud.is\/b\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Keeping Users Safe While Collecting Data\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/rud.is\/b\/#website\",\"url\":\"https:\/\/rud.is\/b\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. 
All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/rud.is\/b\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\/\/rud.is\"],\"url\":\"https:\/\/rud.is\/b\/author\/hrbrmstr\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","jetpack_featured_media_url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/Cursor_and___Development_scamtracker_-_master_-_RStudio.png?fit=2066%2C910&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/p23idr-1A2","jetpack_likes_enabled":true,"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/6078","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=6078"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/6078\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media\/6079"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=6078"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=6078"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=6078"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}