Tag Archives: http

I usually take a peek at the Internet Traffic Report (ITR) a couple times a day as part of my routine and was a bit troubled by all of the red today:

I wanted to do some crunching on the data, and I deliberately do not have Word or Excel on my new MacBook Pro (for reasons I can detail if asked). A SELECT / CUT / PASTE into TextWrangler did not really thrill me and I knew there had to be a way to get non-marked-up, columnar data into a format I could mangle and share easily.

Enter Google Spreadsheets’ importHTML function.

If you don’t have the formula bar enabled in Google Spreadsheets, just go to View->Formula Bar to enable it. Once there, enter the following in the formula bar to get the data from the ITR into a set of columns that will auto-update every time you reference the sheet.
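The formula looks something like this (the URL and table index here are my guesses, since the original screenshot is gone; adjust the index if ITR’s page layout has changed):

```
=ImportHtml("http://www.internettrafficreport.com/main.htm", "table", 1)
```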


(as you can see, it’s not case sensitive, either)

Yes, I know Excel can do this. I could have written a quick script to whack the pasted data in TextWrangler. You can do something similar in R with htmlTreeParse + xpathApply, and Perl has HTML::TableContentParser (and other handy modules), but this was a fast, easy way to get me to a point where I could do the basic analytics I wanted to perform (and, sometimes, all you need is quick & easy).
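For the curious, here’s roughly what the script route can look like in Python, using only the standard library — a sketch against a stand-in table, since ITR’s actual markup may differ:

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], None, False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())

# Stand-in for the ITR page; in practice you'd fetch it with urllib.request.
html = "<table><tr><th>Router</th><th>Index</th></tr><tr><td>somehost</td><td>42</td></tr></table>"
p = TableParser()
p.feed(html)
print(p.rows)  # [['Router', 'Index'], ['somehost', '42']]
```

From there the rows go straight into a CSV writer or whatever analytics you had in mind.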

Official Google Help page on importHTML.

Feedburner has borked the old RSS feed for the site and has completely disassociated me from it (meaning it’s no longer in my Google Feedburner admin options and they won’t let me re-claim it).

So… the new feed link is

Apologies for any inconvenience.

As you can probably tell from a previous post, I’m not a fan of paywalls—especially poorly implemented ones. Clicking on a link in an RSS feed post and having it land on a page, only to have it smothered in an HTML layer or — in the following case — promptly redirected to “Pay up, buddy!” sites is frustrating at best. I’ll gladly debate the efficacy of paywalls vs other means of generating revenue in another post (or even in the comments, if civil). I primarily wanted to write this post to both show the silliness of the implementation of Foster’s Daily Democrat’s paywall and point out a serious deficiency in Chrome.

First up, the lame paywall. You get three free direct story link visits before being asked to register and eventually pay for content. NOTE: you could just be visiting the same story three times (say, after a browser crash), and each counts as a visit. After those visits, you have to register and give up what little anonymity you have on the Internet to view up to an additional ten free direct story links before being forced to pay up. If you are a print subscriber, you do get access for “free”, but there’s that tracking thing again. Foster’s uses a service called Clickshare to handle the subscription and tracking. Just how many places do you need to have your data stored/tracked just to read a (most likely) mediocre piece of news?

The paywall setup is accomplished by a simple “Meta Refresh” tag. In its most basic form, it is an instruction that tells the browser to load a particular URL after a certain amount of time. In the case of Foster’s paywall, the HTML tag/directive looks like this:

[code lang="html"]<meta http-equiv="refresh" content="0;url=…"/>[/code]

It’s telling your browser to double-check with their Clickshare code immediately after teasing you with the article content. And it’s easy to circumvent. Mostly. The problem is, I’m a Chrome user 99% of the time, and Google has not seen fit to allow control over the meta refresh directive. To jump the paywall, you’ll need to fire up Firefox and enter “about:config” in the location bar (clicking through the warning message).
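If you want to see exactly where a page is about to bounce you, the redirect target is easy to pull out programmatically — a minimal sketch with Python’s standard-library parser (the URL here is a placeholder, not Foster’s actual Clickshare endpoint):

```python
from html.parser import HTMLParser

class MetaRefreshFinder(HTMLParser):
    """Record the target URL of any <meta http-equiv="refresh"> tag."""
    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)  # HTMLParser lowercases attribute names for us
        if tag == "meta" and a.get("http-equiv", "").lower() == "refresh":
            content = a.get("content", "")  # e.g. "0;url=http://..."
            if "url=" in content.lower():
                self.target = content.split("=", 1)[1]

finder = MetaRefreshFinder()
finder.feed('<meta http-equiv="refresh" content="0;url=http://paywall.example.com/check"/>')
print(finder.target)  # http://paywall.example.com/check
```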

Once there, filter for “refresh”, find the setting for “blockautorefresh” and set it to “true”.
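If you’d rather script it than click through about:config, the same setting (its full pref name is accessibility.blockautorefresh) can be dropped into your profile’s user.js:

```
// Warn before auto-refreshing/redirecting full browser pages
user_pref("accessibility.blockautorefresh", true);
```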

Now, every time you visit a web site that attempts to auto-refresh full browser pages, you’ll see a warning (with the option to allow the action):

Why Chrome has not implemented a way to control this is beyond me. Since Safari also has no ability to control this setting, it may have something to do with the WebKit core both browsers are based on.

This doesn’t stop the frustration with the RSS-click-to-read and it doesn’t help iOS/Android users, but it does provide a means to help keep a bit of anonymity (if you also use other extensions and controls) and should force these sites to kick their paywall game up a notch.

Eventbrite site:

It’s at Fort Foster. We thought about it a bit late in the season so no pavilion, but I’ll be there wicked-early and will have a gazebo-like covering set up over some tables. I highly suggest bringing folding chairs. There is a nominal entrance fee (cash). A link to Fort Foster is in the Eventbrite page.

I’ll have signs up showing where we’re set up.

Right now (08-16) there are over 30 ppl (including family members) signed up!

I’m making a boatload of pulled bbq (still not decided on chicken or pork) for everyone. It’s potluck, so bring as much or as little as you like. I’ll work on a massive bucket of drinks as well (’tis a busy week here, unfortunately) and will no doubt have spare plates & napkins.

We set up a Google spreadsheet if you wish to post what you plan to share (if anything…no worries).

I updated the Eventbrite posting with my Google Voice # if you need to reach me by phone for any reason.

Fort Foster has a beach area, scuba diving area, can handle kayaks and has a nice (but smallish) set of hiking trails. There are also grills if you plan on bringing raw items to cook.

If you have any questions, do not hesitate to ask!

Oh, yeah, it will come as a complete surprise, but I’ll have the “shield” on so you can recognize me :-)

What can the @lulzsec dump tell us about how the admins maintained their system/site?

[code light="true"]SunOS a-ess-wwwi 5.10 Generic_139555-08 sun4u sparc SUNW,SPARC-Enterprise[/code]

means they haven’t kept up with OS patches. [-1 patch management]

[code light="true"]celerra:/wwwdata 985G 609G 376G 62% /net/celerra/wwwdata[/code]

tells us they use EMC NAS kit for web content.

The ‘last’ dump shows they were good about using normal logins and (probably) ‘sudo’, and used ‘root’ only on the console. [+1 privileged id usage]

They didn’t show the running Apache version (just the config file…I guess I could have tried to profile that to figure out a range of version numbers). There’s a decent likelihood that it was not at the latest patch version (based on their not patching the OS) or major vendor version.

[code light="true"]Alias /CFIDE /WEBAPPS/Apache/htdocs/CFIDE
Alias /coldfusion /WEBAPPS/Apache/htdocs/coldfusion
LoadModule jrun_module /WEBAPPS/coldfusionmx8/runtime/lib/wsconfig/1/
JRunConfig Bootstrap[/code]

Those and other entries say they are running ColdFusion, an Adobe web application server/framework, on the same system. The “mx8” suggests an out-of-date, insecure version. [-1 layered product lifecycle management]

[code light="true"] SSLEngine on
SSLCertificateFile /home/Apache/bin/
SSLCertificateKeyFile /home/Apache/bin/
SSLCACertificateFile /home/Apache/bin/sslintermediate.crt[/code]

(along with the file system listing) suggests the @lulzsec folks have everything they need to host fake SSL web sites impersonating


[code light="true"]LoadModule security_module modules/

<IfModule mod_security.c>
# Turn the filtering engine On or Off
SecFilterEngine On

# Make sure that URL encoding is valid
SecFilterCheckURLEncoding On

# Unicode encoding check
SecFilterCheckUnicodeEncoding Off

# Only allow bytes from this range
SecFilterForceByteRange 0 255

# Only log suspicious requests
SecAuditEngine RelevantOnly

# The name of the audit log file
SecAuditLog logs/audit_log

# Debug level set to a minimum
SecFilterDebugLog logs/modsec_debug_log    
SecFilterDebugLevel 0

# Should mod_security inspect POST payloads
SecFilterScanPOST On

# By default log and deny suspicious requests
# with HTTP status 500
SecFilterDefaultAction "deny,log,status:500"
</IfModule>[/code]

shows they had a built-in WAF available, but either did not configure it well enough or did not view the logs from it. [-10 checkbox compliance vs security]
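For reference, a slightly tighter baseline for that same (1.x-era) mod_security — a sketch, not a vetted ruleset — would at least capture full audit data and return a less chatty error:

```
# Audit everything while tuning, not just "relevant" requests
SecAuditEngine On

# Raise the debug level above 0 while investigating
SecFilterDebugLevel 2

# Deny by default; a 404 leaks less to a probe than a 500
SecFilterDefaultAction "deny,log,status:404"
```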

[code light="true"]-rw-r--r-- 1 cfmx 102 590654 Feb 3 2006 66_00064d.jpg[/code]

(many entries with ‘102’ instead of a group name) shows they did not do identity & access management configurations well. [-1 IDM]

The apache config file discloses pseudo-trusted IP addresses & hosts (and we can assume @lulzsec has the passwords as well).

As I tweeted in the wee hours of the morning, this was a failure on many levels since they did not:

  • Develop & use secure configuration of their servers & layered products + web applications
  • Patch their operating systems
  • Patch their layered products

They did have a WAF, but it wasn’t configured well, and they did not look at the WAF logs or – again, most likely – any system logs. This may have been a case where those “white noise port scans” everyone ignores were the intelligence probe that helped bring this box down.

Is this a terrible breach of government security? No. It’s a public web server with public data. They may have gotten to a firewalled zone, but it’s pretty much a given that no sensitive systems were on that same segment. This is just an embarrassment with a bit of extra badness in that the miscreants have SSL certs. It does show just how important it is to make sure server admins maintain systems well (note, I did not say security admins) and that application teams keep current, too. It also shows that we should be looking at all that log content we collect.

This wasn’t the first @lulzsec hack and it will not be the last. They are providing a good reminder to organizations to take their external network presence seriously.

Nevena Vratonjic
Julien Freudiger
Vincent Bindschaedler
Jean-Pierre Hubaux

Presentation [PDF]

Twitter transcript

#weis2011 Overview of basic ssl/tls/https concepts. Asking: how prevalent is https, what are problems with https?

#weis2011 Out of their large sample, only 1/3 (34.7%) have support for https, login is worse! only 22.6% < #data!

#weis2011 (me) just like Microsoft for patches/vulns, everyone uses Bank of America for https & identity examples. #sigh

#weis2011 More Certificates 101, but a good venn diagram explaining what authentication success looks like w/%ages. rly good visualization.

#weis2011 domain mismatch accounts for over 80% of certificate authentication failures. why? improper reuse. it has a simple solution (SNI)

#weis2011 the team did a very thorough analysis that puts data behind what most folks have probably assumed. #dataisspiffy

#weis2011 We've created a real mess for users with certs. EV certs help, but are expensive and not pervasive (***6%***!)

#weis2011 economics don't back good cert issuance practices; 0 liability on issuers; too many subcontractors; we trained users to click "OK"

#weis2011 great slide on CA success rates (hint: godaddy is #1) #sadtrombone

#weis2011 sample: 1 million web sites; less than 6% do SSL/TLS right. cheap certs == cheap "security"; policies need to change incentives

#weis2011 URL for the data is in the last slide. first question is challenging the approach for the analysis and went on for a while

Dr Greer [cgreer at] is Assistant Director, Information Technology R&D, Office of Science & Technology Policy, The White House

Opening: “The expertise of the attendees is greatly needed.”

He provided a broad overview of the goals & initiatives of the federal government as they relate to domestic & international cybersecurity. Greer went through the responsibilities of various agencies and made it clear that this is a highly distributed effort across all sectors of government.

He emphasized the need for a close partnership with private sector to accomplish these goals and also the criticality of not just coming up with plans but also implementing those plans.

It really was a high-level overview and – as I point out in the twitter transcript – it would have been cooler if Dr Greer had done a deep dive on 2-3 items instead of a survey. He did set the tone pretty well – we are in challenging times that are changing rapidly. We’re still fighting the fights of 5-10 years ago but are working to provide a framework for keeping pace with cybercriminals. The government is “doing stuff”, but it’s all useless without translating thousands of pages of legal mumbo jumbo into practical, actionable activities.

The 10 minute post-talk Q&A was far better than the actual preso.

Twitter transcript:

#weis2011 Obama: "America's economic prosperity in 21st cent will depend on cybersecurity" :: sec begets growth but underscores threats, too

#weis2011 one time we never expected every individual to need an IP address, now even refrigerators have one.

#weis2011 IPv6 need exacerbated by mobile, mobile apps themselves have great benefit, but also introduces new threat vector.

#weis2011 OSTP runs phishing tests 3x year #spiffy

#weis2011 POTUS Strategy: Catalyze brkthrus for natnl priorities, promote mkt-based innov; invest in building blocks of american innovation

#weis2011 policy review (2009) themes: lead frm top;build cap for dig natin;share resp for cybersec;effective info sharing/irp; encrge innov

#weis2011 pimping the International Strategy For Cyberspace release recently

#weis2011 key "norms" in ISC report: upholding fundamental freedoms (esp speech), global interoperability & cybersecurity due diligence

#weis2011 Greer shifting to talking about legis; OSTP has been wrkng to promote good bills esp for natnl data breach rprting & penalties

#weis2011 computer fraud & abuse act is *25 years old*. We need new regulations to help fight 21st century crime < 25 years! yikes!

#weis2011 FISMA shifting from compliance-based to proactive protection-based; mentioned EINSTEIN IDS/ISP

#weis2011 pimping education & awareness efforts

#weis2011 pimping fed trusted ID initiative ; password are $ & failing; multiple accts are real & problematic

#weis2011 (pers comment) the audience knows much of what Greer is saying, surprised he's giving such a broad overview vs 2/3 deep dives

#weis2011 (pers comment) the efforts for fed cybersec seem waaay too disjoint & distributed to truly be effective.

#weis2011 pimping CSIA, SSG & SCORE < much alphabet soup in fed cybersec…the letters didn't help today

#weis2011 results of many research efforts are both near & just over the horizon, but all useless if not put into effective practice

#weis2011 impt to work with priv sector on economics of legis&policy choices (immunity/liability/safe hrbr/incentives/disclosure/audit)

#weis2011 need to understand market factors incentivizing hackers (valuation/cost-ben/risk-decision making/criminal markets)

#weis2011 (pers comment) another poke at Microsoft when talking about server security. Major hacks of late were linux/apache/solaris. #lame

#weis2011 Cyber insurance is a possibility if we can develop good quant-based risk assessment/management frameworks

#weis2011 q:"where will cybersec be in 10yrs?" -cyberspace will be more resilient & trustworthy; hardening sys&nets useless w/o educatng ppl

#weis2011 by 2021 we will have solved all the cybersecurity issues of 2005 < wise man

#weis2011 q:"the US spends > than rest of wrld combined on cybersec but it's still just pennies. will this change?" :: it's in the proposals

One of my subdomains is for mail and I was using an easy DNS hack to point it to my hosted Gmail setup (just create a CNAME pointing to the host Google specifies). This stopped working for some folks this week and I’ve had no time to debug exactly why, so I decided to go back to a simple HTTP 301 redirect to avoid any glitches (for whatever reason) in the future – or, at least, ensure the glitches were due to any ineptness on my part. Unfortunately, this created an interesting problem that I had not foreseen.
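The redirect itself is a one-liner in nginx (which I’m running); a sketch with placeholder names, since my actual host names are beside the point:

```nginx
# 301 the mail subdomain to the hosted Gmail login page
server {
    listen 80;
    server_name mail.example.com;
    return 301 https://mail.google.com/a/example.com;
}
```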

I started playing with HTTP Strict Transport Security (HSTS) a while ago and – for kicks & some enhanced WordPress & Drupal cookie security – moved a couple domains to it. I neglected to pay for a cert that would give me wildcard subdomain usage and only put a couple domains in the cert request. The mail domain wasn’t one of them, which caused Chrome to not honor the redirect since the certificate was not valid for it.

I tweaked the Strict-Transport-Security header setting in my nginx config to not include subdomains, but it seems Chrome had already tucked the entry into (on OS X):

[code padlinenumbers="false" gutter="false"]~/Library/Application Support/Google/Chrome/Default/TransportSecurity[/code]

and was ignoring the new expiration and subdomain settings I was now sending. Again, no time to research why as I really just needed to get the mail redirect working. I guessed that removing the entry would be the easiest way to bend Chrome to my will but it turns out that it’s not that simple since the browser seems to hash the host value:

[code]"wA9USN1KVIEHgBTF9j2q0wPLlLieQoLrXKheK9lkgl8=": {
"created": 1300919611.230054,
"expiry": 1303563439.443086,
"include_subdomains": true,
"mode": "strict"

(I have no idea which host that is, btw.)
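For the terminally curious: Chrome reportedly builds that key by SHA-256 hashing the hostname in DNS wire format (length-prefixed labels plus a trailing zero byte) and base64-encoding the digest. A sketch, assuming that scheme holds, if you ever want to match an entry back to a host:

```python
import base64
import hashlib

def chrome_hsts_key(hostname):
    """Hash a hostname the way Chrome's TransportSecurity file reportedly
    does: SHA-256 over the DNS wire form of the name, base64-encoded."""
    wire = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.lower().split(".")
    ) + b"\x00"
    return base64.b64encode(hashlib.sha256(wire).digest()).decode("ascii")

# Compare the result against the JSON keys in the TransportSecurity file.
print(chrome_hsts_key("example.com"))
```

Compute the key for each host you suspect and grep the file for it.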

I ended up backing up the TransportSecurity file and removing all entries from it. Any site I visit that sends the HSTS header will re-establish its entry, and doing so cleared up the redirect issue. I still need to get a new certificate, but that can wait for another day.

Windows and Linux folk should be able to find that file pretty easily in their home directories if they are experiencing any similar issue. If you can’t find it, drop a note in the comments and I’ll dig out the locations.