Category Archives: Vulnerabilities

In the past ~4 weeks I have personally observed some irrefutable things in “AI” that are very likely going to cause massive shocks to employment models in IT, software development, systems administration, and cybersecurity. I know some have already seen minor shocks. They are nothing compared to what is very probably ahead.

Nobody likely wants to hear this, but you absolutely need to make or take time this year to identify what you can do that AI cannot do and create some of those items if your list is short or empty.

The weavers in the 1800s used violence to get a 20-year pseudo-reprieve before they were pushed into obsolescence. We’ve got ~maybe 18 months. I’m as pushback-on-this-“AI”-thing as makes sense. I’d like for the bubble to burst. Even if it does, the rulers of our clicktatorship will just fuel a quick rebuild.

Four human-only capabilities in security

In my (broad) field, I think there are some things that make humans 110% necessary. Here’s my list — and it’d be great if folks in very subdomain-specific parts of cyber would provide similar ones. I try to stay in my lane.

1. Judgment under uncertainty with real consequences

These new “AI” systems can use tools to analyze a gazillion sessions and cluster payloads, but they do not (or absolutely should not) bear responsibility for the “we’re pulling the plug on production” decision at 3am. This “weight of consequence” shapes human expertise in ways that inform intuition, risk tolerance, and the ability to act decisively with incomplete information.

Organizations will continue needing people who can own outcomes, not just produce analysis.

2. Adversarial creativity and novel problem framing

The more recent “AI” systems are actually darn good at pattern matching against known patterns and recombining existing approaches. They absolutely suck at the “genuinely novel” — the attack vector nobody has documented, the defensive technique that requires understanding how a specific organization actually operates versus how it should operate.

The best security practitioners think like attackers in ways that go beyond “here are common TTPs.”

3. Institutional knowledge and relationship capital

A yuge one.

Understanding that the finance team always ignores security warnings — especially Dave — during quarter-close. That the legacy SCADA system can’t be patched because the vendor went bankrupt in 2019. That the CISO and CTO have a long-running disagreement about cloud migration.

This context shapes what recommendations are actually actionable. Many technically correct analyses are organizationally useless.

4. The ability to build and maintain trust

The biggest one.

When a breach happens, executives don’t want a report from an “AI”. They want someone who can look them in the eye, explain what happened, and take ownership of the path forward. The human element of security leadership is absolutely not going away.

How to develop these capabilities

Develop depth in areas that require your presence or legal accountability. Think incident response, compliance attestation, or security architecture for air-gapped or classified environments. These have regulatory and practical barriers to full automation.

Build expertise in the seams between systems. Understanding how a given combination of legacy mainframe, cloud services, and OT environment actually interconnects requires the kind of institutional archaeology (or the powers of a sexton) that doesn’t exist in training data.

Get comfortable being the human in the loop. I know this will get me tapping mute or block a lot, but you’re going to need to get comfortable being the human in the loop for “AI”-augmented workflows. The analyst who can effectively direct tools, validate outputs (b/c these things will always make stuff up), and translate findings for different audiences has a different job than before but still a necessary one.

Learn to ask better questions. Bring your hypotheses, your domain expertise, and your sense of which threads are worth pulling. That editorial judgment about what matters is undervalued, and is going to take a while to infuse into “AI” systems.

We’re all John Henry now

A year ago, even with long covid brain fog, I could out-“John Henry” all of the commercial AI models at programming, cyber, and writing tasks. Both in speed and quality.

Now, with the fog gone, I’m likely ~3 months away from being slower than “AI” on a substantial number of core tasks that it can absolutely do. I’ve seen it. I’ve validated the outputs. It sucks. It really really sucks. And it’s not because I’m feeble or have some other undisclosed brain condition (unlike 47). These systems are being curated to do exactly that: erase all of us John Henrys.

The folks who thrive will be those who can figure out what “AI” capabilities aren’t complete garbage and wield them with uniquely human judgment rather than competing on tasks where “AI” has clear advantages.

The pipeline problem

The very uncomfortable truth: there will be fewer entry-level positions that consist primarily of “look at alerts and escalate.” That pipeline into the field is narrowing at a frightening pace.

What concerns me most isn’t the senior practitioners. We’ll adapt and likely become that much more effective. It’s the junior folks who won’t get the years of pattern exposure that built our intuition in the first place.

That’s a pipeline problem the industry hasn’t seriously grappled with yet — and isn’t likely to b/c of the hot, thin air in the offices and boardrooms of myopic and greedy senior executives.

ENISA published docs for their European Vulnerability Database (EUVD) — https://euvd.enisa.europa.eu/apidoc.

I’ve got an easier-on-the-eyes version that supports light/dark mode and includes sample API JSON results at https://rud.is/euvd-api/. The Quarto markdown source for it can be found at https://rud.is/euvd-api/euvd-api.qmd.

I need to make an MCP (Model Context Protocol) server for the API, but not everyone wants an MCP server, so there’s a TypeScript NPM package for it — https://www.npmjs.com/package/@hrbrmstr/euvd (source: https://codeberg.org/hrbrmstr/euvd-ts). This comes with the added benefit of making it easier/cleaner to build an MCP server. Friends don’t let friends make icky Python-based MCP servers.

I also need to integrate it into pipeline stuff at $WORK, so there’s also a Golang API wrapper & CLI @ https://codeberg.org/hrbrmstr/euvd.

READMEs in both repos have all the details.

VulnCheck has some new, free API endpoints for the cybersecurity community.

Two extremely useful ones are for their extended version of CISA’s KEV, and an in-situ replacement for NVD’s sad excuse for an API and soon-to-be-removed JSON feeds.

There are two ways to work with these APIs. One is to retrieve a “backup” of the entire dataset as a ZIP file, and the other is to use the API to retrieve individual CVEs from each “index”.

You’ll need a free API key from VulnCheck to use these APIs.

All code shown makes the assumption that you’ve stored your API key in an environment variable named VULNCHECK_API_KEY.

After the curl examples, there’s a section on a small Golang CLI I made to make it easier to get combined extended KEV and NVDv2 CVE information in one CLI call for a given CVE.

Backups

Retrieving the complete dataset is a multi-step process. First, you make a call to the backup API endpoint for the index you want to back up. That returns some JSON with a temporary, AWS pre-signed URL (a method to grant temporary access to files stored in AWS S3) to download the ZIP file. Then you download the ZIP file, and finally you extract the contents of the ZIP file into a directory. The output is different for the NVDv2 and extended KEV indexes, but the core process is the same.
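For scripted pipelines, the same three steps can be sketched in Python with only the standard library. This is a rough sketch, not VulnCheck's client: the endpoint, the `token` cookie, and the `.data[].url` layout are taken from the curl idioms in this post, while the function names are made up for illustration.

```python
import io
import json
import os
import shutil
import tempfile
import urllib.request
import zipfile

API_BACKUP = "https://api.vulncheck.com/v3/backup"  # base URL used by the curl examples


def backup_urls(payload):
    """Return every pre-signed URL in a backup response (the `.data[].url` jq filter)."""
    return [item["url"] for item in payload.get("data", [])]


def request_backup_urls(index, api_key):
    """Step 1: ask the API for the temporary download URL(s) of one index."""
    req = urllib.request.Request(
        f"{API_BACKUP}/{index}",
        headers={"Accept": "application/json", "Cookie": f"token={api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return backup_urls(json.load(resp))


def extract_zip(source, dest_dir):
    """Step 3: unpack an archive (path or binary file object) and list its members."""
    with zipfile.ZipFile(source) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())


def download_backup(index, dest_dir):
    """Run all three steps for one index, e.g. "nist-nvd2" or "vulncheck-kev"."""
    api_key = os.environ["VULNCHECK_API_KEY"]
    url = request_backup_urls(index, api_key)[0]
    with tempfile.TemporaryFile() as tmp:
        # Step 2: the pre-signed URL carries its own credentials, so no cookie is sent.
        with urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, tmp)
        tmp.seek(0)
        return extract_zip(tmp, dest_dir)
```

Calling `download_backup("vulncheck-kev", "./vckev")` would need a valid key and network access; the two helpers `backup_urls` and `extract_zip` can be exercised offline.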

NVDv2

Here’s a curl idiom for the NVDv2 index backup. The result is a directory of uncompressed JSON that’s in the same format as the NVDv2 JSON feeds.

# Grab the temporary AWS pre-signed URL for the NVDv2 index and then download the ZIP file.
curl \
  --silent \
  --output vcnvd2.zip --url "$(
    curl \
      --silent \
      --cookie "token=${VULNCHECK_API_KEY}" \
      --header 'Accept: application/json' \
      --url "https://api.vulncheck.com/v3/backup/nist-nvd2" | jq -r '.data[].url'
    )"

rm -rf ./nvd2

# unzip it
unzip -q -o -d ./nvd2 vcnvd2.zip

# uncompress the JSON files
gunzip ./nvd2/*.gz

tree ./nvd2
./nvd2
├── nvdcve-2.0-000.json
├── nvdcve-2.0-001.json
├── nvdcve-2.0-002.json
├── nvdcve-2.0-003.json
├── nvdcve-2.0-004.json
├── nvdcve-2.0-005.json
├── nvdcve-2.0-006.json
├── nvdcve-2.0-007.json
├── nvdcve-2.0-008.json
├── nvdcve-2.0-009.json
├── nvdcve-2.0-010.json
├── nvdcve-2.0-011.json
├── nvdcve-2.0-012.json
├── nvdcve-2.0-013.json
├── nvdcve-2.0-014.json
├── nvdcve-2.0-015.json
├── nvdcve-2.0-016.json
├── nvdcve-2.0-017.json
├── nvdcve-2.0-018.json
├── nvdcve-2.0-019.json
├── nvdcve-2.0-020.json
├── nvdcve-2.0-021.json
├── nvdcve-2.0-022.json
├── nvdcve-2.0-023.json
├── nvdcve-2.0-024.json
├── nvdcve-2.0-025.json
├── nvdcve-2.0-026.json
├── nvdcve-2.0-027.json
├── nvdcve-2.0-028.json
├── nvdcve-2.0-029.json
├── nvdcve-2.0-030.json
├── nvdcve-2.0-031.json
├── nvdcve-2.0-032.json
├── nvdcve-2.0-033.json
├── nvdcve-2.0-034.json
├── nvdcve-2.0-035.json
├── nvdcve-2.0-036.json
├── nvdcve-2.0-037.json
├── nvdcve-2.0-038.json
├── nvdcve-2.0-039.json
├── nvdcve-2.0-040.json
├── nvdcve-2.0-041.json
├── nvdcve-2.0-042.json
├── nvdcve-2.0-043.json
├── nvdcve-2.0-044.json
├── nvdcve-2.0-045.json
├── nvdcve-2.0-046.json
├── nvdcve-2.0-047.json
├── nvdcve-2.0-048.json
├── nvdcve-2.0-049.json
├── nvdcve-2.0-050.json
├── nvdcve-2.0-051.json
├── nvdcve-2.0-052.json
├── nvdcve-2.0-053.json
├── nvdcve-2.0-054.json
├── nvdcve-2.0-055.json
├── nvdcve-2.0-056.json
├── nvdcve-2.0-057.json
├── nvdcve-2.0-058.json
├── nvdcve-2.0-059.json
├── nvdcve-2.0-060.json
├── nvdcve-2.0-061.json
├── nvdcve-2.0-062.json
├── nvdcve-2.0-063.json
├── nvdcve-2.0-064.json
├── nvdcve-2.0-065.json
├── nvdcve-2.0-066.json
├── nvdcve-2.0-067.json
├── nvdcve-2.0-068.json
├── nvdcve-2.0-069.json
├── nvdcve-2.0-070.json
├── nvdcve-2.0-071.json
├── nvdcve-2.0-072.json
├── nvdcve-2.0-073.json
├── nvdcve-2.0-074.json
├── nvdcve-2.0-075.json
├── nvdcve-2.0-076.json
├── nvdcve-2.0-077.json
├── nvdcve-2.0-078.json
├── nvdcve-2.0-079.json
├── nvdcve-2.0-080.json
├── nvdcve-2.0-081.json
├── nvdcve-2.0-082.json
├── nvdcve-2.0-083.json
├── nvdcve-2.0-084.json
├── nvdcve-2.0-085.json
├── nvdcve-2.0-086.json
├── nvdcve-2.0-087.json
├── nvdcve-2.0-088.json
├── nvdcve-2.0-089.json
├── nvdcve-2.0-090.json
├── nvdcve-2.0-091.json
├── nvdcve-2.0-092.json
├── nvdcve-2.0-093.json
├── nvdcve-2.0-094.json
├── nvdcve-2.0-095.json
├── nvdcve-2.0-096.json
├── nvdcve-2.0-097.json
├── nvdcve-2.0-098.json
├── nvdcve-2.0-099.json
├── nvdcve-2.0-100.json
├── nvdcve-2.0-101.json
├── nvdcve-2.0-102.json
├── nvdcve-2.0-103.json
├── nvdcve-2.0-104.json
├── nvdcve-2.0-105.json
├── nvdcve-2.0-106.json
├── nvdcve-2.0-107.json
├── nvdcve-2.0-108.json
├── nvdcve-2.0-109.json
├── nvdcve-2.0-110.json
├── nvdcve-2.0-111.json
├── nvdcve-2.0-112.json
├── nvdcve-2.0-113.json
├── nvdcve-2.0-114.json
├── nvdcve-2.0-115.json
├── nvdcve-2.0-116.json
├── nvdcve-2.0-117.json
├── nvdcve-2.0-118.json
├── nvdcve-2.0-119.json
├── nvdcve-2.0-120.json
└── nvdcve-2.0-121.json

1 directory, 122 files
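If the decompression step needs to run somewhere without gunzip, it can be approximated in Python. This is a sketch that assumes, as the listing above suggests, that each archive member is a gzip file whose name ends in `.gz`; `gunzip_all` is a hypothetical helper name.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_all(directory):
    """Decompress every *.gz file in directory, removing each archive afterwards."""
    written = []
    for gz_path in sorted(Path(directory).glob("*.gz")):
        out_path = gz_path.with_suffix("")  # "x.json.gz" -> "x.json"
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_path.unlink()  # mirror gunzip, which drops the compressed file
        written.append(out_path.name)
    return written
```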

VulnCheck’s Extended KEV

Here’s a curl idiom for the extended KEV index backup. The result is a directory with a single uncompressed JSON file that’s in an extended format of what’s in the CISA KEV JSON.

# Grab the temporary AWS pre-signed URL for the extended KEV index and then download the ZIP file.
curl \
  --silent \
  --output vckev.zip --url "$(
    curl \
      --silent \
      --cookie "token=${VULNCHECK_API_KEY}" \
      --header 'Accept: application/json' \
      --url "https://api.vulncheck.com/v3/backup/vulncheck-kev" | jq -r '.data[].url'
    )"

rm -rf ./vckev

# unzip it
unzip -q -o -d ./vckev vckev.zip

tree ./vckev
./vckev
└── vulncheck_known_exploited_vulnerabilities.json

1 directory, 1 file

Retrieving Information On Individual CVEs

While there are other, searchable fields for each index, the primary use case for most of us is getting information on individual CVEs. The API calls are virtually identical, apart from the selected index.

NOTE: the examples pipe the output through jq to make the API results easier to read.

NVDv2

curl \
  --silent \
  --cookie "token=${VULNCHECK_API_KEY}" \
  --header 'Accept: application/json' \
  --url "https://api.vulncheck.com/v3/index/nist-nvd2?cve=CVE-2024-23334" | jq
{
  "_benchmark": 0.056277,
  "_meta": {
    "timestamp": "2024-03-23T08:47:17.940032202Z",
    "index": "nist-nvd2",
    "limit": 100,
    "total_documents": 1,
    "sort": "_id",
    "parameters": [
      {
        "name": "cve",
        "format": "CVE-YYYY-N{4-7}"
      },
      {
        "name": "alias"
      },
      {
        "name": "iava",
        "format": "[0-9]{4}[A-Z-0-9]+"
      },
      {
        "name": "threat_actor"
      },
      {
        "name": "mitre_id"
      },
      {
        "name": "misp_id"
      },
      {
        "name": "ransomware"
      },
      {
        "name": "botnet"
      },
      {
        "name": "published"
      },
      {
        "name": "lastModStartDate",
        "format": "YYYY-MM-DD"
      },
      {
        "name": "lastModEndDate",
        "format": "YYYY-MM-DD"
      }
    ],
    "order": "desc",
    "page": 1,
    "total_pages": 1,
    "max_pages": 6,
    "first_item": 1,
    "last_item": 1
  },
  "data": [
    {
      "id": "CVE-2024-23334",
      "sourceIdentifier": "security-advisories@github.com",
      "vulnStatus": "Modified",
      "published": "2024-01-29T23:15:08.563",
      "lastModified": "2024-02-09T03:15:09.603",
      "descriptions": [
        {
          "lang": "en",
          "value": "aiohttp is an asynchronous HTTP client/server framework for asyncio and Python. When using aiohttp as a web server and configuring static routes, it is necessary to specify the root path for static files. Additionally, the option 'follow_symlinks' can be used to determine whether to follow symbolic links outside the static root directory. When 'follow_symlinks' is set to True, there is no validation to check if reading a file is within the root directory. This can lead to directory traversal vulnerabilities, resulting in unauthorized access to arbitrary files on the system, even when symlinks are not present.  Disabling follow_symlinks and using a reverse proxy are encouraged mitigations.  Version 3.9.2 fixes this issue."
        },
        {
          "lang": "es",
          "value": "aiohttp es un framework cliente/servidor HTTP asíncrono para asyncio y Python. Cuando se utiliza aiohttp como servidor web y se configuran rutas estáticas, es necesario especificar la ruta raíz para los archivos estáticos. Además, la opción 'follow_symlinks' se puede utilizar para determinar si se deben seguir enlaces simbólicos fuera del directorio raíz estático. Cuando 'follow_symlinks' se establece en Verdadero, no hay validación para verificar si la lectura de un archivo está dentro del directorio raíz. Esto puede generar vulnerabilidades de directory traversal, lo que resulta en acceso no autorizado a archivos arbitrarios en el sistema, incluso cuando no hay enlaces simbólicos presentes. Se recomiendan como mitigaciones deshabilitar follow_symlinks y usar un proxy inverso. La versión 3.9.2 soluciona este problema."
        }
      ],
      "references": [
        {
          "url": "https://github.com/aio-libs/aiohttp/commit/1c335944d6a8b1298baf179b7c0b3069f10c514b",
          "source": "security-advisories@github.com",
          "tags": [
            "Patch"
          ]
        },
        {
          "url": "https://github.com/aio-libs/aiohttp/pull/8079",
          "source": "security-advisories@github.com",
          "tags": [
            "Patch"
          ]
        },
        {
          "url": "https://github.com/aio-libs/aiohttp/security/advisories/GHSA-5h86-8mv2-jq9f",
          "source": "security-advisories@github.com",
          "tags": [
            "Exploit",
            "Mitigation",
            "Vendor Advisory"
          ]
        },
        {
          "url": "https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/ICUOCFGTB25WUT336BZ4UNYLSZOUVKBD/",
          "source": "security-advisories@github.com"
        },
        {
          "url": "https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/XXWVZIVAYWEBHNRIILZVB3R3SDQNNAA7/",
          "source": "security-advisories@github.com",
          "tags": [
            "Mailing List"
          ]
        }
      ],
      "metrics": {
        "cvssMetricV31": [
          {
            "source": "nvd@nist.gov",
            "type": "Primary",
            "cvssData": {
              "version": "3.1",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
              "attackVector": "NETWORK",
              "attackComplexity": "LOW",
              "privilegesRequired": "NONE",
              "userInteraction": "NONE",
              "scope": "UNCHANGED",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "NONE",
              "availabilityImpact": "NONE",
              "baseScore": 7.5,
              "baseSeverity": "HIGH"
            },
            "exploitabilityScore": 3.9,
            "impactScore": 3.6
          },
          {
            "source": "security-advisories@github.com",
            "type": "Secondary",
            "cvssData": {
              "version": "3.1",
              "vectorString": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:N",
              "attackVector": "NETWORK",
              "attackComplexity": "HIGH",
              "privilegesRequired": "NONE",
              "userInteraction": "NONE",
              "scope": "UNCHANGED",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "NONE",
              "availabilityImpact": "NONE",
              "baseScore": 5.9,
              "baseSeverity": "MEDIUM"
            },
            "exploitabilityScore": 2.2,
            "impactScore": 3.6
          }
        ]
      },
      "weaknesses": [
        {
          "source": "security-advisories@github.com",
          "type": "Primary",
          "description": [
            {
              "lang": "en",
              "value": "CWE-22"
            }
          ]
        }
      ],
      "configurations": [
        {
          "nodes": [
            {
              "operator": "OR",
              "cpeMatch": [
                {
                  "vulnerable": true,
                  "criteria": "cpe:2.3:a:aiohttp:aiohttp:*:*:*:*:*:*:*:*",
                  "versionStartIncluding": "1.0.5",
                  "versionEndExcluding": "3.9.2",
                  "matchCriteriaId": "CC18B2A9-9D80-4A6E-94E7-8FC010D8FC70"
                }
              ]
            }
          ]
        },
        {
          "nodes": [
            {
              "operator": "OR",
              "cpeMatch": [
                {
                  "vulnerable": true,
                  "criteria": "cpe:2.3:o:fedoraproject:fedora:39:*:*:*:*:*:*:*",
                  "matchCriteriaId": "B8EDB836-4E6A-4B71-B9B2-AA3E03E0F646"
                }
              ]
            }
          ]
        }
      ],
      "_timestamp": "2024-02-09T05:33:33.170054Z"
    }
  ]
}
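Most of that response is only needed occasionally. A small Python sketch like the following pulls out just the CVSS v3.1 scores, assuming the field layout shown in the output above; `SAMPLE_NVD` is a hand-trimmed copy of that output and `cvss_summary` is a hypothetical helper name.

```python
def cvss_summary(response):
    """Collect (CVE id, source, base score, severity) for each CVSS v3.1 entry."""
    rows = []
    for record in response.get("data", []):
        for metric in record.get("metrics", {}).get("cvssMetricV31", []):
            cvss = metric["cvssData"]
            rows.append(
                (record["id"], metric["source"], cvss["baseScore"], cvss["baseSeverity"])
            )
    return rows


# Trimmed copy of the response shown above, keeping only the fields used here.
SAMPLE_NVD = {
    "data": [
        {
            "id": "CVE-2024-23334",
            "metrics": {
                "cvssMetricV31": [
                    {
                        "source": "nvd@nist.gov",
                        "cvssData": {"baseScore": 7.5, "baseSeverity": "HIGH"},
                    },
                    {
                        "source": "security-advisories@github.com",
                        "cvssData": {"baseScore": 5.9, "baseSeverity": "MEDIUM"},
                    },
                ]
            },
        }
    ]
}

if __name__ == "__main__":
    for row in cvss_summary(SAMPLE_NVD):
        print(*row, sep="\t")
```

Seeing both scores side by side is useful here because NVD and the GitHub advisory disagree on attack complexity, and therefore on severity.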

VulnCheck’s Extended KEV

curl \
  --silent \
  --cookie "token=${VULNCHECK_API_KEY}" \
  --header 'Accept: application/json' \
  --url "https://api.vulncheck.com/v3/index/vulncheck-kev?cve=CVE-2024-23334" | jq
{
  "_benchmark": 0.328855,
  "_meta": {
    "timestamp": "2024-03-23T08:47:41.025967418Z",
    "index": "vulncheck-kev",
    "limit": 100,
    "total_documents": 1,
    "sort": "_id",
    "parameters": [
      {
        "name": "cve",
        "format": "CVE-YYYY-N{4-7}"
      },
      {
        "name": "alias"
      },
      {
        "name": "iava",
        "format": "[0-9]{4}[A-Z-0-9]+"
      },
      {
        "name": "threat_actor"
      },
      {
        "name": "mitre_id"
      },
      {
        "name": "misp_id"
      },
      {
        "name": "ransomware"
      },
      {
        "name": "botnet"
      },
      {
        "name": "published"
      },
      {
        "name": "lastModStartDate",
        "format": "YYYY-MM-DD"
      },
      {
        "name": "lastModEndDate",
        "format": "YYYY-MM-DD"
      },
      {
        "name": "pubStartDate",
        "format": "YYYY-MM-DD"
      },
      {
        "name": "pubEndDate",
        "format": "YYYY-MM-DD"
      }
    ],
    "order": "desc",
    "page": 1,
    "total_pages": 1,
    "max_pages": 6,
    "first_item": 1,
    "last_item": 1
  },
  "data": [
    {
      "vendorProject": "aiohttp",
      "product": "aiohttp",
      "shortDescription": "aiohttp is an asynchronous HTTP client/server framework for asyncio and Python. When using aiohttp as a web server and configuring static routes, it is necessary to specify the root path for static files. Additionally, the option 'follow_symlinks' can be used to determine whether to follow symbolic links outside the static root directory. When 'follow_symlinks' is set to True, there is no validation to check if reading a file is within the root directory. This can lead to directory traversal vulnerabilities, resulting in unauthorized access to arbitrary files on the system, even when symlinks are not present.  Disabling follow_symlinks and using a reverse proxy are encouraged mitigations.  Version 3.9.2 fixes this issue.",
      "vulnerabilityName": "aiohttp aiohttp Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')",
      "required_action": "Apply remediations or mitigations per vendor instructions or discontinue use of the product if remediation or mitigations are unavailable.",
      "knownRansomwareCampaignUse": "Known",
      "cve": [
        "CVE-2024-23334"
      ],
      "vulncheck_xdb": [
        {
          "xdb_id": "231b48941355",
          "xdb_url": "https://vulncheck.com/xdb/231b48941355",
          "date_added": "2024-02-28T22:30:21Z",
          "exploit_type": "infoleak",
          "clone_ssh_url": "git@github.com:ox1111/CVE-2024-23334.git"
        },
        {
          "xdb_id": "f1d001911304",
          "xdb_url": "https://vulncheck.com/xdb/f1d001911304",
          "date_added": "2024-03-19T16:28:56Z",
          "exploit_type": "infoleak",
          "clone_ssh_url": "git@github.com:jhonnybonny/CVE-2024-23334.git"
        }
      ],
      "vulncheck_reported_exploitation": [
        {
          "url": "https://cyble.com/blog/cgsi-probes-shadowsyndicate-groups-possible-exploitation-of-aiohttp-vulnerability-cve-2024-23334/",
          "date_added": "2024-03-15T00:00:00Z"
        }
      ],
      "date_added": "2024-03-15T00:00:00Z",
      "_timestamp": "2024-03-23T08:27:47.861266Z"
    }
  ]
}
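The fields that set the extended KEV apart are the exploitation evidence ones. As a sketch, assuming the field names shown in the output above, this Python helper reduces a response to that evidence; `kev_evidence` and the trimmed `SAMPLE_KEV` are illustrative, not part of any VulnCheck tooling.

```python
def kev_evidence(response):
    """Summarise exploitation evidence for each record in an extended KEV response."""
    summaries = []
    for record in response.get("data", []):
        summaries.append(
            {
                "cve": record.get("cve", []),
                "ransomware": record.get("knownRansomwareCampaignUse"),
                "date_added": record.get("date_added"),
                "exploit_repos": [
                    x["clone_ssh_url"] for x in record.get("vulncheck_xdb", [])
                ],
                "reports": [
                    r["url"] for r in record.get("vulncheck_reported_exploitation", [])
                ],
            }
        )
    return summaries


# Trimmed copy of the response shown above.
SAMPLE_KEV = {
    "data": [
        {
            "knownRansomwareCampaignUse": "Known",
            "cve": ["CVE-2024-23334"],
            "vulncheck_xdb": [
                {"clone_ssh_url": "git@github.com:ox1111/CVE-2024-23334.git"},
                {"clone_ssh_url": "git@github.com:jhonnybonny/CVE-2024-23334.git"},
            ],
            "vulncheck_reported_exploitation": [
                {
                    "url": "https://cyble.com/blog/cgsi-probes-shadowsyndicate-groups-possible-exploitation-of-aiohttp-vulnerability-cve-2024-23334/"
                }
            ],
            "date_added": "2024-03-15T00:00:00Z",
        }
    ]
}

if __name__ == "__main__":
    print(kev_evidence(SAMPLE_KEV))
```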

vccve

There’s a project on Codeberg that has code and binaries for macOS, Linux, and Windows for a small CLI that gets you combined extended KEV and NVDv2 information all in one call.

The project README has examples and installation instructions.
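To give a feel for what "combined in one call" means, here is a Python sketch of the idea. It is not the vccve source (that is Go, and its actual behaviour is described in the README); the function names are hypothetical, and it assumes that a CVE missing from an index comes back with an empty or absent `data` array.

```python
import json
import os
import urllib.parse
import urllib.request

API_INDEX = "https://api.vulncheck.com/v3/index"  # base URL from the curl examples


def index_url(index, cve):
    """Build the per-CVE lookup URL for one index."""
    query = urllib.parse.urlencode({"cve": cve})
    return f"{API_INDEX}/{index}?{query}"


def fetch_index(index, cve, api_key):
    """Perform one lookup and return the decoded JSON response."""
    req = urllib.request.Request(
        index_url(index, cve),
        headers={"Accept": "application/json", "Cookie": f"token={api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def combine(cve, nvd_response, kev_response):
    """Merge the first record from each response into one dictionary."""
    nvd = (nvd_response.get("data") or [None])[0]
    kev = (kev_response.get("data") or [None])[0]
    return {"cve": cve, "in_extended_kev": kev is not None, "nvd": nvd, "kev": kev}


def lookup(cve):
    """Query both indexes for one CVE (needs VULNCHECK_API_KEY and network access)."""
    key = os.environ["VULNCHECK_API_KEY"]
    return combine(
        cve,
        fetch_index("nist-nvd2", cve, key),
        fetch_index("vulncheck-kev", cve, key),
    )
```

Keeping `combine` separate from the network calls means the merge logic can be checked against saved responses without spending API quota.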

Folks may debate the merits of the SHODAN tool, but in my opinion it’s a valuable resource, especially if used for “good”. What is SHODAN? I think ThreatPost summed it up nicely:

“Shodan is a Web based search engine that discovers Internet facing computers, including desktops, servers and routers. The engine, created by programmer John Matherly, allows users to filter searches for systems running a specific type of application (say, Apache Web servers or FTP) and filter results by geographic region. The search engine indexes host ’banners,’ which include meta-data sent between a server and client and includes information such as the type of software run, what services are available and so on.”

I’m in R quite a bit these days and thought it would be useful to have access to the SHODAN API in R. I have a very rudimentary version of the API (search only) up on github which can be integrated into your R environment thus:

library(devtools)
install_github("hrbrmstr/Rshodan")
library(shodan)
help(shodan) # you don't really need to do this cmd

It’ll eventually be in CRAN, but I have some cleanup work to do before the maintainers will accept the submission. If you are new to R, there are a slew of dependencies you’ll need to add to the base R installation. Here’s a good description of how to do that on pretty much every platform.

After I tweeted the above reference, @shawnmer asked the following:

https://twitter.com/shawnmer/status/290904140782137344

That is not an unreasonable request, especially if one is new to R (or SHODAN). I had been working on this post and a more extended example, and was finally able to get enough code done to warrant publishing it. You can do far more in R than these simple charts & graphs. Imagine taking data from multiple searches, either across time or across ports, and doing a statistical comparison. Or, use some of the image processing & recognition libraries within R as well as a package such as RCurl to fetch images from open webcams and attempt to identify people or objects. The following should be enough for most folks to get started.

You can cut/paste the source code here or download the whole source file.

The fundamental shortcut this library provides over just trying to code it yourself is taking the JSON response from SHODAN and turning it into an R data frame. That is not as overtly trivial as you might think and you may want to look at the source code for the library to see where I grabbed some of that code from. I’m also not 100% convinced it’s going to work under all circumstances (hence part of the 0.1 status).
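The awkward part is that records in a JSON result rarely share the same set of fields, while a data frame wants every row to have every column. As a language-neutral illustration (in Python rather than R, and not the library's actual code), the sketch below assumes a result with a `matches` list of flat records and uses the column names that appear in the R examples in this post.

```python
# Columns used by the R examples in this post; a real result carries more fields.
COLUMNS = ("os", "country_code", "country_name", "latitude", "longitude")


def matches_to_rows(result, columns=COLUMNS):
    """Turn result["matches"] into uniform rows, filling absent fields with None."""
    return [
        {col: match.get(col) for col in columns}
        for match in result.get("matches", [])
    ]


if __name__ == "__main__":
    # Hypothetical, hand-written records: the second one lacks most fields.
    demo = {
        "matches": [
            {"os": "Linux 2.4.x", "country_code": "US", "country_name": "United States",
             "latitude": 38.0, "longitude": -97.0},
            {"country_code": "PL"},
        ]
    }
    for row in matches_to_rows(demo):
        print(row)
```

Every row ends up with the same keys, which is the property a data frame constructor relies on; the missing values play the role of R's NA.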

library(shodan)
library(plyr) # provides ddply(), used for the summaries below
library(ggplot2)
library(xtable)
library(maps)
library(rworldmap)
library(ggthemes)
 
 
# if you're behind a proxy, setting this will help
# but it's strongly suggested/encouraged that you stick the values in a file and 
# read them in vs paste them in a script
# options(RCurlOptions = list(proxy="proxyhost", proxyuserpwd="user:pass"))
 
setSHODANKey("~/.shodankey")
 
# query example taken from Michael “theprez98” Schearer's DEFCON 18 presentation
# https://www.defcon.org/images/defcon-18/dc-18-presentations/Schearer/DEFCON-18-Schearer-SHODAN.pdf
 
# find all Cisco IOS devices that may have an unauthenticated admin login
# setting trace to be TRUE to see the progress of the query
result = SHODANQuery(query="cisco last-modified www-authenticate",trace=TRUE)
 
#find the first 100 found memcached instances
#result = SHODANQuery(query='port:11211',limit=100,trace=TRUE)
 
df = result$matches
 
# aggregate result by operating system
# you can use this one if you want to filter out NA's completely
#df.summary.by.os = ddply(df, .(os), summarise, N=sum(as.numeric(factor(os))))
#this one provides count of NA's (i.e. unidentified systems)
df.summary.by.os = ddply(df, .(os), summarise, N=length(os))
 
# sort & see the results in a text table
df.summary.by.os = transform(df.summary.by.os, os = reorder(os, -N))
df.summary.by.os

That will yield:

##                 os   N
## 1      Linux 2.4.x  60
## 2      Linux 2.6.x   6
## 3 Linux recent 2.4   2
## 4     Windows 2000   2
## 5   Windows 7 or 8  10
## 6       Windows XP   8
## 7             <NA> 112
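The choice between counting unidentified systems and dropping them changes the picture a lot (112 of the 200 rows above have no OS). The same choice can be sketched outside R; the Python below is an illustration with a hypothetical `count_by` helper, where `None` stands in for NA.

```python
from collections import Counter


def count_by(rows, field, keep_missing=True):
    """Count rows per value of field, largest first; None stands in for R's NA."""
    counts = Counter(
        row.get(field)
        for row in rows
        if keep_missing or row.get(field) is not None
    )
    return sorted(counts.items(), key=lambda item: -item[1])


if __name__ == "__main__":
    # Hand-made rows, not real SHODAN data.
    rows = [
        {"os": "Linux 2.4.x"},
        {"os": None},
        {"os": None},
        {"os": "Windows XP"},
        {"os": "Linux 2.4.x"},
        {"os": None},
    ]
    print(count_by(rows, "os"))
    print(count_by(rows, "os", keep_missing=False))
```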

You can plot it with:

# plot a bar chart of them
(ggplot(df.summary.by.os,aes(x=os,y=N,fill=os)) + 
   geom_bar(stat="identity") + 
   theme_few() +
   labs(y="Count",title="SHODAN Search Results by OS"))

to yield:

[Figure: bar chart, “SHODAN Search Results by OS”]

and:

world = map_data("world")
(ggplot() +
   geom_polygon(data=world, aes(x=long, y=lat, group=group)) +
   geom_point(data=df, aes(x=longitude, y=latitude), colour="#EE760033",size=1.75) +
   labs(x="",y="") +
   theme_few())

[Figure: world map with one point per SHODAN search result]

You can easily do the same by country:

# sort & view the results by country
# see above if you don't want to filter out NA's
df.summary.by.country_code = ddply(df, .(country_code, country_name), summarise, N=sum(!is.na(country_code)))
df.summary.by.country_code = transform(df.summary.by.country_code, country_code = reorder(country_code, -N))
 
df.summary.by.country_code
##    country_code              country_name  N
## 1            AR                 Argentina  2
## 2            AT                   Austria  2
## 3            AU                 Australia  2
## 4            BE                   Belgium  2
## 5            BN         Brunei Darussalam  2
## 6            BR                    Brazil 14
## 7            CA                    Canada 16
## 8            CN                     China  6
## 9            CO                  Colombia  4
## 10           CZ            Czech Republic  2
## 11           DE                   Germany 12
## 12           EE                   Estonia  4
## 13           ES                     Spain  4
## 14           FR                    France 10
## 15           HK                 Hong Kong  2
## 16           HU                   Hungary  2
## 17           IN                     India 10
## 18           IR Iran, Islamic Republic of  4
## 19           IT                     Italy  4
## 20           LV                    Latvia  4
## 21           MX                    Mexico  2
## 22           PK                  Pakistan  4
## 23           PL                    Poland 16
## 24           RU        Russian Federation 14
## 25           SG                 Singapore  2
## 26           SK                  Slovakia  2
## 27           TW                    Taiwan  6
## 28           UA                   Ukraine  2
## 29           US             United States 28
## 30           VE                 Venezuela  2
## 31         <NA>                      <NA>  0

(ggplot(df.summary.by.country_code,aes(x=country_code,y=N)) + 
  geom_bar(stat="identity") +
  theme_few() +
  labs(y="Count",x="Country",title="SHODAN Search Results by Country"))

[Figure: bar chart, “SHODAN Search Results by Country”]

And, easily generate the must-have choropleth:

# except make a choropleth
# using the very simple rworldmap process
shodanChoropleth = joinCountryData2Map( df.summary.by.country_code, joinCode = "ISO2", nameJoinColumn = "country_code")
par(mai=c(0,0,0.2,0),xaxs="i",yaxs="i")
mapCountryData(shodanChoropleth, nameColumnToPlot="N",colourPalette="terrain",catMethod="fixedWidth")

[Figure: choropleth of SHODAN search results by country]

Again, producing pretty pictures is all well-and-good, but it’s best to start with some good questions you need answering to make any visualization worthwhile. In the coming weeks, I’ll do some posts that show what types of questions you may want to ask/answer with R & SHODAN.

I encourage folks that have issues, concerns or requests to use github vs post in the comments, but I’ll try to respond to either as quickly as possible.

I’m not sure why I never did this earlier, but a post on LifeHacker gave me an idea to add location bar quick search of CVEs (Common Vulnerabilities and Exposures), no doubt due to their example on searching LifeHacker for “security”.

My two favorite sites for searching CVE specifics are, at present, Risk IO’s and CVE Details.

I’m fairly certain anyone in security reading this can figure out the rest, but as I’m ever a slave to minutiae, here are the two shortcuts I’ve set up in Chrome:

Title: CVE Details
Search URL: http://cvedetails.com/cve-details.php?cve_id=%s
Shortcut: cved

Title: Risk I/O Vulnerability Search
Search URL: https://db.risk.io/?q=%s
Shortcut: cvedb
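Under the hood, a search shortcut is just template substitution: the browser swaps `%s` for whatever you typed, URL-encoded. A small Python sketch of that behaviour follows; the two templates are the ones above, and `expand` is a hypothetical helper, not anything Chrome exposes.

```python
from urllib.parse import quote

# The two shortcut keywords and search URL templates listed above.
SHORTCUTS = {
    "cved": "http://cvedetails.com/cve-details.php?cve_id=%s",
    "cvedb": "https://db.risk.io/?q=%s",
}


def expand(shortcut, query):
    """Substitute the URL-encoded query for %s in the shortcut's template."""
    return SHORTCUTS[shortcut].replace("%s", quote(query, safe=""))


if __name__ == "__main__":
    print(expand("cved", "CVE-2012-4774"))
    print(expand("cvedb", "2012-4774"))
```

The same table could hold any other security-related search that accepts its query in the URL.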

Here’s what the location bar changes to when I use cvedb to search for 2012‑4774:

[Screenshot: Chrome location bar showing the cvedb search shortcut]

In reality, this is only saving a scroll and a click since entering CVE‑2012‑4774 into an unoptimized location bar would have just searched Google and given me most of the usual suspects in the first few links. Still, it saves some time and immediately gets me the vulnerability data from the sites I prefer.

I may start poking to see what other security-related searches I can set up in the location bar.

I’m on a “three things” motif for 2012, as it’s really difficult for most folks to focus on more than three core elements well. This is especially true for web developers as they have so much to contend with on a daily basis, whether it be new features, bug reports, user help requests or just ensuring proper caffeine levels are maintained.

In 2011, web sites took more hits than they ever have and, sadly, most attacks could have been prevented. I fear that the pastings will continue in 2012, but there are some steps you can take to help make your site less of a target.

Bookmark & Use OWASP’s Web Site Regularly

I’d feel a little sorry for hacked web sites if it weren’t for resources like OWASP, tools like IronBee and principles like Rugged being in abundance, with many smart folks associated with them being more than willing to offer counsel and advice.

If you run a web site or develop web applications and have not inhaled all the information OWASP has to provide, then you are engaging in the Internet equivalent of driving a Ford Pinto (the exploding kind) without seat belts, airbags, doors and a working dashboard console. There is so much good information and advice out there with solid examples that prove some truly effective security measures can really be implemented in a single line of code.

Make it a point to read, re-read and keep-up-to-date on new articles and resources that OWASP provides. I know you also need to beat the competition to new features and crank out “x” lines of code per day, but you also need to do what it takes to avoid joining the ranks of those in DataLossDB.

Patch & Properly Configure Your Bootstrap Components

Your web app uses frameworks, runs in some type of web container and sits on top of an operating system. Unfortunately, vulnerabilities pop up in each of those components from time to time and you need to keep on top of those and determine which ones you will patch and when. Sites like Secunia and US-CERT aggregate patch information pretty well for operating systems and popular server software components, but it’s best to also subscribe to release and security mailing lists for your frameworks and other bootstrap components.

Configuring your bootstrap environment securely is also important and you can use handy guides over at the Center for Internet Security and the National Vulnerability Database (which is also good for vulnerability reports). The good news is that you probably only need to double-check this a couple times a year and can also integrate secure configuration baselines into tools like Chef & Puppet.

Secure Data Appropriately

I won’t belabor this point (especially if you promise to read the OWASP guidance on this thoroughly) but you need to look at the data being stored and how it is accessed and determine the most appropriate way to secure it. Don’t store more than you absolutely need to. Protect password fields with a salted, deliberately slow password hash rather than a plain MD5 hash, and encrypt other sensitive data. Don’t store any credit card numbers (really, just don’t) or tokenize them if you do (but you really don’t). Keep data off the front-end environment and watch the database and application logs with a service like Loggly (to see if there’s anything fishy going on).
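To make the password advice concrete, here is a minimal Python sketch of salted, iterated hashing using PBKDF2 from the standard library. PBKDF2 is my choice for illustration, since the advice above only says to go beyond plain MD5. The iteration count and the helper names `hash_password` and `verify_password` are assumptions, so check current OWASP guidance before settling on parameters.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative work factor (an assumption); tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        # A fresh random salt per password defeats precomputed tables.
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    salt, stored = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, stored))
    print(verify_password("wrong guess", salt, stored))
```

Store the salt alongside the digest. The salt is not secret; its job is to make identical passwords hash differently.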

I’m going to cheat and close with a fourth resolution for you: Create (and test) a data breach response plan. If any security professional is being honest, it’s virtually impossible to prevent a breach if a hacker is determined enough and the best thing you can do for your user base is to respond well when it happens. The only way to do that is to have a plan and to test it (so you know what you are doing when the breach occurs). And, you should run your communications plan by other folks to make sure it’s adequate (ping @securitytwits for suggestions for good resources).

You want to be able to walk away from a breach with your reputation as intact as possible (so you’ll have to keep the other three resolutions anyway) with your users feeling fully informed and assured that you did everything you could to prevent it.

What other security-related resolutions are you making this year as a web developer or web site owner and what other tools/services are you using to secure your sites?

NOTE: This is a re-post from a topic I started on the SecurityMetrics & SIRA mailing lists. Wanted to broaden the discussion to anyone not on those (and, why aren’t you on them?)

I had not heard the term micromort prior to listening to David Spiegelhalter’s Do Lecture and the concept of it really stuck in my (albeit thick) head all week.

I didn’t grab the paper yet, but the abstract for “Microrisks for Medical Decision Analysis” seems to be able to extrapolate directly to the risks we face in infosec:

“Many would agree on the need to inform patients about the risks of medical conditions or treatments and to consider those risks in making medical decisions. The question is how to describe the risks and how to balance them with other factors in arriving at a decision. In this article, we present the thesis that part of the answer lies in defining an appropriate scale for risks that are often quite small. We propose that a convenient unit in which to measure most medical risks is the microprobability, a probability of 1 in 1 million. When the risk consequence is death, we can define a micromort as one microprobability of death. Medical risks can be placed in perspective by noting that we live in a society where people face about 270 micromorts per year from interactions with motor vehicles.

Continuing risks or hazards, such as are posed by following unhealthful practices or by the side-effects of drugs, can be described in the same micromort framework. If the consequence is not death, but some other serious consequence like blindness or amputation, the microrisk structure can be used to characterize the probability of disability.

Once the risks are described in the microrisk form, they can be evaluated in terms of the patient’s willingness-to-pay to avoid them. The suggested procedure is illustrated in the case of a woman facing a cranial arteriogram of a suspected arterio-venous malformation. Generic curves allow such analyses to be performed approximately in terms of the patient’s sex, age, and economic situation. More detailed analyses can be performed if desired.

Microrisk analysis is based on the proposition that precision in language permits the soundness of thought that produces clarity of action and peace of mind.”

When my CC is handy and I feel like giving up some privacy I’ll grab the whole paper, but the correlations seem pretty clear from just that bit.

I must have missed Schneier’s blog post about it earlier this month where he links to understandinguncertainty.org/micromorts which links to plus.maths.org/content/os/issue55/features/risk/index (apologies for the link leapfrogging, but it provides background context that I did not have prior).

At a risk to my credibility, I’ll add another link to a Wikipedia article that lists some actual micromorts and include a small sample here:

Risks that increase the annual death risk by one micromort, and their associated cause of death:

  • smoking 1.4 cigarettes (cancer, heart disease)
  • drinking 0.5 liter of wine (cirrhosis of the liver)
  • spending 1 hour in a coal mine (black lung disease)
  • spending 3 hours in a coal mine (accident)
  • living 2 days in New York or Boston (air pollution)

I asked on Twitter if anyone thought we had an equivalent – a “micropwn”, say – for our discipline. Do we have enough high level data to produce a generic micropwn for something like:

  • 1 micropwn for every 3 consecutive days of missed DAT updates
  • 1 micropwn for every 10 Windows desktops with users with local Administrator privileges
  • 1 micropwn for every 5 consecutive days of missed IDS/IDP signature updates
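If rates like those existed, the bookkeeping would be simple. Here is a hedged Python sketch that treats one micropwn as a one-in-a-million chance of compromise, by analogy with the micromort. The three rates are the made-up examples from the list above, not measured data. The linear scaling, the additive combination and the names `micropwns` and `compromise_probability` are my own assumptions.

```python
MICRO = 1e-6  # one micropwn = a one-in-a-million chance of compromise

# Exposure needed to accrue one micropwn. These are the illustrative,
# made-up rates from the list above, NOT measured data.
EXPOSURE_PER_MICROPWN = {
    "missed_dat_days": 3,        # days of missed DAT updates
    "local_admin_desktops": 10,  # Windows desktops with local Administrator users
    "missed_ids_sig_days": 5,    # days of missed IDS/IDP signature updates
}


def micropwns(exposure: dict) -> float:
    """Total micropwns for an exposure profile.

    Assumes each factor scales linearly and that factors simply add,
    which is only a reasonable approximation while the risks stay small.
    """
    return sum(
        amount / EXPOSURE_PER_MICROPWN[factor]
        for factor, amount in exposure.items()
    )


def compromise_probability(total_micropwns: float) -> float:
    """Convert a micropwn total into a plain probability."""
    return total_micropwns * MICRO


if __name__ == "__main__":
    # Hypothetical environment, purely for illustration.
    profile = {
        "missed_dat_days": 6,
        "local_admin_desktops": 250,
        "missed_ids_sig_days": 10,
    }
    total = micropwns(profile)
    print(f"{total:.1f} micropwns, probability {compromise_probability(total):.6f}")
```

The hard part is not the arithmetic but populating that rate table with numbers backed by real data, which is exactly the question being asked.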

Just like with the medical side of things, the micropwn calculation can be increased depending on the level of detail. For example (these are all made up for medicine):

  • 1 micromort for smoking 0.5 cigarettes if you are an overweight man in his 50s
  • 1 micromort for smoking 0.25 cigarettes if you are an overweight man in his 50s with a family genetic history of lung cancer

(again, I don’t have the paper, but the abstract seems to suggest this is how medical micromorts work)

Similarly, the micropwn calculation could get more granular by factoring in type of industry, geographic locations, breach history, etc.

Also, a micropwn (just like a micromort) doesn’t necessarily mean a “catastrophic” breach (I dislike that word, as I think of it as a broad term while most folks associate it directly with sensitive record loss). In my view, it could mean a successful malware infection.

So, to further refine the question I originally posed on Twitter: Do we have enough broad data to provide input for micropwn calculations and can we define a starter-list of micropwns that would prove valuable in helping articulate risk within and outside our discipline?

UPDATE – 2011-02-26: Alfonso has posted his slides and BeeWise is open!

Speaker: Alfonso De Gregorio

How do we build a future in software security?

 

/me: the slides that will be posted have a ton of detail that Alfonso sped through. you’ll get a very good feel from them

 

Metrics are the servants of risk management and RM is about making decisions

we have incomplete information about # & severity of vulns

software products are highly defective and have no accountability

 

Bugs & Carrots

discussion around what software vendors are incented to do/why

features > security

bug fix > vuln fix

time to market > test/verify

 

M&Ms

(Markets & Metrics)

we need to put a cost on the software flaws with laws/regs & change in liability models

create feedback mechanisms (/me: open group work on security architecture?)

 

investment metrics to-date have challenges, especially in severity and probability of events

market-based metrics would provide a different context (e.g. stock market pricing)

create an infosec security market?

  • bug challenges
  • auctions
  • vuln brokers
  • infosec insurance
  • exploit derivatives

 

info function / incentive function / risk balancing function efficiency – all factors in creating a vulnerability market

/me: make a table with bullets above as rows and factors list as columns to do a comparison

suggests an Exploit Derivatives market (futures contracts for vulns)

[side-talk: discussion about derivatives vs futures and how the profit incentives may be conflicting]

[side-talk: what will make software companies pay attention to what seems to be a market that only makes speculators rich?]

[side-talk: is this legal? can we get this baked into contracts?]

[side-talk: degraded convo down to responsibility of software companies]

[side-talk: interesting analogy to the airline industry needing to be in the oil futures market to software companies needing to be in this potential vuln/exploit market]

another example is weather derivatives

 

cites two examples of how prediction markets can incent change

cites tradesports.com and a FIFA prediction market