Automating Compliance Evidence Screenshots

03 April 2026
automation, compliance, python

If you've ever worked in a company that goes through SOC2, ISO27001, or similar audits, you know the pain of collecting evidence. Screenshots of admin consoles, security configurations, access controls, all captured manually, one by one, every single time an auditor or a customer asks for them. This was a perfect fit for automation, so I built a tool to handle the whole process.

Table of Contents
The Problem
The Solution
The Code

The Problem

Audits and Customer Requests

Anyone working in security, compliance, or a related function at a company of decent size will be familiar with evidence collection. Whether it's for a SOC2 audit, ISO27001 certification, or a customer security questionnaire, at some point someone is going to ask you to prove that your security controls are actually in place. And "prove" usually means screenshots.

The requests are always something like: "can you show us that MFA is enforced?", "can you prove encryption at rest is enabled?", "show us who has admin access to production." Standard stuff. And the answer is almost always a screenshot of a console page showing the configuration.

Why Screenshots?

You might wonder why screenshots are still the go-to evidence format in 2026. Sure, you could collect evidence through APIs or terminal commands, and that would be easier to script, but the output isn't nearly as audit-friendly. A screenshot of your Okta admin console showing MFA policies is much easier to present than a JSON blob or a terminal response. Auditors and customers want visual proof that is easy to review without needing access to your systems, and at the end of the day you want to pass that audit or satisfy whatever the customer is asking, right? Screenshots are also easy to include in reports and presentations, and they show the state of a system at a specific point in time.

The Manual Pain

Here's what the process used to look like: open a browser, log into the console (with SSO and MFA of course), navigate to the right page, take a screenshot, save it with a meaningful name, repeat for the next page. Now do that for Okta, AWS, GitLab, SharePoint, and whatever else you have.

Some of these consoles have long scrollable pages, so you need to either scroll and stitch multiple screenshots together or use a browser extension like FireShot to capture the full page. Then there's the date problem: auditors want to see when the screenshot was taken, which means you need a full screen capture (not just the browser viewport) to include the date from your OS taskbar or menu bar. That doesn't play well with full-page browser screenshots where the page is longer than your screen.

You might also need to redact certain parts of the screenshots before sharing them (e.g. sensitive data that the auditor doesn't need to see), which adds yet another manual step. I haven't tackled automated redaction yet, but it's on the list.

Then do it again next quarter, or whenever a customer asks. For a single system with a few pages this is manageable, but when you have 6+ systems and 20+ evidence items to capture, it easily takes an hour or more of tedious manual work every time. And since this needs to happen repeatedly, it's the kind of task that is begging to be automated.

The Solution

I wrote a Python tool that uses Playwright to automate the whole evidence capture process. The idea is straightforward: define all the pages you need to screenshot in a YAML config file, run the script, log into your systems once, and let it capture everything for you.

How it Works

The workflow is split into two phases. In the first phase, the tool opens browser windows for all defined systems simultaneously and waits for you to complete SSO login in each one. This is the only manual step, since there's no clean way to automate SSO with MFA (and you wouldn't want to, security-wise). Once you've logged in everywhere, you confirm with a single button press and the second phase kicks in: the tool goes through all configured pages in parallel, taking full-page screenshots automatically.
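To make the second phase a bit more concrete, here is a minimal asyncio sketch of the fan-out. It is not the tool's actual code: the function names, the `max_tabs` parameter, and the `capture_one` callback (which would wrap the real browser work) are assumptions for illustration. Every system runs concurrently, and within a system only a limited number of tabs are in flight at once.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical shape: system name -> list of evidence URLs to capture.
Plan = dict[str, list[str]]
# Hypothetical callback: (system, url) -> path of the saved screenshot.
CaptureOne = Callable[[str, str], Awaitable[str]]


async def capture_system(
    system: str, urls: list[str], capture_one: CaptureOne, max_tabs: int = 3
) -> list[str]:
    """Capture every URL of one system with at most `max_tabs` in flight."""
    limit = asyncio.Semaphore(max_tabs)

    async def guarded(url: str) -> str:
        async with limit:
            return await capture_one(system, url)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(url) for url in urls))


async def capture_all(
    plan: Plan, capture_one: CaptureOne, max_tabs: int = 3
) -> dict[str, list[str]]:
    """Phase two: run all systems concurrently once logins are confirmed."""
    names = list(plan)
    results = await asyncio.gather(
        *(capture_system(name, plan[name], capture_one, max_tabs) for name in names)
    )
    return dict(zip(names, results))
```

In a real run, `capture_one` would open a tab in the already logged-in browser context for that system, so the semaphore effectively caps the number of open tabs per system.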

After the first version of the script I improved the approach quite a bit. It now opens all systems at once for login (so you're not waiting one by one), captures pages in configurable batches per system using multiple tabs, and runs all systems in parallel. It also supports importing cookies from Firefox HAR exports if you want to skip manual login entirely, with auto-detection of which system each HAR file belongs to based on the URLs inside it. You just drop all the exported HAR files in a folder, click Import All, and the tool figures out the rest, setting up sessions for each system and showing you a status list of which ones are ready. If you enable auto-import in the settings, it even picks up new HAR files on launch without you having to click anything.
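The HAR auto-detection can be pictured roughly like this. A HAR export is JSON with a `log.entries` list, and each entry carries the request URL and the cookies sent with it. The sketch below is an illustration under assumptions, not the tool's implementation: the `SYSTEM_HOSTS` mapping and both function names are hypothetical, and matching on a hostname suffix is just one plausible heuristic.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping: system name -> hostname suffix from its configured URLs.
SYSTEM_HOSTS = {
    "okta": "okta.com",
    "gitlab": "gitlab.com",
}


def detect_system(har: dict, system_hosts: dict) -> Optional[str]:
    """Return the system whose host suffix matches the most request URLs."""
    hits = Counter()
    for entry in har.get("log", {}).get("entries", []):
        host = urlparse(entry["request"]["url"]).hostname or ""
        for name, suffix in system_hosts.items():
            if host == suffix or host.endswith("." + suffix):
                hits[name] += 1
    return hits.most_common(1)[0][0] if hits else None


def extract_cookies(har: dict) -> list:
    """Collect request cookies, de-duplicated per (name, host); last value wins."""
    seen = {}
    for entry in har.get("log", {}).get("entries", []):
        request = entry["request"]
        host = urlparse(request["url"]).hostname or ""
        for cookie in request.get("cookies", []):
            seen[(cookie["name"], host)] = {
                "name": cookie["name"],
                "value": cookie["value"],
                "domain": host,
                "path": "/",
            }
    return list(seen.values())


def load_har(path: Path) -> dict:
    """Read a HAR export from disk."""
    return json.loads(path.read_text(encoding="utf-8"))
```

Dicts with `name`, `value`, `domain`, and `path` are the shape that Playwright's `context.add_cookies()` accepts, so the result could be handed to a fresh browser context to restore the session.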

For full-page screenshots the tool does some DOM manipulation to expand scrollable containers that are common in admin UIs (e.g. Okta's admin panels love to put content in inner scrollable divs). It also neutralizes sticky headers and fixed-position elements that would otherwise float over the content and mess up the screenshot. If a capture fails (e.g. the page didn't load in time), it automatically retries a configurable number of times before giving up.
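As a rough illustration of the expand-and-retry idea, here is a minimal sketch. The JavaScript heuristic, the function name, and the `retries` and `delay_s` parameters are assumptions, not the tool's real code. The `page` argument is expected to behave like a Playwright sync `Page`, using only `goto`, `evaluate`, and `screenshot(path=..., full_page=True)`.

```python
import time

# JavaScript run inside the page before the screenshot (heuristic sketch):
# - fixed/sticky elements are put back into normal document flow
# - vertically overflowing scroll containers get their height unlocked
EXPAND_JS = """
() => {
  for (const el of document.querySelectorAll('*')) {
    const style = getComputedStyle(el);
    if (style.position === 'fixed' || style.position === 'sticky') {
      el.style.position = 'static';
    }
    const scrolls = ['auto', 'scroll'].includes(style.overflowY);
    if (scrolls && el.scrollHeight > el.clientHeight) {
      el.style.overflowY = 'visible';
      el.style.height = 'auto';
      el.style.maxHeight = 'none';
    }
  }
}
""".strip()


def capture_with_retry(page, url, path, retries=2, delay_s=1.0):
    """Navigate, expand the DOM, take a full-page screenshot.

    Returns True on success, False once all attempts have failed.
    """
    for attempt in range(retries + 1):
        try:
            page.goto(url)
            page.evaluate(EXPAND_JS)
            page.screenshot(path=path, full_page=True)
            return True
        except Exception:  # e.g. a navigation timeout
            if attempt < retries:
                time.sleep(delay_s)
    return False
```

Touching every element is blunt; a real implementation would likely target known containers per system, since forcing `position: static` can rearrange some layouts.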

The screenshots below are (censored) examples of evidence from some common systems, of the kind typically requested in audits or by other companies. You'll get the idea though.

Configuration

Everything is driven by a YAML config file. You define your systems, their login URLs, and the evidence items you want to capture. Each evidence item has a URL, a file name, a description, and optional settings like custom viewport size, wait conditions, or pre-capture interactions.

The pre-capture interactions are quite useful for admin consoles that need some setup before you can screenshot them. For example, you might need to type in a search field to filter results, click a button to expand a section, or repeatedly click a "Show More" button to load all content. All of this is configurable per evidence item.
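A sketch of how such interactions might be applied before the screenshot is taken. The step format (`action`, `selector`, `value`) and the action names are hypothetical, not the tool's actual config schema. The `page` object is assumed to behave like a Playwright sync `Page`, using `fill`, `click`, and `locator`.

```python
def run_interactions(page, steps, max_repeats=20):
    """Apply a list of pre-capture steps to a Playwright-style page object."""
    for step in steps:
        action = step["action"]
        if action == "fill":
            # Type into a search or filter field.
            page.fill(step["selector"], step["value"])
        elif action == "click":
            # Click once, e.g. to expand a collapsed section.
            page.click(step["selector"])
        elif action == "click_until_gone":
            # Keep clicking e.g. a "Show More" button until it disappears,
            # capped so a button that never goes away cannot loop forever.
            for _ in range(max_repeats):
                button = page.locator(step["selector"])
                if button.count() == 0 or not button.first.is_visible():
                    break
                button.first.click()
        else:
            raise ValueError(f"unknown action: {action!r}")
```

A step list for one evidence item could then look like `[{"action": "fill", "selector": "#search", "value": "admin"}, {"action": "click_until_gone", "selector": "text=Show More"}]`.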

You can also group systems into profiles (e.g. "security", "devops", "public") so you don't have to capture everything every time. If you only need the public-facing evidence for a customer questionnaire, you just select that profile. There's also a category filter that lets you capture only evidence tagged with specific categories (e.g. just "Encryption" items across all systems).
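To illustrate how profiles and categories narrow down a run, here is a small sketch. The dictionary stands in for what a YAML config might deserialize into. Its keys, the profile contents, and the category names other than "Encryption" are assumptions for illustration, not the tool's real schema.

```python
# Hypothetical structure that a YAML config might load into.
CONFIG = {
    "profiles": {
        "security": ["okta", "aws-prod"],
        "public": ["okta"],
    },
    "systems": {
        "okta": {"evidence": [
            {"name": "okta-mfa-policy", "category": "Access Control"},
            {"name": "okta-admin-users", "category": "Access Control"},
        ]},
        "aws-prod": {"evidence": [
            {"name": "guardduty-findings", "category": "Monitoring"},
            {"name": "encryption-settings", "category": "Encryption"},
        ]},
    },
}


def select_evidence(config, profile=None, categories=None):
    """Return (system, evidence name) pairs matching the profile and categories.

    No profile means all systems; no categories means no category filtering.
    """
    systems = config["profiles"][profile] if profile else list(config["systems"])
    wanted = set(categories) if categories else None
    selected = []
    for system in systems:
        for item in config["systems"][system]["evidence"]:
            if wanted is None or item["category"] in wanted:
                selected.append((system, item["name"]))
    return selected
```

Both filters combine, so selecting the "public" profile together with the "Encryption" category returns nothing in this example, because the only encryption item lives under aws-prod.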

The GUI

While the CLI works fine, I also added a GUI because people from GRC or compliance are typically less tech-savvy and aren't going to open a terminal. The GUI makes it easier for them not just to run captures, but also to add new evidence items and maintain the configuration going forward. It has four tabs: one for running captures (with checkboxes to select which systems and individual evidence items to include), one for editing the configuration visually, one for applying timestamps, and one showing the log output.

The log tab is particularly useful as it shows real-time output during capture and has a "Continue" button that replaces the terminal's Enter key prompt for confirming logins are complete.

Timestamping

For audit evidence, it's important to show when a screenshot was taken. The tool generates a manifest.json that records the exact capture timestamp for each screenshot, and a companion script can overlay that timestamp directly on the images. The timestamp overlay is configurable (position, colors, opacity, font size) and gets placed in a separate timestamped/ folder so the originals stay untouched.
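Two small pieces of the overlay step are library-independent and easy to sketch: where the label goes for a given position setting, and where the stamped copy is written so the original stays untouched. The function names, the `position` values, and the `margin` parameter are assumptions, not the companion script's actual options.

```python
from pathlib import Path


def overlay_origin(image_size, label_size, position="bottom-right", margin=10):
    """Top-left pixel at which a label of `label_size` should be drawn.

    `position` is "<top|bottom>-<left|right>"; sizes are (width, height).
    """
    img_w, img_h = image_size
    lab_w, lab_h = label_size
    vertical, horizontal = position.split("-")
    x = margin if horizontal == "left" else img_w - lab_w - margin
    y = margin if vertical == "top" else img_h - lab_h - margin
    # Clamp so an oversized label still starts inside the image.
    return max(x, 0), max(y, 0)


def timestamped_path(run_dir: Path, original: Path) -> Path:
    """Mirror `original` under run_dir/timestamped/, keeping the system folder."""
    return run_dir / "timestamped" / original.relative_to(run_dir)
```

With Pillow, the returned coordinates would be passed to something like `ImageDraw.Draw(image).text((x, y), label, fill=color)`, and the result saved to the mirrored path.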

This is handy because auditors often want to know exactly when evidence was collected, and having the timestamp baked into the image makes it immediately visible without needing to cross-reference metadata. It also solves the full-screen screenshot problem I mentioned earlier: you don't need to capture your OS taskbar to show the date.

Output Structure

The output is organized into date-stamped folders, with subfolders per system. Each run produces a manifest.json with metadata about successful captures, and a run_log.json with the full log including any failures. The structure looks something like this:

evidence/
  2026-04-01/
    okta/
      okta-mfa-policy.png
      okta-admin-users.png
    aws-prod/
      guardduty-findings.png
      encryption-settings.png
    manifest.json
    run_log.json
    timestamped/
      okta/
        okta-mfa-policy.png
        okta-admin-users.png

The manifest is useful on its own as a machine-readable record of what was captured, when, and from which URL. If the evidence ever gets challenged, you have a clear trail.
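For a feel of what such a manifest could contain, here is a minimal sketch that records one entry per capture and writes the file into the date-stamped run folder. The field names (`system`, `file`, `url`, `captured_at`) and the top-level `captures` key are assumptions, not the tool's actual manifest format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_capture(manifest: list, system: str, name: str, url: str, now=None) -> dict:
    """Append one manifest entry with a UTC capture timestamp and return it."""
    now = now or datetime.now(timezone.utc)
    entry = {
        "system": system,
        "file": f"{system}/{name}.png",
        "url": url,
        "captured_at": now.isoformat(timespec="seconds"),
    }
    manifest.append(entry)
    return entry


def write_manifest(root: Path, manifest: list, run_date=None) -> Path:
    """Write manifest.json into root/<YYYY-MM-DD>/ and return its path."""
    run_date = run_date or datetime.now(timezone.utc).date()
    run_dir = root / run_date.isoformat()
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / "manifest.json"
    path.write_text(json.dumps({"captures": manifest}, indent=2), encoding="utf-8")
    return path
```

Storing timestamps in UTC with an explicit offset avoids ambiguity when the evidence is reviewed in a different time zone.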

The Code

The full source code is available on GitHub. You can adapt it to your own systems pretty easily, and if something isn't supported, you can hand it to Claude and ask for the changes you need. The config file is where you'll spend most of your time, defining your own targets, and the README covers all the available options.

The main dependencies are Playwright for browser automation, PyYAML for configuration, and Pillow for timestamping. Setup is just a pip install and a playwright install for the browser binaries. You can also compile it into an executable if you wish.

If you're dealing with similar compliance evidence collection, feel free to grab it and adjust the config to your needs. It's saved me a lot of repetitive work and now evidence collection takes a few minutes instead of an hour.
