From Spooky Ambitions to Practical Lessons: Overwhelming Animatronics Powered by Local VLM


The dream was simple enough: an AI-powered Halloween skeleton, affectionately dubbed “Skelly,” greeting trick-or-treaters with personalized welcomes based on their costumes. The reality, as often happens in the world of rapid prototyping and ambitious side projects, proved… more complicated. This post details the lessons learned from a somewhat chaotic Halloween night deployment, focusing on the security and privacy implications inherent in edge AI systems like Skelly, and outlining strategies for a more controlled – and successful – iteration next year. We’ll dive into the design choices, the unexpected challenges, and how leveraging local Vision Language Models (VLMs) can be a powerful tool for privacy-focused applications.

The Initial Vision: A Local AI Halloween Greeter

The core concept revolved around a Radxa Zero 3W, a connected USB webcam, a speaker driven by a MAX98357A mono amplifier, and the animatronics of a pre-built Halloween skeleton. The plan was to capture images, feed them into an offline VLM served through LM Studio (running on an AMD Strix Halo platform), analyze the costumes with Google’s gemma-3-27B, and generate a custom greeting delivered via text-to-speech (TTS) using PiperTTS. The original inspiration came from Alex Volkov’s work at Weights & Biases, which used a similar setup built on Google AI Studio, ElevenLabs, Cartesia, and ChatGPT.

I opted for a fully offline approach to prioritize privacy. Capturing images that include children requires careful consideration, and sending that data to external APIs introduces significant risks. Local processing eliminates those concerns, albeit at the cost of increased complexity in model management and resource requirements.

The Halloween Night Reality: Overwhelmed by the Queue

The biggest issue wasn’t technical – it was human. We anticipated a trickle of small groups, perhaps one to three treaters approaching Skelly at a time, uttering a polite “trick or treat.” Instead, we were met with waves of ten-plus children lining up like attendees at a concert. The system simply couldn’t handle the rapid influx.

The manual trigger approach – snapping pictures on demand – quickly became unsustainable. We struggled to process images fast enough before the next wave arrived. Privacy concerns also escalated as we attempted manual intervention, leading us to abandon the effort and join our kids in traditional trick-or-treating. The lack of good reproducible artifacts was a direct consequence of these issues; we were too busy firefighting to collect meaningful data.

Security Considerations: A Deep Dive into Edge AI Risks

This experience highlighted several critical risk considerations for edge AI deployments, particularly those involving physical interaction and potentially sensitive data like images of children:

  • Data Capture & Storage: Even with offline processing, the captured images represent a potential privacy breach if compromised. Secure storage is paramount – encryption at rest and in transit (even locally) is essential. Consider minimizing image retention time or implementing automated deletion policies.
  • Model Integrity: The VLM itself could be targeted. A malicious actor gaining access to the system could potentially replace the model with one that generates inappropriate responses or exfiltrates data. Model signing and verification are crucial.
  • GPIO Control & Physical Access: The Radxa Zero 3W’s GPIO pins, controlling the animatronics, represent a physical attack vector. Unrestricted access to these pins or the network could allow an attacker to manipulate Skelly in unintended ways.
  • Network Exposure (Even Offline): While we aimed for complete offline operation, the system still had network connectivity for initial model downloads and updates. This creates a potential entry point for attackers.

Reimagining Skelly: Controlling the Chaos

Next year’s iteration will focus on mitigating these risks through a combination of controlled interactions, robust security measures, and optimized processing. Here’s the plan:

1. Photo Booth Mode: Abandoning the “ambush” approach in favor of a dedicated photo booth setup. A backdrop and clear visual cues will encourage people to interact with Skelly in a more predictable manner.

2. Motion-Triggered Capture: Replacing voice activation with a motion sensor. This provides a consistent trigger mechanism, allowing us to time image capture and processing effectively.

3. Timing & Rate Limiting: Implementing strict timing controls to prevent overwhelming the system. A delay between captures will allow sufficient time for processing and response generation (a minimal sketch of this loop follows the list).

4. Visual Indicators & Auditory Cues: Providing clear feedback to users – a flashing light indicating image capture, a cheerful phrase confirming costume recognition, and a countdown timer before the greeting is delivered. This enhances user experience and encourages cooperation.

5. Enhanced GPIO Controls: Restricting access to the GPIO pins using Linux capabilities or mount namespaces. Limiting physical access to Skelly itself is just as important for reducing tampering.
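
To make items 2 and 3 concrete, here is a minimal sketch of the planned capture loop. The read_motion_sensor() helper is a placeholder for whatever GPIO library fits the Radxa Zero 3W (libgpiod bindings, for example), and the cooldown value is a guess to be tuned against real processing times:

import time

COOLDOWN_SECONDS = 20  # minimum gap between greetings; tune to your pipeline

def read_motion_sensor() -> bool:
    """Placeholder: return True when the PIR sensor reports motion.

    On the Radxa Zero 3W this would poll a GPIO line (e.g., via libgpiod);
    the exact call depends on pin wiring and library choice.
    """
    return False  # replace with a real GPIO read

def capture_and_greet() -> None:
    """Placeholder for the capture -> VLM -> TTS pipeline shown later."""
    print("Snap photo, prompt the VLM, speak the greeting...")

def main() -> None:
    last_trigger = 0.0
    while True:
        if read_motion_sensor():
            now = time.monotonic()
            if now - last_trigger >= COOLDOWN_SECONDS:
                last_trigger = now
                capture_and_greet()
            # else: still cooling down, so the trigger is ignored
        time.sleep(0.1)  # poll at 10 Hz to keep CPU usage low

if __name__ == "__main__":
    main()

The important property is that the cooldown is enforced in one place: no matter how many children crowd the sensor, Skelly speaks at most once per cooldown window.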

Leveraging Local VLMs: A Python Example

The power of local VLMs lies in their ability to understand images without relying on external APIs. Here’s a simplified example demonstrating how to capture an image from a USB webcam and prompt Ollama with a costume greeting request using Python:

import base64
import json

import cv2
import requests

# Configuration
OLLAMA_API_URL = "http://localhost:11434/api/generate"  # Adjust if necessary
MODEL = "gemma3:27b"  # Or your preferred vision-capable model
PROMPT_TEMPLATE = (
    "You are an AI assistant controlling a Halloween animatronic. "
    "The attached image shows a person (or people) in costume. "
    "Identify the costume in one short phrase and then respond with a "
    "friendly greeting that references the costume. Use a cheerful tone."
)

def capture_image(camera_index=0):
    """Captures a single JPEG-encoded frame from the specified webcam."""
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise IOError("Cannot open webcam")
    ret, frame = cap.read()
    cap.release()
    if not ret:
        raise IOError("Failed to capture image")
    ok, img_encoded = cv2.imencode('.jpg', frame)
    if not ok:
        raise IOError("Failed to encode image")
    return img_encoded.tobytes()

def prompt_ollama(image_data):
    """Prompts Ollama with the image data and returns the response text."""
    # Ollama's generate endpoint accepts base64-encoded images in a
    # dedicated "images" field alongside the text prompt.
    image_base64 = base64.b64encode(image_data).decode('utf-8')
    payload = {
        "model": MODEL,
        "prompt": PROMPT_TEMPLATE,
        "images": [image_base64],
        "stream": False,  # Set to True for streaming responses
    }
    response = requests.post(
        OLLAMA_API_URL,
        headers={"Content-Type": "application/json"},
        data=json.dumps(payload),
    )
    response.raise_for_status()  # Raise an exception for bad status codes
    return response.json()['response']

if __name__ == "__main__":
    try:
        image_data = capture_image()
        greeting = prompt_ollama(image_data)
        print("Generated Greeting:", greeting)
    except Exception as e:
        print("Error:", e)

Important Notes:

  • This is a simplified example and requires the cv2 (OpenCV) and requests libraries. Install them using pip install opencv-python requests.
  • Ensure Ollama is running and the specified model (gemma3:27b) is downloaded.
  • The image is sent as a base64 string in the request’s images field, which is the format Ollama’s generate endpoint expects. Adjust this if your VLM server requires a different format.
  • Error handling is minimal; implement more robust error checking in a production environment.

System Flow Diagram: Whisper to Piper via Ollama

Here’s a flow diagram illustrating the complete system architecture:

This diagram highlights the key components and data flow: Whisper processing detects the “trick or treat” wake word (or, in the revised design, a motion sensor fires), initiating the process; the captured image is processed by Ollama to generate a costume description and greeting; and Piper TTS converts the text into audio, delivered through Skelly’s speaker.
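
For the Piper leg of that flow, here is a minimal sketch of driving the piper CLI from Python. It assumes piper is on PATH with a downloaded voice model (en_US-lessac-medium.onnx is one example; any Piper voice works) and that aplay can reach the MAX98357A-driven speaker via ALSA:

import subprocess

def speak(text: str, wav_path: str = "greeting.wav",
          model: str = "en_US-lessac-medium.onnx") -> None:
    """Render text to a WAV file with the piper CLI, then play it."""
    # piper reads the text to synthesize from stdin
    subprocess.run(
        ["piper", "--model", model, "--output_file", wav_path],
        input=text.encode("utf-8"),
        check=True,
    )
    # Play the result through the default ALSA device
    subprocess.run(["aplay", wav_path], check=True)

if __name__ == "__main__":
    speak("Happy Halloween! I love your costume.")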

Conclusion: Building Secure & Engaging Edge AI Experiences

The Halloween night debacle served as a valuable learning experience. While the initial vision was ambitious, it lacked the necessary controls and security measures for a real-world deployment. By focusing on controlled interaction, robust security practices, and leveraging the power of local VLMs like those available through Ollama or LM Studio, we can create engaging and privacy-focused edge AI experiences that are both fun and secure. The key is to anticipate potential challenges, prioritize user safety, and build a system that’s resilient against both accidental mishaps and malicious attacks. The future of animatronics powered by local VLM is bright – let’s make sure it’s also safe!

Harden Access Gateways with Geofencing – A Practical Guide for Enhanced Perimeter Security

In today’s distributed world the perimeter of an organization is no longer a single firewall at the edge of a data center. Users, services and devices connect from any location, over public clouds, through orchestrated container networks and via remote VPNs. This shift has driven security teams to adopt Zero Trust, a model that validates not only who you are but also where your device is located and what network it uses before granting access to critical resources.

One of the most effective ways to add an additional layer of verification is geofencing – the practice of allowing or denying traffic based on its geographic or network attributes such as country, city, or Autonomous System Number (ASN). When combined with strong device authentication (for example Device Trust certificates) geofencing can dramatically reduce the attack surface of your Device Trust access gateways.

This post explains how to harden access gateways with geofencing using Nginx and the ngx_http_geoip2_module. We will walk through obtaining free GeoIP data from MaxMind, configuring Nginx as a reverse proxy that blocks traffic by ASN, integrating geofence policies into modern identity providers like Authentik, and visualizing an example secure access flow. The examples are designed for Linux environments but can be adapted to any container or cloud platform.

Why Device Trust Matters in Modern Cloud Environments

  • Devices now connect from home offices, coffee shops, mobile networks and public clouds.
  • Attackers often use compromised devices or rented cloud instances that appear legitimate.
  • Traditional username/password/MFA checks do not verify the legitimacy of the device itself.
  • Adding location checks and monitoring makes it much harder for an adversary to reuse stolen credentials from an unexpected region.

When you combine device certificates, modern identity federation, and geofencing, you create a zero trust style gateway that only accepts traffic that meets all three criteria:

  1. Valid client certificate issued by your private Device Trust CA.
  2. Successful authentication with your Identity Provider (IdP).
  3. Source IP (or X-Forwarded-For) belongs to an expected country, city or ASN.

If any of these checks fail, the request is dropped before it reaches downstream services.

The Role of Geofencing in Hardening Access Gateways

Geofencing works by mapping an incoming IP address to a set of attributes – usually:

  • Country code (ISO‑3166 two‑letter format).
  • City name, coordinates and accuracy radius.
  • Autonomous System Number (ASN) which identifies the ISP or network owner.

These mappings are provided by public databases such as MaxMind’s GeoLite2. Because the data is freely available, you can implement geofencing without paying for a commercial service. The key steps are:

  1. Download and regularly update the GeoIP database.
  2. Load the database into your reverse proxy (Nginx in this example).
  3. Define rules that allow or deny traffic based on the mapped attributes.
  4. (Optional) Combine those rules with device certificate validation and IdP user attributes.

Getting Started with GeoIP Data Sources

MaxMind offers three primary free databases:

  • GeoLite2‑Country – maps IP to country code.
  • GeoLite2‑ASN – maps IP to ASN and organization name.
  • GeoLite2‑City – maps IP to city name as well as latitude, longitude, and accuracy radius.

Note: Other providers offer free and paid MaxMind-compatible (.mmdb) geolocation databases, which should integrate with the same tooling without issue. Some great options are IPinfo Lite, IPLocate Free, and IP2Location LITE.

You can obtain them by creating a free MaxMind account, accepting the license, and downloading the .mmdb files. To keep the data fresh you should schedule regular updates (MaxMind releases new versions weekly). The open source tool geoipupdate automates this process:

# Install geoipupdate on Debian/Ubuntu
apt-get update
apt-get install -y geoipupdate

# Create /etc/GeoIP.conf with your account details
cat <<EOF | sudo tee /etc/GeoIP.conf
AccountID YOUR_ACCOUNT_ID
LicenseKey YOUR_LICENSE_KEY
EditionIDs GeoLite2-Country GeoLite2-City GeoLite2-ASN
EOF

# Run the update immediately
sudo geoipupdate

# Schedule a daily refresh via cron
echo '0 3 * * * root /usr/bin/geoipupdate' | sudo tee /etc/cron.d/geoipupdate

The resulting files are typically stored in /var/lib/GeoIP/ as GeoLite2-Country.mmdb, GeoLite2-City.mmdb, and GeoLite2-ASN.mmdb. Adjust the paths in your Nginx configuration accordingly.
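
Before wiring the databases into Nginx, it is worth sanity-checking them. A short sketch using the maxminddb Python package (pip install maxminddb); the address queried is just a well-known public resolver used as a test value:

import maxminddb

# Paths match the default geoipupdate output directory
ASN_DB = "/var/lib/GeoIP/GeoLite2-ASN.mmdb"
COUNTRY_DB = "/var/lib/GeoIP/GeoLite2-Country.mmdb"

with maxminddb.open_database(ASN_DB) as reader:
    rec = reader.get("8.8.8.8") or {}
    print(rec.get("autonomous_system_number"),
          rec.get("autonomous_system_organization"))

with maxminddb.open_database(COUNTRY_DB) as reader:
    rec = reader.get("8.8.8.8") or {}
    print(rec.get("country", {}).get("iso_code"))

The field names printed here (autonomous_system_number and so on) are the same ones the Nginx module looks up, so this doubles as a reference for the configuration below.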

Installing and Configuring ngx_http_geoip2_module

The ngx_http_geoip2_module is a third‑party module that provides fast lookups of GeoIP data inside Nginx. It works with both the open source and commercial versions of Nginx, but for most Linux distributions you will need to compile it as a dynamic module.

# Install build prerequisites
apt-get install -y build-essential libpcre3-dev zlib1g-dev libssl-dev libmaxminddb-dev nginx wget git vim

# Download the Nginx source matching the installed version, plus the GeoIP2 module
NGINX_VERSION=$(nginx -v 2>&1 | cut -d'/' -f2 | cut -d' ' -f1)
wget http://nginx.org/download/nginx-${NGINX_VERSION}.tar.gz
tar xzf nginx-${NGINX_VERSION}.tar.gz
git clone https://github.com/leev/ngx_http_geoip2_module.git

# Compile the module as a dynamic loadable object
cd nginx-${NGINX_VERSION}
./configure --with-compat --add-dynamic-module=../ngx_http_geoip2_module
make modules
cp objs/ngx_http_geoip2_module.so /usr/share/nginx/modules/
echo "load_module modules/ngx_http_geoip2_module.so;" > /etc/nginx/modules-available/mod-http-geoip2.conf
ln -s /etc/nginx/modules-available/mod-http-geoip2.conf /etc/nginx/modules-enabled/60-mod-http-geoip2.conf

# Enable and configure the module by adding the following to the http section
vim /etc/nginx/nginx.conf

geoip2 /var/lib/GeoIP/GeoLite2-Country.mmdb {
    auto_reload 60m;
    $geoip2_metadata_country_build metadata build_epoch;
    $geoip2_country_code country iso_code;
    $geoip2_country_name country names en;
}

geoip2 /var/lib/GeoIP/GeoLite2-City.mmdb {
    auto_reload 60m;
    $geoip2_metadata_city_build metadata build_epoch;
    $geoip2_city_name city names en;
}

fastcgi_param COUNTRY_CODE $geoip2_country_code;
fastcgi_param COUNTRY_NAME $geoip2_country_name;
fastcgi_param CITY_NAME    $geoip2_city_name;

Now you can use the GeoIP2 country or city variables or create custom directives inside your server blocks.

Blocking Traffic by Country with Nginx GeoIP2

A simple config that only allows traffic from the United States and Canada might look like this:

http {

    map $geoip2_country_code $allowed_country {
        default no;
        US      yes;
        CA      yes;
    }

    server {
        listen 443 ssl;
        server_name gateway.example.com;

        # TLS configuration omitted for brevity

        if ($allowed_country = no) {
            return 403;
        }

        location / {
            proxy_pass http://backend;
        }
    }
}

This configuration uses the $geoip2_country_code variable populated by the geoip2 block in the main Nginx config. The map block derives a boolean-style variable, $allowed_country, which the if statement then uses to reject disallowed traffic with HTTP 403.

ASN Based Geofence on an Nginx Reverse Proxy

Blocking by ASN provides finer granularity than country alone, especially when you want to restrict access to corporate ISP ranges or known cloud providers. Below is a more advanced configuration that:

  • Allows only devices originating from your corporate ASN (e.g., AS12345) or a trusted cloud provider (AS67890).
  • Requires a valid client certificate signed by your internal CA.
  • Sends the authenticated request to an internal API gateway.

http {
    # Load both country and ASN databases
    geoip2 /var/lib/GeoIP/GeoLite2-ASN.mmdb {
        auto_reload 5m;
        $geoip2_asn_number autonomous_system_number;
        $geoip2_asn_org autonomous_system_organization;
    }

    # Define the list of permitted ASNs
    map $geoip2_asn_number $asn_allowed {
        default          no;
        12345            yes;   # Corporate ISP
        67890            yes;   # Trusted Cloud Provider
    }

    server {
        listen 443 ssl;
        server_name api-gateway.example.com;

        # TLS configuration (certificate, key) omitted for brevity

        # Enforce mutual TLS – reject if no cert or invalid cert
        ssl_verify_client on;
        ssl_client_certificate /etc/nginx/certs/ca.crt; # Your CA Chain

        # If client certificate verification fails, Nginx returns 400 automatically.
        # Add an explicit check for ASN after TLS handshake:
        if ($asn_allowed = no) {
            return 403;
        }

        location / {
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Client-Cert $ssl_client_cert;
            proxy_set_header X-Client-DN  $ssl_client_s_dn;
            proxy_pass http://internal-app;
        }
    }
}

Explanation of key parts

  • $geoip2_asn_number is populated by the GeoIP2 lookup; the map block translates the ASN into a simple yes/no flag.
  • The if ($asn_allowed = no) clause blocks any request that does not originate from an allowed ASN, even if the client certificate is valid.

You can extend this pattern to include city-level checks ($geoip2_city_name) or combine multiple criteria by chaining map blocks, since Nginx’s if does not support logical operators.

Integrating GeoIP Policy with Authentik IDP

Authentik is a modern open source identity provider that supports OIDC, SAML and LDAP. It can enforce additional policies during the authentication flow, such as requiring a specific claim that matches your geofence rules.

Enable the GeoIP Policy within Authentik

Since Authentik version 2022.12 GeoIP is baked in and only requires mmdb files provided during startup for policies to be enabled.

  • Upload the same GeoLite2-City.mmdb & GeoLite2-ASN.mmdb files used for Nginx.
  • Provide the GeoIP (city mmdb) and ASN (ASN mmdb) file paths as environment variables during startup (in recent releases, AUTHENTIK_EVENTS__CONTEXT_PROCESSORS__GEOIP and AUTHENTIK_EVENTS__CONTEXT_PROCESSORS__ASN).
  • Set up a schedule to update the files, or configure the geoipupdate plugin/container with your license key.

Now every authentication request will have an ASN value attached, which can be referenced in policies.

Create a GeoIP Policy in Authentik

In the Authentik admin UI:

  1. Navigate to Customizations → Policies.
  2. Add a new GeoIP Policy named “GeoIP Default”.
  3. Configure the default Distance and Static settings based on your needs.

Distance Settings:

  • Maximum distance – The maximum distance allowed in Kilometers between logins
  • Distance tolerance – The allowable difference to account for data accuracy
  • Historical Login Count – The number of past logins to account for when evaluating
  • Check impossible travel – Whether to flag logins/sessions whose implied travel between locations would be impossible

Static Settings

  • Allowed ASNs – Comma-separated list of the ASNs allowed for the given policy
  • Allowed Countries – List of countries the policy allows connections from

(Optional) Create a Custom Policy in Authentik

In the Authentik admin UI:

  1. Navigate to Customizations → Policies.
  2. Add a new Expression Policy named “GeoIP ASN Allowlist”.
  3. Use the following expression (Authentik expression policies are written in Python; replace the ASNs with your allowed values):

allowed_asns = [12345, 67890]
return context["asn"]["asn"] in allowed_asns and context["geoip"]["continent"] == "NA"

The context["asn"] attribute is automatically populated by Authentik from the GeoIP ASN database, and context["geoip"] is provided by the GeoIP City database. Both are used in conjunction here to require connections from an approved ASN network and from North America.

Attach the Policy to Your OIDC Application

  1. Open Applications → Your API Gateway.
  2. Under Policy Binding, add the “GeoIP Default” policy.
  3. (Optional) Under Policy Binding, add the “GeoIP ASN Allowlist” policy.
  4. Save changes.

When a user authenticates via Authentik, the flow will evaluate the policy. If the source IP belongs to an unauthorized ASN, authentication fails and no token is issued. This adds a second line of defense: even if an attacker obtains valid client certificates, they cannot get a JWT unless they connected from an allowed network.
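
One wrinkle: ip_asn, used in the next section, is not a claim Authentik emits by default; it has to be exposed through a custom scope mapping attached to the OIDC provider. Scope mapping expressions are also Python. The sketch below assumes the ASN context processor makes request.context["asn"] available at token-issuance time, which you should verify against your Authentik version:

# Authentik scope mapping expression (Python): the returned dict is merged
# into the token's claims. Assumes the ASN context processor is configured.
asn_info = request.context.get("asn") or {}
return {
    "ip_asn": asn_info.get("asn"),
}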

Enforce Token Claims at Nginx

You can configure Nginx to validate the JWT issued by Authentik and also verify that it contains the expected ip_asn claim. The auth_jwt directives below come from the ngx_http_auth_jwt_module shipped with the commercial NGINX Plus; comparable third‑party JWT modules exist for open source Nginx:

http {
    # Load GeoIP2 as before

    # Extract the ip_asn claim into a variable (http context)
    auth_jwt_claim_set $jwt_asn ip_asn;

    # Map the claim to an allow/deny flag (map is only valid at http level)
    map $jwt_asn $jwt_asn_allowed {
        default no;
        12345   yes;
        67890   yes;
    }

    server {
        listen 443 ssl;
        server_name api-gateway.example.com;

        # TLS settings omitted

        auth_jwt "Protected API";
        auth_jwt_key_file /etc/nginx/jwt-public.key;   # Authentik public key

        # Reject if the JWT claim does not match the allowed ASN list
        if ($jwt_asn_allowed = no) {
            return 403;
        }

        location / {
            proxy_pass http://internal-api;
        }
    }
}

The flow now looks like this:

  1. TLS handshake verifies client certificate.
  2. Nginx extracts the source IP and performs GeoIP ASN lookup.
  3. The request is redirected to Authentik for OIDC authentication.
  4. Authentik checks the “GeoIP ASN Allowlist” policy; if it passes, a JWT containing ip_asn is returned.
  5. Nginx validates the JWT and ensures the claim matches the allowed list before proxying to the backend.

This combination of device trust certificates, geofence enforcement and IdP policies creates a robust zero‑trust perimeter around your sensitive services.
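
If you prefer to enforce the claim in application code (or your Nginx build lacks a JWT module), the same check is a few lines with PyJWT (pip install pyjwt[crypto]). This is a minimal sketch: it reuses the public key path from the config above, my-client-id is a placeholder for your OIDC client ID, and it assumes Authentik signs tokens with RS256:

import jwt  # PyJWT

ALLOWED_ASNS = {12345, 67890}
PUBLIC_KEY = open("/etc/nginx/jwt-public.key").read()  # Authentik signing key

def request_is_allowed(bearer_token: str) -> bool:
    """Validate the JWT signature and require an allow-listed ip_asn claim."""
    try:
        claims = jwt.decode(
            bearer_token,
            PUBLIC_KEY,
            algorithms=["RS256"],
            audience="my-client-id",  # placeholder: your OIDC client ID
        )
    except jwt.InvalidTokenError:
        return False
    return claims.get("ip_asn") in ALLOWED_ASNS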

Best Practices for Maintaining Geofencing Rules

  • Regularly update GeoIP databases – use geoipupdate with cron or systemd timers.
  • Keep an audit log of denied requests – configure Nginx error logs to capture $remote_addr, $geoip2_country_code, $geoip2_asn_number and the reason for denial.
  • Use an allowlist rather than a blocklist – allow only known good ASNs/countries; with a blocklist, attackers can easily route around it through VPNs or hosts in regions you have not listed.
  • Combine with rate limiting – even legitimate IP ranges may be abused; use limit_req_zone and limit_conn_zone.
  • Test changes in a staging environment – a mis‑configured ASN map could lock all users out of an application, including administrators.
  • Monitor for anomalies – sudden spikes of traffic from an unexpected ASN can indicate compromised credentials.

Access Flow Diagram

Below is a flowchart that visualizes the hardened access methodology. Only devices presenting a valid client Device Trust certificate and originating from an allowed location/ASN are permitted to obtain an authentication token and reach the protected service.


The diagram highlights three independent checks:

  • TLS client certificate – ensures the device holds a trusted private key.
  • GeoIP ASN validation at Nginx – blocks traffic from unknown networks before any authentication attempt.
  • Authentik policy enforcement and JWT claim verification – guarantees that the token itself reflects an allowed source network and travel distance is within tolerance.

Only when all three conditions succeed does the request reach the backend service.

Monitoring and Auditing Geofence Enforcement

A hardened gateway is only as good as its visibility. Implementing robust logging and alerting helps you detect misconfigurations or active attacks.

Nginx Log Format Extension

Add a custom log format that captures GeoIP variables:

http {
    log_format geo_combined '$remote_addr - $remote_user [$time_local] '
                            '"$request" $status $body_bytes_sent '
                            '"$http_referer" "$http_user_agent" '
                            'asn=$geoip2_asn_number country=$geoip2_country_code';

    access_log /var/log/nginx/access_geo.log geo_combined;
}

Centralized Log Collection

  • Ship logs to Elasticsearch, Splunk or Loki.
  • Create dashboards that filter on status=403 and group by $geoip2_asn_number (a quick local version is sketched below).
  • Set alerts for spikes in denied traffic from a single ASN.
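
Before a full SIEM pipeline is in place, a small script can tally denials per ASN straight from the geo_combined format above. A sketch (the log path matches the earlier configuration):

import re
from collections import Counter

# Matches the geo_combined log format defined above
LINE_RE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) .* asn=(?P<asn>\S+) country=(?P<country>\S+)'
)

denied = Counter()
with open("/var/log/nginx/access_geo.log") as fh:
    for line in fh:
        m = LINE_RE.search(line)
        if m and m.group("status") == "403":
            denied[m.group("asn")] += 1

for asn, count in denied.most_common(10):
    print(f"AS{asn}: {count} denied requests")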

Scaling Geofence Enforcement Across Multiple Gateways

In large environments you may have dozens of ingress controllers. To keep policies consistent:

  • Store the allowed ASN list in a central source (e.g., Consul KV, etcd, or a ConfigMap).
  • Use a templating engine like envsubst or Helm to generate Nginx configs on each node.
  • Automate database updates with a CI/CD pipeline that pulls the latest MaxMind files and pushes them to all pods.

By treating the geofence policy as code you can version it, review changes via pull requests, and roll back quickly if an error blocks legitimate traffic.

Conclusion

Geofencing is a powerful yet straightforward technique for hardening access gateways. By leveraging free GeoIP data from MaxMind, the ngx_http_geoip2_module in Nginx, and modern identity providers such as Authentik, you can enforce policies that require:

  • A trusted device certificate.
  • An allowed source network identified by ASN.
  • Successful authentication with a policy‑aware IdP.

The layered approach dramatically reduces the attack surface for privileged services, makes credential theft less useful, and gives security teams clear visibility into who is trying to connect from where. Combined with automated updates, logging, and containerized deployment, geofence enforcement can scale across hybrid cloud environments without adding significant operational overhead.

Start by downloading the GeoLite2 databases, compile the GeoIP2 module for your Nginx instances, define an allowlist of ASNs that correspond to your corporate trust network and approved cloud providers, and integrate the policy into your IdP. From there, monitor the logs, tune the allowlist as your network evolves, and you’ll have a robust zero‑trust perimeter protecting your most sensitive workloads.

Remember: Device trust plus geofencing equals stronger security – and with the tools described in this post you can implement it today on Linux, cloud, and container platforms.

Harden Device Trust with Token Permissions: Preventing Subversion with GitHub Personal Access Tokens

Device Trust is rapidly becoming a cornerstone of modern security strategies, particularly within software development lifecycles. By ensuring that code changes are initiated from trusted devices, organizations can significantly reduce the risk of supply chain attacks and unauthorized modifications. However, a critical vulnerability often overlooked lies in the potential for users to bypass these controls using Personal Access Tokens (PATs). This blog post will delve into how attackers can leverage PATs to subvert Device Trust mechanisms, and more importantly, how you can harden Device Trust with token permissions through robust management practices.

Why PATs Are a Threat to Device Trust

Aspect               | Traditional Device Trust (Web UI)                         | PAT‑Based Access
Authentication point | Browser session tied to SSO and device compliance checks  | Direct API call with static secret
Visibility           | UI logs, conditional access policies                      | API audit logs only; may be ignored
Revocation latency   | Immediate when device is non‑compliant                    | Requires token rotation or explicit revocation
Scope granularity    | Often coarse (read/write) per repository                  | Fine‑grained scopes (e.g., pull_request:write, repo:status)

A PAT can be generated with any combination of scopes that the user’s role permits. When a developer creates a token for automation, they may inadvertently grant more privileges than needed, especially if the organization does not enforce fine‑grained tokens and approvals. The result is a secret that can be used from any machine, managed or unmanaged, effectively sidestepping Device Trust enforcement.
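
One quick way to see what a classic PAT can actually do is to ask the API itself: GitHub reports a classic token’s granted scopes in the X-OAuth-Scopes response header. A minimal sketch in Python (the token is read from the environment rather than hardcoded):

import os
import requests

token = os.environ["GITHUB_TOKEN"]  # the PAT under review

resp = requests.get(
    "https://api.github.com/user",
    headers={"Authorization": f"token {token}"},
)
resp.raise_for_status()

# Classic PATs report their scopes here; fine-grained PATs leave it empty.
print("Scopes:", resp.headers.get("X-OAuth-Scopes", "<none reported>"))

A token that reports broad scopes such as repo deserves immediate scrutiny.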

Real‑World Consequence

Imagine an attacker who gains access to a developer’s laptop after it is stolen. They locate the file ~/.git-credentials (or a credential helper store) and extract a PAT that includes pull_request:write. Using this token they can:

  1. Pull the latest code from any repository.
  2. Approve a malicious pull request without ever opening the controlled web UI.
  3. Merge the PR, causing malicious code to flow into production pipelines.

Because the action occurs via the API, the organization’s monitoring solution sees no violation: no unmanaged device ever attempted to open the GitHub website. The only evidence is an audit‑log entry that a token performed the operation, which may be missed if logging and alerting are not tuned for PAT usage.

Attack Flow: Bypassing Device Trust with PATs

Let’s illustrate how an attacker might exploit this vulnerability using a GitHub example. This flow can be adapted to other platforms like GitLab, Azure DevOps, etc., but the core principles remain consistent.

Explanation:

  1. Attacker Obtains Compromised PAT: This could happen through phishing, malware, credential stuffing, or insecure storage practices by the user.
  2. GitHub API Access: The attacker uses the stolen PAT to authenticate with the GitHub API.
  3. Forge Pull Request: The attacker creates a pull request containing malicious code changes.
  4. Approve Pull Request (Bypass Device Trust): Using the API, the attacker approves the pull request without going through the standard Device Trust verification process. This is the critical bypass step.
  5. Merge Changes to Main Branch: The approved pull request is merged into the main branch, potentially introducing malicious code into production.

The “Device Trust Workflow” subgraph shows the intended secure path. Notice how the attacker completely circumvents this path by leveraging the PAT directly against the API.

Leveraging gh cli and the GitHub API with PATs

Attackers or savvy users don’t need sophisticated tools to exploit PATs. The readily available gh CLI (GitHub’s official command-line interface) or simple scripting with curl can be used effectively.

Approving a Pull Request with gh cli:

Assuming the stolen PAT is stored in an environment variable such as GH_TOKEN:

# Export the stolen token into an environment variable (or store it in ~/.config/gh/config.yml)
export GH_TOKEN=ghp_XXXXXXXXXXXXXXXXXXXXXXXXXXXX

# Authenticate gh with the token (no interactive login required)
gh auth status  # verifies that the token is valid

# List open pull requests for a target repository
gh pr list --repo AcmeCorp/webapp --state open

# Approve and merge a specific PR (ID = 42)
gh pr review 42 --repo AcmeCorp/webapp --approve --body "Looks good to me!"
gh pr merge 42 --repo AcmeCorp/webapp --merge 

All of these actions are performed via the GitHub API behind the scenes. These simple commands bypass any Device Trust checks that would normally be required when approving a pull request through the web interface.

Approving a Pull Request with curl:

# Variables
TOKEN="ghp_XXXXXXXXXXXXXXXXXXXXXXXXXXXX"
OWNER="AcmeCorp"
REPO="webapp"
PR_NUMBER=42

# Submit an approval review
curl -X POST \
  -H "Authorization: token $TOKEN" \
  -H "Accept: application/vnd.github+json" \
  https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/reviews \
  -d '{"event":"APPROVE"}'

# Merge the pull request
curl -X PUT \
  -H "Authorization: token $TOKEN" \
  -H "Accept: application/vnd.github+json" \
  https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/merge \
  -d '{"merge_method":"squash"}'

If the token includes pull_request:write permission scope, both calls succeed, and the attacker has merged malicious code without ever interacting with the controlled web flow.

Hardening Device Trust: Token Management Strategies

The key to mitigating this risk lies in proactive token management and granular permission control. Here’s a breakdown of strategies you can implement:

Disable PATs Where Possible:

This is the most secure approach, but often impractical for organizations heavily reliant on automation or legacy integrations. However, actively identify and eliminate unnecessary PAT usage. Encourage users to migrate to more secure authentication methods like GitHub Apps where feasible.

GitHub now offers Fine-Grained Personal Access Tokens (FG-PATs) which allow you to scope permissions down to specific repositories and even individual resources within those repositories. This is a significant improvement over classic PATs, but still requires careful management.

Implement Organization-Level Policies:

GitHub provides features for managing PAT usage at the organization level:

  • Require FG-PATs: Enforce the use of Fine-Grained Personal Access Tokens instead of classic PATs.
  • Restrict Token Creation: Limit who can create PATs within the organization. Consider restricting creation to specific teams or administrators.
  • Require Administrator Approval: Require an administrator to approve the token and its scopes before it becomes usable.
  • Token Expiration Policies: Set a maximum expiration time for all PATs. Shorter lifespans reduce the window of opportunity for attackers if a token is compromised.
  • IP Allowlisting (GitHub Enterprise): Restrict PAT usage to specific IP address ranges, limiting access from known and trusted networks.

GitHub’s fine‑grained personal access tokens (FG-PATs) let administrators define which repositories a token can access and what actions it may perform. To require FG-PATs, enable the “Restrict access via personal access tokens (classic)” option under Organization Settings → Personal Access Tokens → Tokens (classic).

Focus on Repository-Level Scopes and Require Approval:

In addition to restricting the use of classic Personal Access Tokens, try to utilize GitHub Apps and/or OAuth apps for access, as they offer a far more robust set of configuration options and controls for autonomous workloads. If you still need to leverage fine-grained Personal Access Tokens, limit them to a targeted set of repositories, require administrator approval, and set a maximum expiration date to limit exposure.

This provides more granular control over permissions and allows for active review/approval:

  • Restrict pull_request:write Permission: The pull_request:write permission is particularly dangerous as it allows users to approve pull requests without Device Trust verification. Consider removing this permission from PATs unless absolutely necessary.
  • Least Privilege Principle: Grant only the minimum permissions required for each PAT. Avoid broad “repo” scope access whenever possible. FG-PATs make this much easier.
  • Code Owners Review: Enforce code owner reviews on all pull requests, even those approved via API. This adds an extra layer of security and helps detect malicious changes.

Token Auditing and Monitoring:

  • Regularly Review PAT Usage: Identify unused or overly permissive tokens (a scriptable starting point is sketched after this list).
  • Monitor API Activity: Look for suspicious activity, such as unexpected pull request approvals or changes made outside of normal working hours. GitHub provides audit logs that can be integrated with SIEM systems.
  • Automated Scanning: Use tools to scan code repositories and identify hardcoded PATs.
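
GitHub’s API makes that periodic review scriptable for fine-grained tokens: the organization endpoint below lists FG-PAT grants with access to org resources. A sketch in Python; AcmeCorp is a placeholder, the calling token needs organization admin permissions, and pagination is omitted for brevity:

import os
import requests

ORG = "AcmeCorp"  # placeholder organization name
token = os.environ["GITHUB_TOKEN"]  # requires org admin access

resp = requests.get(
    f"https://api.github.com/orgs/{ORG}/personal-access-tokens",
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    },
)
resp.raise_for_status()

# Flag grants that have never been used or look stale
for grant in resp.json():
    owner = grant.get("owner", {}).get("login", "<unknown>")
    last_used = grant.get("token_last_used_at") or "never"
    print(f"{owner}: last used {last_used}")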

User Education:

Educate developers about the risks associated with PATs and best practices for secure token management, including:

  • Never commit PATs to source control.
  • Use strong passwords and multi-factor authentication.
  • Rotate tokens regularly.
  • Report any suspected compromise immediately.

Conclusion

Device Trust is a vital security component, but it’s not a silver bullet. Attackers will always seek the path of least resistance, and PATs represent a significant vulnerability if left unmanaged. By implementing robust token management strategies – including disabling unnecessary PATs, enforcing granular permissions, and actively monitoring API activity – you can harden Device Trust with token permissions and significantly reduce your risk of supply chain attacks. Remember that security is a layered approach; combining Device Trust with strong token controls provides the most comprehensive protection for your software development lifecycle.