The deep scan engines: Opengrep, picklescan, Gitleaks, and osv-scanner
Beyond the heuristic scanner layers, ai-supply.store runs four purpose-built security engines on uploaded artifacts. These are the same tools security engineers use in professional AppSec pipelines — and they run on every submission, for free.
Opengrep — AST analysis and taint tracking
Opengrep is an open-source fork of Semgrep that performs abstract syntax tree (AST) analysis rather than simple regex matching. This means it understands the structure of code, not just its text.
What it catches:
- Taint flows: user input → SQL query → database call (SQL injection)
- Taint flows: user input → shell command (command injection)
- Taint flows: external data → eval() (code injection)
- Insecure cryptographic primitive usage
- Path traversal patterns (../../)
Example finding:
[HIGH] taint-sink: user_input flows to subprocess.call() at server.py:42
Rule: python.lang.security.dangerous-subprocess-use-audit
How to pass cleanly: Validate and allowlist all inputs before they reach dangerous sinks. Prefer library APIs over shell invocations. Opengrep's rulesets are public — you can run it locally before uploading:
npx opengrep scan --config auto .
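The advice above (allowlist first, no shell) can be sketched in Python. This is a minimal illustration, not code from the platform or from any Opengrep ruleset: `ALLOWED_FORMATS`, `build_command`, and `run_export` are hypothetical names, and the child process simply echoes its argument so the example stays self-contained.

```python
import subprocess
import sys

# Hypothetical allowlist: the only values this tool agrees to pass on.
ALLOWED_FORMATS = {"json", "csv", "txt"}


def build_command(user_format: str) -> list[str]:
    """Return an argv list for the child process, rejecting unexpected input."""
    if user_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {user_format!r}")
    # An argv list with no shell: the value arrives as a single argument
    # and is never parsed by /bin/sh, so metacharacters have no effect.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])", user_format]


def run_export(user_format: str) -> str:
    """Run the child process and return its trimmed stdout."""
    result = subprocess.run(
        build_command(user_format),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `run_export("json; rm -rf /")` raises `ValueError` before any process is started, which is the property a taint-tracking rule looks for: the untrusted value is validated before it reaches the sink.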
picklescan — model malware detection
picklescan is a purpose-built scanner for detecting malicious pickle files. Pickle is Python's default serialisation format and it is fundamentally unsafe: loading a pickle file is equivalent to executing the code it contains.
What it catches:
- REDUCE opcodes calling dangerous globals (os.system, subprocess.Popen, eval)
- Stack manipulation that constructs callable objects at load time
- Multi-stage payloads that obfuscate execution via opcode sequencing
Example finding:
Pickle REDUCE opcode with dangerous callable: os.system
File: model_weights.pkl — MALICIOUS
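To make the idea of opcode inspection concrete, here is a simplified sketch built on Python's standard `pickletools` module. It is not picklescan's implementation: the `DANGEROUS` set is a small hypothetical sample, and the `STACK_GLOBAL` handling is a heuristic that only looks at the two most recent string arguments. The point it illustrates is that a pickle can be inspected without ever being loaded.

```python
import pickle
import pickletools

# Hypothetical sample of (module, name) pairs to treat as dangerous.
# os.system is recorded under "posix" or "nt" depending on the platform.
DANGEROUS = {
    ("os", "system"), ("posix", "system"), ("nt", "system"),
    ("subprocess", "Popen"),
    ("builtins", "eval"), ("builtins", "exec"),
}


def find_dangerous_globals(data: bytes) -> list[tuple[str, str]]:
    """Walk the opcode stream (without unpickling) and report risky imports."""
    hits = []
    strings = []  # recent string arguments, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # GLOBAL carries "module name" as a single space-separated argument.
            module, name = arg.split(" ", 1)
            ref = (module, name)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # STACK_GLOBAL takes module and name from the stack; approximate
            # that with the last two strings pushed.
            ref = (strings[-2], strings[-1])
        else:
            if isinstance(arg, str):
                strings.append(arg)
            continue
        if ref in DANGEROUS:
            hits.append(ref)
    return hits
```

For example, `find_dangerous_globals(pickle.dumps(eval))` reports the `builtins.eval` reference, while a pickle of a plain dict of floats yields no hits.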
How to pass cleanly: Use safetensors instead of pickle. If you must use pickle (e.g., for scikit-learn pipelines), restrict callables to known-safe classes with a custom Unpickler and document this clearly. The scanner knows the difference between a torch.save() of tensor weights and a payload disguised as one.
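A restricted Unpickler of the kind mentioned above might look like the following sketch, which follows the `find_class` override pattern from the Python `pickle` documentation. The contents of `SAFE_GLOBALS` are an assumption for illustration; a real scikit-learn pipeline would need its own, carefully reviewed allowlist.

```python
import io
import pickle

# Hypothetical allowlist of (module, name) pairs the loader may import.
SAFE_GLOBALS = {
    ("collections", "OrderedDict"),
    ("builtins", "set"),
    ("builtins", "frozenset"),
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in replacement for pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers such as dicts, lists, and numbers load normally because they need no globals, while a payload that references `builtins.eval` fails with `UnpicklingError` before anything is called.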
A safe listing to check out: LLM Guard, which uses safetensors throughout.
Gitleaks — deep secrets scanning
Gitleaks scans every file in the artifact tree for high-entropy strings matching credential patterns. It uses a rule library covering 150+ secret types:
- OpenAI, Anthropic, Cohere, Hugging Face API keys
- AWS/GCP/Azure access keys
- GitHub/GitLab personal access tokens
- Stripe, Twilio, Sendgrid keys
- Generic high-entropy strings (base64, hex)
- Private key PEM blocks
It scans everything: source files, test files, config files, comments, .env.example, Jupyter notebooks, inline JSON.
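"High-entropy" here refers to how evenly a string's characters are distributed. The sketch below computes Shannon entropy in bits per character; it is only an illustration of the concept, not Gitleaks' own code, which is written in Go and combines per-rule regular expressions with entropy thresholds.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character, from the string's own character frequencies."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A repetitive placeholder such as `your_key_here` scores noticeably lower than a random-looking token of similar length, which is one reason placeholders tend not to trip entropy-based rules.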
Example finding:
RuleID: openai-api-key
File: src/config.ts:14
Secret: sk-proj-XXXXXXXXXXXXXXXXXXXXXXXXXX
Commit: [embedded in artifact]
How to pass cleanly:
# Run locally before uploading:
gitleaks detect --source . --no-git
Rotate any real credentials that were accidentally committed. Replace with env var references. Gitleaks won't flag OPENAI_API_KEY=your_key_here (placeholder) but will flag any string that looks like a real credential.
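Replacing a hardcoded key with an environment variable reference can be as small as the sketch below. `load_api_key` is a hypothetical helper name, and failing loudly when the variable is missing is a design choice made here, not something the platform requires.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Read a credential from the environment instead of the source tree."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early so a missing secret is never silently replaced by a default.
        raise RuntimeError(f"{var_name} is not set; export it before starting the app")
    return key
```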
osv-scanner — dependency CVE lookup
osv-scanner by Google queries the Open Source Vulnerability (OSV) database, which aggregates vulnerability advisories (CVEs and ecosystem-specific IDs) across npm, PyPI, Go, Maven, Cargo, and more.
What it scans:
- package.json + package-lock.json / yarn.lock
- requirements.txt / pyproject.toml / poetry.lock
- go.mod / go.sum
- Cargo.toml / Cargo.lock
Severity mapping to listing level:
| Severity | Effect |
|---|---|
| CRITICAL (CVSS ≥ 9.0) | Pushes toward QUARANTINE |
| HIGH (CVSS 7.0–8.9) | Pushes toward REVIEW |
| MEDIUM / LOW | Score deduction only |
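The table can be read as a simple threshold function. The sketch below mirrors only the bands listed above; `listing_effect` is a hypothetical name, and how the platform actually combines several findings into a final listing level is not described here.

```python
def listing_effect(cvss: float) -> str:
    """Map a CVSS base score to the effect listed in the severity table."""
    if not 0.0 <= cvss <= 10.0:
        raise ValueError(f"CVSS score out of range: {cvss}")
    if cvss >= 9.0:
        return "pushes toward QUARANTINE"  # CRITICAL
    if cvss >= 7.0:
        return "pushes toward REVIEW"  # HIGH
    return "score deduction only"  # MEDIUM / LOW
```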
How to pass cleanly:
# Install
go install github.com/google/osv-scanner/cmd/osv-scanner@latest
# Scan your project
osv-scanner scan --recursive .
Fix findings with npm audit fix (Node) or pip-audit --fix (Python). Pin exact versions — floating ranges (^1.2.0) allow silent upgrades that introduce new CVEs after your listing is live.
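To spot floating ranges before uploading, a small check along these lines could help. It is a rough sketch under stated assumptions: `find_floating_ranges` is a hypothetical helper, it only inspects `dependencies` and `devDependencies` in a package.json, and it treats anything that is not a plain `MAJOR.MINOR.PATCH` version (optionally with a pre-release or build suffix) as floating.

```python
import json
import re

# Exact semver only: 1.2.3, 1.2.3-beta.1, 1.2.3+build.5
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def find_floating_ranges(package_json_text: str) -> dict[str, str]:
    """Return {package: spec} for every dependency that is not pinned exactly."""
    manifest = json.loads(package_json_text)
    floating = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                floating[name] = spec
    return floating
```

A committed lock file pins the resolved versions as well, so this check complements rather than replaces package-lock.json or yarn.lock.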
Running all four locally before upload
# In your project directory:
npx opengrep scan --config auto .
gitleaks detect --source . --no-git
osv-scanner scan --recursive .
picklescan --path . # if you have .pkl files
If all four pass locally, your upload almost certainly passes the platform scanner too. For the full nine-layer breakdown, see the nine-layer scanner: a deep dive.
And remember — all of this runs for free on every artifact you upload to ai-supply.store.