Introduction

In today’s hyper‑connected world, the sheer volume of files, URLs, and executables that cross network perimeters makes it impossible for any single organization to maintain an exhaustive, up‑to‑date signature database. Threat actors constantly mutate their payloads, and new malicious artifacts appear every few minutes. VirusTotal (VT) has emerged as a de‑facto community‑driven hub for aggregating the results of dozens of antivirus engines, URL scanners, and sandboxes into a single, searchable platform.

This guide dives deep into everything you need to know about VirusTotal:

  • What the service does and how it fits into a modern security stack
  • The public web interface versus the paid API tiers
  • Practical, real‑world use cases for analysts, SOC engineers, and developers
  • Step‑by‑step code examples (Python, Bash, PowerShell) for automating submissions and parsing results
  • Limitations, privacy considerations, and best‑practice recommendations

By the end of this article you’ll be equipped to integrate VirusTotal into your daily workflow, design automated pipelines for bulk analysis, and understand the nuances that separate a superficial scan from a thorough threat‑intelligence operation.


1. What Is VirusTotal?

VirusTotal is a free online service owned by Google (via Chronicle) that accepts files, URLs, IP addresses, and domain names for analysis. Once submitted, the platform forwards the artifact to a large pool of anti‑malware engines, URL scanners, sandbox environments, and reputation services. The results are then displayed in a unified dashboard, complete with:

Result TypeTypical Providers
Antivirus signaturesKaspersky, Bitdefender, McAfee, Sophos, etc.
Static hash look‑upsNSRL, VirusShare
Dynamic sandbox executionCuckoo, FireEye, Any.Run
URL reputationGoogle Safe Browsing, PhishTank
Reputation scoresWeb of Trust (WoT), Cisco Talos
Community commentsUser‑generated tags and observations

Because VirusTotal aggregates multiple independent viewpoints, a single detection can be corroborated (or refuted) across many engines, reducing false positives and increasing confidence.

1.1 Public vs. Private (API) Access

FeaturePublic Web UIPaid API (Professional/Enterprise)
Daily submission limit4 MB per file, 32 MB total per day per IPUp to 200 MB per file, higher quota per API key
AutomationManual uploads onlyProgrammatic access via REST endpoints
Advanced reportsBasic JSON viewFull JSON, PCAP, and sandbox video
Private scanningNo (all submissions are public)Option for private, non‑shared analysis
IntegrationBrowser extensions, CLI wrappersSDKs, SIEM connectors, custom scripts

The public interface is ideal for ad‑hoc analysis, learning, or low‑volume research. For organizations that need high throughput, private sharing, or historical data, the paid API tiers provide the necessary elasticity.


2. Core Concepts and Terminology

Before we jump into hands‑on examples, let’s clarify the key terms you’ll encounter when working with VirusTotal.

TermDefinition
HashCryptographic fingerprint (MD5, SHA‑1, SHA‑256) used to uniquely identify a file.
Scan IDA unique identifier returned after submitting an artifact; used to retrieve results later.
Analysis IDThe newer nomenclature for the same concept in v3 of the API.
VerdictThe overall classification (malicious, suspicious, clean, undetected) derived from engine votes.
Community scoreA rating contributed by VirusTotal users based on their own investigations.
BehaviorsObserved actions during sandbox execution (e.g., file writes, network connections).
YARA rulesPattern‑matching rules that can be applied to VT’s detections via the “YARA” endpoint (Enterprise).

Understanding these concepts is essential for interpreting the data returned by the platform, especially when building automated pipelines that need to make decisions based on the verdict.


3. Real‑World Use Cases

3.1 Incident Response (IR)

When a suspicious file lands on an endpoint, IR teams often:

  1. Submit the file to VirusTotal.
  2. Correlate the VT verdict with internal IOC (Indicator of Compromise) feeds.
  3. Extract sandbox behavior (e.g., registry changes) to inform containment actions.

Note: Because public submissions are visible to everyone, many IR teams use the private (Enterprise) API to keep malicious samples confidential.

3.2 Threat Hunting

Threat hunters can leverage VT’s massive historical dataset to:

  • Search for known malicious hashes that have appeared in past attacks.
  • Identify domains that have been flagged across multiple engines.
  • Use VT’s “Live Hunting” feature (Enterprise) to run YARA rules against the entire corpus in near‑real time.

3.3 Malware Research & Reverse Engineering

Researchers often:

  • Pull sandbox execution videos to see how a payload behaves without setting up a local environment.
  • Retrieve PE metadata (imports, exports) that VT extracts automatically.
  • Compare static signatures across multiple antivirus vendors to understand detection gaps.

3.4 DevSecOps & CI/CD Pipelines

In modern DevSecOps, it’s common to embed a VT scan into a CI job:

# Example GitHub Actions step
- name: Scan artifact with VirusTotal
  uses: actions/checkout@v3
- run: |
    curl -s --request POST \
      --url https://www.virustotal.com/api/v3/files \
      --header "x-apikey: ${{ secrets.VT_API_KEY }}" \
      --form file=@path/to/artifact.exe \
      -o vt_response.json
  env:
    VT_API_KEY: ${{ secrets.VT_API_KEY }}

If the scan returns a malicious verdict, the pipeline can be configured to fail the build, preventing the artifact from being promoted to production.


4. Getting Started with the VirusTotal API (v3)

VirusTotal’s current API version (v3) is REST‑ful and returns JSON responses. Below we walk through the basic workflow:

  1. Obtain an API key – Sign up at virustotal.com, then navigate to “My API key”.
  2. Upload a file – POST to /files.
  3. Poll for results – GET /analyses/{analysis_id} until the status is completed.
  4. Parse the verdict – Examine malicious, suspicious, harmless counts.

4.1 Prerequisites

  • Python 3.9+
  • requests library (pip install requests)
  • A valid VirusTotal API key (free tier: 4 requests/minute)

4.2 Python Example – Submitting and Retrieving a File Scan

import time
import hashlib
import requests
import json
from pathlib import Path

API_KEY = "YOUR_VT_API_KEY"
BASE_URL = "https://www.virustotal.com/api/v3"

def get_sha256(file_path: Path) -> str:
    """Calculate SHA‑256 hash for a local file."""
    h = hashlib.sha256()
    with file_path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def upload_file(file_path: Path) -> str:
    """Upload a file to VirusTotal and return the analysis ID."""
    url = f"{BASE_URL}/files"
    headers = {"x-apikey": API_KEY}
    files = {"file": (file_path.name, file_path.open("rb"))}
    response = requests.post(url, headers=headers, files=files)
    response.raise_for_status()
    analysis_id = response.json()["data"]["id"]
    print(f"Uploaded {file_path.name}, analysis ID: {analysis_id}")
    return analysis_id

def poll_analysis(analysis_id: str, interval: int = 15) -> dict:
    """Poll the analysis endpoint until the scan is finished."""
    url = f"{BASE_URL}/analyses/{analysis_id}"
    headers = {"x-apikey": API_KEY}
    while True:
        resp = requests.get(url, headers=headers)
        resp.raise_for_status()
        result = resp.json()
        status = result["data"]["attributes"]["status"]
        if status == "completed":
            return result
        print(f"Analysis {analysis_id} still in progress... waiting {interval}s")
        time.sleep(interval)

def main():
    file_path = Path("sample.exe")
    sha256 = get_sha256(file_path)
    print(f"SHA‑256: {sha256}")

    # First, check if the hash is already known (avoids re‑upload)
    lookup_url = f"{BASE_URL}/files/{sha256}"
    lookup_resp = requests.get(lookup_url, headers={"x-apikey": API_KEY})
    if lookup_resp.status_code == 200:
        print("Hash already present in VT, fetching existing report...")
        report = lookup_resp.json()
    else:
        analysis_id = upload_file(file_path)
        report = poll_analysis(analysis_id)

    # Pretty‑print a summary
    attrs = report["data"]["attributes"]
    stats = attrs["last_analysis_stats"]
    print("\n=== Verdict Summary ===")
    print(json.dumps(stats, indent=2))

    # List engines that flagged the file as malicious
    malicious_engines = [
        engine for engine, result in attrs["last_analysis_results"].items()
        if result["category"] == "malicious"
    ]
    print(f"\nEngines reporting malicious: {len(malicious_engines)}")
    for eng in malicious_engines:
        print(f"- {eng}")

if __name__ == "__main__":
    main()

Explanation of key steps:

  • Hash pre‑check – Saves API quota by avoiding duplicate uploads.
  • Polling – VT processes files asynchronously; the script loops until status == "completed".
  • Result parsinglast_analysis_stats gives a quick tally; last_analysis_results provides per‑engine details.

4.3 Bash / cURL Example – Quick URL Scan

#!/usr/bin/env bash
API_KEY="YOUR_VT_API_KEY"
URL="http://malicious.example.com"

# Submit URL for scanning
response=$(curl -s -X POST "https://www.virustotal.com/api/v3/urls" \
  -H "x-apikey: $API_KEY" \
  --data "url=$URL")

analysis_id=$(echo "$response" | jq -r '.data.id')
echo "Submitted URL, analysis ID: $analysis_id"

# Poll until ready
while true; do
  result=$(curl -s -X GET "https://www.virustotal.com/api/v3/analyses/$analysis_id" \
    -H "x-apikey: $API_KEY")
  status=$(echo "$result" | jq -r '.data.attributes.status')
  if [[ "$status" == "completed" ]]; then
    echo "Analysis complete."
    echo "$result" | jq '.data.attributes.last_analysis_stats'
    break
  else
    echo "Waiting for analysis... (status=$status)"
    sleep 10
  fi
done

Tip: The jq utility is invaluable for extracting fields from JSON responses in shell scripts.

4.4 PowerShell Example – Scanning a File in Windows Environments

# Requires PowerShell 7+ and Invoke-RestMethod
$ApiKey = "YOUR_VT_API_KEY"
$FilePath = "C:\Temp\sample.exe"
$BaseUri = "https://www.virustotal.com/api/v3"

# Compute SHA256
$Hash = Get-FileHash -Path $FilePath -Algorithm SHA256
$Sha256 = $Hash.Hash.ToLower()

# Check if hash already exists
$Lookup = Invoke-RestMethod -Method Get `
    -Uri "$BaseUri/files/$Sha256" `
    -Headers @{ "x-apikey" = $ApiKey } `
    -ErrorAction SilentlyContinue

if ($Lookup) {
    Write-Host "File already known. Displaying existing report..."
    $Report = $Lookup
} else {
    # Upload file
    $Upload = Invoke-RestMethod -Method Post `
        -Uri "$BaseUri/files" `
        -Headers @{ "x-apikey" = $ApiKey } `
        -Form @{ file = Get-Item $FilePath }
    $AnalysisId = $Upload.data.id

    # Poll for result
    do {
        Start-Sleep -Seconds 15
        $Status = Invoke-RestMethod -Method Get `
            -Uri "$BaseUri/analyses/$AnalysisId" `
            -Headers @{ "x-apikey" = $ApiKey }
        $State = $Status.data.attributes.status
        Write-Host "Current status: $State"
    } while ($State -ne "completed")

    $Report = $Status
}

# Summarize
$Stats = $Report.data.attributes.last_analysis_stats
Write-Host "`n=== Verdict Summary ==="
$Stats | ConvertTo-Json -Depth 3

5. Interpreting the Results

A raw VT JSON payload contains a wealth of data. Below we highlight the most actionable sections.

5.1 last_analysis_stats

{
  "harmless": 0,
  "malicious": 12,
  "suspicious": 3,
  "undetected": 50,
  "timeout": 0
}
  • Malicious > 0 – At least one engine flagged the sample.
  • Suspicious – Engines are unsure; often a heuristic detection.
  • Undetected – Majority of engines saw nothing; still not a guarantee of safety.

5.2 last_analysis_results

Each engine’s detailed verdict is stored here:

{
  "Kaspersky": {
    "category": "malicious",
    "engine_name": "Kaspersky",
    "engine_version": "21.3.0.1085",
    "result": "Trojan.Win32.Agent"
  },
  "Microsoft": {
    "category": "harmless",
    "engine_name": "Microsoft",
    "engine_version": "1.1.18500.11",
    "result": null
  }
}
  • Result field – Engine‑specific family name (useful for mapping to ATT&CK).
  • Category – One of malicious, suspicious, harmless, undetected, timeout.

5.3 Sandbox Behaviors (Enterprise)

Enterprise users have access to behavioral graphs and PCAP files that illustrate:

  • Files created or modified
  • Registry keys accessed
  • Network connections (domains, IPs, ports)

These can be exported via the /files/{id}/behaviour endpoint and fed into a YARA or Sigma rule engine for deeper correlation.

5.4 Community Tags and Comments

The public UI shows user‑submitted tags such as #ransomware, #banking-trojan. While not authoritative, they provide context on how the community perceives the sample.

Important: Always cross‑reference community tags with vendor detections; crowdsourced data can be noisy.


6. Best Practices for Using VirusTotal

  1. Never rely on a single engine – Use the aggregate verdict and look for consensus.
  2. Throttle API calls – Respect the rate limits (free tier: 4 req/min). Exceeding limits may result in temporary bans.
  3. Hash‑first workflow – Compute SHA‑256 locally before uploading; this saves quota and speeds up look‑ups.
  4. Sanitize sensitive data – Public submissions are visible to everyone; use the private API for PII or zero‑day samples.
  5. Combine with internal intel – Correlate VT results with your own threat‑intel feeds for richer context.
  6. Log every request – Store request IDs, timestamps, and raw responses for auditability.
  7. Automate remediation – Tie VT verdicts to SOAR playbooks (e.g., automatically isolate a host if a malicious file is detected).

7. Limitations and Caveats

LimitationImpactMitigation
Public submissions are sharedSensitive samples could be exposedUse Enterprise private scans or self‑hosted sandbox for zero‑day analysis
Engine coverage variesSome niche malware families may be missedSupplement VT with specialized sandboxes (e.g., Cuckoo, FireEye)
Rate limitsFree tier may be insufficient for high‑volume SOCsPurchase Professional/Enterprise API key
False positives/negativesEngines disagree; some may flag benign softwarePerform manual triage on borderline cases
File size cap (public)4 MB limit may prevent scanning large installersSplit large binaries, or use the paid tier which supports up to 200 MB

Being aware of these constraints helps you design a defense‑in‑depth approach where VirusTotal is a critical component, not the sole source of truth.


8. Advanced Topics

8.1 Bulk Hunting with the “Live Hunting” Feature (Enterprise)

Live Hunting lets you run YARA or Sigma queries against the entire VirusTotal dataset in near‑real time. Example YARA rule to find any PE executable that imports WinInet.dll:

rule WinInet_Import {
    meta:
        description = "Detects binaries importing WinInet"
        author = "YourName"
    condition:
        pe.imports("WinInet.dll")
}

You submit the rule via the /intelligence/hunting endpoint; VT returns a list of matching hash IDs, which you can then download for deeper analysis.

8.2 Integrating with SIEMs (Splunk, Elastic, QRadar)

Most SIEM platforms have pre‑built VirusTotal connectors. The typical flow:

  1. Forward file hashes from endpoint logs to a lookup script.
  2. The script queries VT (cached locally for speed).
  3. Enrich the event with malicious count and engine list.
  4. Alert if malicious >= 3 and the file originated from an untrusted source.

8.3 Using VT as a Threat‑Intel Source for ATT&CK Mapping

When a VT report flags a sample as Trojan.Win32.Agent, you can map it to ATT&CK technique T1059 – Command and Scripting Interpreter (if the sandbox shows PowerShell execution). Automating this mapping creates richer detection rules for your EDR.

# Pseudocode for ATT&CK mapping
if "PowerShell" in behaviors["processes"]:
    technique = "T1059.001"
elif "cmd.exe" in behaviors["processes"]:
    technique = "T1059.003"

9. Frequently Asked Questions (FAQ)

QuestionAnswer
Can I scan a URL without uploading a file?Yes. Use the /urls endpoint (public) or the /intelligence/search endpoint (Enterprise) to submit a URL directly.
What happens to my uploaded file?VirusTotal stores the file for 90 days (public) or 365 days (Enterprise) for re‑analysis and community sharing.
Is there a limit on the number of analyses per day?Free tier: 4 requests/minute, 500 requests per day. Paid tiers increase limits dramatically (up to 10 k/min for Enterprise).
Can I delete a file I uploaded?Public uploads cannot be deleted. Enterprise customers can request removal via the support portal.
Do antivirus vendors rely on VirusTotal?Many vendors use VT as a source of community intel, but they also run their own proprietary detection pipelines.

Conclusion

VirusTotal has evolved from a simple “upload‑and‑see” web tool into a cornerstone of modern cyber‑defense. Its ability to aggregate dozens of detection engines, provide sandbox behavior, and expose a robust API makes it indispensable for:

  • Incident responders seeking rapid triage
  • Threat hunters looking for historical context
  • DevSecOps engineers automating security gates in CI/CD pipelines
  • Researchers needing quick access to a massive sample repository

However, the platform is not a silver bullet. Effective use requires understanding its limits, respecting privacy considerations, and combining VT data with internal telemetry. By following the best practices outlined in this guide—hash‑first checks, proper rate‑limiting, automated enrichment, and thoughtful remediation—you can turn VirusTotal from a passive lookup service into an active, programmable intelligence engine that strengthens your organization’s security posture.

Stay curious, stay vigilant, and let VirusTotal be a trusted ally in the ever‑changing battle against malware.


Resources