Rob T. Lee Manager • 15 days ago
New Case Data To Try Against
Anna Tchijova pulled together ten public DFIR datasets you can point your agent at beyond the starter corpus, then we ran every link through a verification pass. Credit to her, she did this on a slow connection over two days and built VIGIA ground-truth files for some of it. Below is the verified list, ranked for how usable each one actually is in this hackathon.

Read this part first, because it decides which cases you should and shouldn't trust:

Validation cases have ground truth you can score against, and that ground truth is either gated or structured (a NIST answer key, a password-protected teacher solution, instructor answers on request). Benchmark on these and trust the number.

Practice cases have solutions published all over the internet, which means they're almost certainly in the model's training data. Great for building. Weak for scoring, because you can't tell whether the agent reasoned to the answer or just recalled it. If your accuracy report leans on a practice case, say so.

That split drives the ranking. Most usable first.

Tier 1, score against these:
1. Nitroba University Harassment (Digital Corpora). Network forensics: a professor gets harassing email, a tap catches the traffic, you attribute the sender through plaintext Gmail cookies and an anonymous email service. The single strongest scoring case in the set. All three hashes published and confirmed matching against the live 56MB pcap, plus a password-protected teacher solution. (When a case validates this cleanly, use it.)
https://digitalcorpora.org/corpora/scenarios/nitroba-university-harassment-scenario/
2. NIST Data Leakage Case (Iaman Informant). A dev manager exfiltrates files over email, personal cloud, USB, and CD-R while running anti-forensic cleanup. Windows 7 disk plus small-media images. Ships with a step-by-step answer key and timeline, grades cleanly, and the insider-threat shape maps almost directly onto our Dungan starter case.
https://cfreds.nist.gov/all/NIST/DataLeakageCase
3. NIST Hacking Case (Greg Schardt / Mr. Evil). A seized Windows XP laptop from a war-driver who sniffed WiFi at public hotspots to steal credentials. Published image MD5 matches the VIGIA artifact and there's a NIST answer key, so it self-validates. One caveat: writeups of this case are everywhere online, so read any score through that lens.
https://cfreds.nist.gov/all/NIST/HackingCase

Tier 2, strong build-and-test cases (answers gated, so usable for scoring with care):
4. Ali Hadi #9: Encrypt Them All. A subject layers AES, BitLocker, and GPG to conceal communication, and VIGIA deliberately marks this SUSPICION, not MALICE. The only case in the set built to test whether an agent over-calls intent on lawful encryption, which is exactly the false-positive behavior judges weight. Punches above its rank.
https://archive.org/details/anti-forensics-case-2
5. Ali Hadi #1: Web Server Compromise. Windows Server 2008 plus XAMPP breached via SQL injection, then webshells, account creation, and RDP. Ships as both a disk image and a memory image (memdump.7z 110MB, full image 1.4GB) with hashes in the item. The disk-and-memory pairing from one host is the cleanest fit for the multi-source correlation track. Instructor answers on request.
https://archive.org/details/dfir-case1
6. DFRWS 2008 Linux Memory Challenge. A CentOS scenario: an employee copies files off an admin share, escalates with a downloaded exploit, exfiltrates through an external HTTP proxy. Memory, disk, and network in one case. Three evidence types plus Linux coverage make it a strong breadth test, and Anna already built a VIGIA ground-truth file. All evidence is in-repo (94MB zip), which is why it beats the cases below it.
https://github.com/dfrws/dfrws2008-challenge

Tier 3, practice cases (build on them, don't score on them):
7. M57-Jean (Digital Corpora). A CFO is socially engineered into exfiltrating a salary spreadsheet, the tell being a Reply-To that shifts between two spoofed emails. Both E01/E02 confirmed live (1.5GB + 1.4GB). Solutions are widely published, so this is a strong practice case and a weak scoring case.
https://digitalcorpora.org/corpora/scenarios/m57-jean/
8. Ali Hadi #7: SysInternals Malware (case built with Harlan Carvey). A user runs a fake SysInternals installer that's actually a downloader; it edits the hosts file, pulls a payload, and installs a fake VMware service for persistence. Single 7.2GB E01 with a hash sidecar. Solid malware-and-persistence case, answers instructor-gated.
https://archive.org/download/sysinternals-case

Tier 4, not ready, do not rely on these for scoring yet:
9. DFRWS 2011 Android. Android flash-memory analysis across two scenarios including a mobile-malware espionage thread. Two problems. The evidence lives on a personal Dropbox, not GitHub or dfrws.org, so the link can vanish without notice. And the README's hashes are labeled MD5 but are actually 40-character SHA1 strings, mislabeled upstream. Don't chase a phantom mismatch. We need to mirror both files (~164MB and ~355MB) and recompute hashes before this is safe to use.
https://github.com/dfrws/dfrws2011-challenge
10. Volatility Cridex (memory sample). cridex.vmem, the staple Windows XP memory sample infected with the Cridex banking trojan. Highest demand, lowest readiness. The canonical download is confirmed dead (403), the Volatility repo went read-only in May 2025 so the broken pointer will never be fixed, and there's no published known-good hash. Cannot be a scoring case until someone hosts a verified copy. Anna offered to host one with published hashes after her fiber install.
[link once hosted]
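
The tiers above separate cases you can score on from cases you should only build on. Here is a minimal Python sketch of keeping that split visible in an accuracy report, assuming you have already counted correct findings per case. The CaseResult record, the tier labels, and the numbers in the usage lines are hypothetical placeholders, not part of VIGIA or any starter tooling.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """One case's score. Hypothetical record, not a VIGIA format."""
    name: str
    tier: str      # "validation" or "practice"
    correct: int   # findings the agent got right
    total: int     # findings in the ground truth


def accuracy_by_tier(results):
    """Pool correct/total counts per tier so practice cases never
    inflate the validation number."""
    totals = {}
    for r in results:
        correct, total = totals.get(r.tier, (0, 0))
        totals[r.tier] = (correct + r.correct, total + r.total)
    # Skip tiers with no scored findings to avoid dividing by zero.
    return {tier: c / t for tier, (c, t) in totals.items() if t}


# Placeholder numbers for illustration only, not real results.
results = [
    CaseResult("nitroba", "validation", correct=3, total=4),
    CaseResult("m57-jean", "practice", correct=5, total=5),
]
print(accuracy_by_tier(results))
```

Reporting the two tiers as separate numbers is one way to "say so" when a report leans on a practice case.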
If you're building toward the accuracy-benchmark or self-correcting-triage ideas, the VIGIA ground-truth files Anna built (Cridex, DFRWS 2008) plus the Dungan case in the starter corpus give you the template to score against. That's your head start on starter project #5.

Two of these need community muscle, not just a download: a SANS-hosted Cridex mirror with published hashes, and a mirror of the two DFRWS 2011 Android files with recomputed MD5 and SHA256. If you've already got a clean local copy of either with a hash, post it here so we're not all re-pulling the same images.

"Built by the community, for the community" is only true if we share the boring parts too.
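
One of the boring parts, sketched: checking a downloaded image against a published digest. This is a minimal Python sketch under stated assumptions, not tooling from the hackathon, and the function names are hypothetical. It infers the algorithm from the digest's hex length (32 for MD5, 40 for SHA-1, 64 for SHA-256) instead of trusting the label, which sidesteps the mislabeled hashes in the DFRWS 2011 README.

```python
import hashlib

# Hex digest lengths for the algorithms mentioned in this post.
HEX_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}


def guess_algorithm(hex_digest: str) -> str:
    """Guess the hash algorithm from digest length, ignoring any label."""
    digest = hex_digest.strip().lower()
    if not digest or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError(f"not a hex digest: {hex_digest!r}")
    if len(digest) not in HEX_LENGTHS:
        raise ValueError(f"unrecognized digest length {len(digest)}")
    return HEX_LENGTHS[len(digest)]


def file_digest(path, algorithm: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so multi-GB images never sit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, published_digest: str) -> bool:
    """True if the file matches the published digest, whatever its label."""
    algorithm = guess_algorithm(published_digest)
    return file_digest(path, algorithm) == published_digest.strip().lower()
```

Call it as verify(path_to_image, published_digest) with your own local path and the digest from the case page. When you post a mirror, publishing the output of file_digest for both "md5" and "sha256" covers the two hashes asked for above.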