I’m testing an idea: generate JDs from real engineering tasks, then evaluate candidates against those tasks using code evidence.
Instead of writing “5+ years X, Y, Z”, the input is actual work context:
- GitHub/Jira issues
- linked PRs and code diffs
- review comments and the discussion timeline
- change size, dependencies, and failure modes
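As a concrete sketch, that work context could be captured in a single record per task. Field names here are my own illustration, not a defined schema:

```python
from dataclasses import dataclass, field

@dataclass
class TaskContext:
    """One unit of real engineering work, assembled from issue + PR data."""
    issue_id: str                  # GitHub/Jira issue key
    pr_diffs: list[str]            # linked PR code diffs
    review_comments: list[str]     # review discussion, in timeline order
    changed_lines: int             # change size
    dependencies: list[str]        # systems/libraries the change touches
    failure_modes: list[str] = field(default_factory=list)  # known ways it breaks

# Hypothetical example input
ctx = TaskContext(
    issue_id="PROJ-142",
    pr_diffs=["--- a/ws/conn.py ..."],
    review_comments=["Why not close the socket on timeout?"],
    changed_lines=180,
    dependencies=["websockets", "redis"],
)
```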
From this, the system generates a structured JD, for example:
- Problem Scope: what must be solved
- Required Skills: APIs, infra, debugging depth, testing expectations
- Seniority Signals: architecture ownership vs isolated implementation
- Success Criteria: what “done well” looks like
- Interview Focus: where to probe risk areas
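The generated JD would be a plain structured object rather than prose. A minimal sketch with illustrative values (drawn from the WebSocket example later in the post):

```python
# Hypothetical output of the JD generator; keys mirror the five sections above.
jd = {
    "problem_scope": "Stop a WebSocket connection leak under reconnect storms",
    "required_skills": ["WebSocket APIs", "memory profiling", "integration testing"],
    "seniority_signals": "architecture ownership",  # vs. "isolated implementation"
    "success_criteria": ["leak reproduced and root-caused", "regression test added"],
    "interview_focus": ["distributed state on reconnect", "test strategy under load"],
}
```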
Then candidate evaluation is also evidence-first:
- Build a candidate activity profile from commit/PR/review history
- Map past solved problems to the generated JD requirements
- Score by evidence quality, not keyword match
- Output traceable reasons like:
  - “Handled similar WebSocket memory leak with root-cause writeup”
  - “Strong delivery signal, but weak automated test coverage”
  - “No evidence for distributed locking incidents”
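The scoring step above can be sketched as a small function: each JD requirement is matched against candidate evidence, and every contribution to the score carries a human-readable reason. The evidence format and quality scale are my assumptions, not a specified design:

```python
def score_candidate(jd_skills, evidence):
    """Score by evidence quality per required skill; emit traceable reasons.

    evidence: {skill: (quality, note)} where quality in [0, 1] reflects how
    directly the candidate's shipped work demonstrates the skill.
    """
    reasons, total = [], 0.0
    for skill in jd_skills:
        quality, note = evidence.get(skill, (0.0, "No evidence found"))
        total += quality
        reasons.append(f"{skill}: {note} (quality={quality:.1f})")
    return total / len(jd_skills), reasons

# Hypothetical candidate, mirroring the sample reasons in the post
fit, reasons = score_candidate(
    ["websocket_debugging", "automated_testing", "distributed_locking"],
    {
        "websocket_debugging": (0.9, "Handled similar WebSocket memory leak with root-cause writeup"),
        "automated_testing": (0.3, "Strong delivery signal, but weak automated test coverage"),
    },
)
```

The point of the shape is that the fit score is never a bare number: each requirement either maps to named shipped work or is flagged as missing evidence.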
So the goal is not “resume parsing”, but:
1. JD grounded in actual tasks
2. Candidate fit grounded in actual shipped work
I’m still validating this concept and would value critical feedback:
- For hiring managers: would this be better than today’s JD + ATS flow?
- For engineers: what would make this feel fair vs reductive?
- Where does this break first (privacy, gaming, legal, false confidence)?
(For private/company data, my assumption is sanitized extraction before analysis.)
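For the sanitization assumption, a minimal sketch of what “sanitized extraction” could mean: redact obvious identifiers before any text leaves the company boundary. The patterns here are illustrative only; a real pipeline would need far more (names, hostnames, secrets, code-embedded credentials):

```python
import re

def sanitize(text: str) -> str:
    """Strip obvious identifiers before analysis (illustrative patterns only)."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "<EMAIL>", text)    # email addresses
    text = re.sub(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", "<IP>", text)   # IPv4 addresses
    text = re.sub(r"\b[A-Z][A-Z0-9]+-\d+\b", "<TICKET>", text)    # Jira-style keys
    return text

print(sanitize("Ping 10.0.0.12 re PROJ-142, cc alice@corp.com"))
# → Ping <IP> re <TICKET>, cc <EMAIL>
```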