Software Engineer – AI Evaluation Analyst
Alignerr • Remote • Posted 26 days ago
Education: Any
Pay Rate: $75/task
What You'll Do
- Evaluate the performance of frontier AI language models on complex, real-world software engineering tasks
- Hunt for bugs, logical errors, hallucinations, and reliability failures in AI-generated code
- Design prompts, test cases, and evaluation scenarios that push models to their limits
- Write precise, structured feedback explaining model strengths, weaknesses, and edge case behavior
- Work across multiple programming languages and codebases to assess correctness and generalization
- Think critically about model behavior — not just whether code runs, but whether it's right
About the Role
What if your years of engineering experience could directly influence how the world's most advanced AI systems write and reason about code? We're looking for experienced software engineers to put frontier AI models through their paces — finding the bugs, hallucinations, and logic failures that only a sharp, seasoned developer would catch. This is a fully remote, flexible contract role. No background in AI research required — just deep coding expertise, strong analytical thinking, and a genuine instinct for finding what breaks.
- Organization: Alignerr
- Type: Hourly Contract
- Location: Remote
- Commitment: 10–40 hours/week
Who You Are
- 3+ years of professional software engineering experience
- Strong proficiency in at least one of: TypeScript, Ruby, Java, or C++
- Excellent written and spoken English — you can explain complex technical issues with clarity
- Demonstrated ability to reason about complex systems and debug non-obvious failures
- Comfortable with modern development tooling — Git, CLI workflows, testing frameworks, and IDEs
- A critical thinker who evaluates code quality independently rather than deferring to model outputs
Nice to Have
- Experience across multiple programming languages or paradigms
- Background in code review, software testing, or QA engineering
- Familiarity with large language models or AI-generated code evaluation
- Exposure to prompt engineering or adversarial testing
Why Join Us
- Work on cutting-edge AI projects alongside leading research labs
- Fully remote and flexible — work when and where it suits you
- Freelance autonomy with the structure of meaningful, task-based work
- Make a direct, tangible impact on how AI understands and generates software
- Potential for ongoing work and contract extension as new projects launch
Requirements
- Fluent proficiency in English (Written & Verbal)
- Reliable high-speed internet connection
- Bachelor's degree or equivalent professional experience
- Demonstrated expertise in Software Engineering
Eligible Languages
Fluent proficiency in English
Frequently Asked Questions
What is the assessment actually like?
Notoriously strict. Alignerr uses TestGorilla for role-specific timed tests — a blank coding environment for engineers, rigorous grammar and fact-checking for writers. There is almost no hand-holding. The critical catch: this is essentially a one-shot process. Fail or abandon the assessment, and you are typically locked out of that role permanently with no option to retake.
How quickly can I start earning after I pass?
Not immediately. Even after passing the assessment and completing identity verification (via Persona) and billing setup (via Deel), you may sit in a waiting pool for weeks or months. You only start earning when a project matching your specific skills launches and you are officially assigned. Do not plan around Alignerr income until you are actively on a project.
Is there a community?
Yes — and it is one of Alignerr's genuine strengths. Once assigned to a project, you are added to Slack channels where you can ask questions, get rubric clarifications from admins, and talk to other AI trainers. This is rare in AI training and makes a real difference when guidelines are ambiguous or change mid-project.
What does the work actually look like?
It is practical, hands-on data work. You might be recording short videos, categorizing images, rating text responses, or analyzing data. The tasks are designed to be short and distinct—typically 5-60 minutes per task.
How flexible is the schedule?
Extremely. This is true "log in and work" flexibility. You can usually work for 20 minutes or 4 hours depending on your availability. There are rarely minimum hour requirements, making it ideal for side income.
Is there an interview?
Usually, no. Hiring for these roles is almost entirely based on passing an automated assessment or "qualification" task. If you pass the test, you get access to the work.
What is the barrier to entry?
Alignerr is known for difficult technical assessments. You must pass a timed test in your specific domain (e.g., Python, Physics, or Language) before you are eligible for any paid projects.