Machine Learning Evaluation Specialist

Alignerr • Remote

Education: Any
Type:
Pay Rate: $300/task
Listed: 72d ago

About this Role

What You'll Do

Design complex, original machine learning problems rooted in your area of domain expertise
Create evaluation tasks that demand advanced knowledge well beyond standard ML pipelines
Draw from your own research experience to craft challenges that genuinely test highly capable AI models
Write clear problem statements, define evaluation criteria, and establish gold-standard solutions
Assess AI-generated solutions for correctness, creativity, and methodological rigor
Document problem difficulty, required domain knowledge, and expected failure modes
Collaborate asynchronously with a global team of researchers and engineers

About the Role

What if your years of hard-earned research expertise could directly shape the future of AI? We're looking for domain experts with deep machine learning knowledge to design evaluation challenges that push state-of-the-art AI systems to their limits, the kind of problems only a true specialist could craft. Your work won't sit in a drawer. It directly influences how the next generation of AI models is measured, trained, and improved.

Organization: Alignerr
Type: Hourly Contract
Location: Remote
Commitment: 10–40 hours/week

Who You Are

Graduate-level expertise (MS or PhD preferred) in a scientific or technical discipline that intersects with machine learning
Strong working knowledge of ML methods — model selection, feature engineering, evaluation metrics, and pipeline design
Deep familiarity with active, open research problems in your field
A sharp eye for where general ML knowledge breaks down and specialized domain insight becomes essential
Experience publishing or conducting original research is highly valued
Excellent written communication — you can articulate complex, nuanced problems with precision and clarity
Self-motivated and energized by intellectually demanding, independent work

Example Domains

We welcome experts from a wide range of fields, including but not limited to:

Computational biology, genomics, or bioinformatics
Climate science and environmental modeling
Medical imaging and healthcare ML
Materials science and computational chemistry
Astrophysics and signal processing
Natural language processing for low-resource or specialized corpora
Robotics, control theory, or reinforcement learning in complex environments
Financial modeling and quantitative analysis

If your domain sits at the frontier of ML research, we want to hear from you.

Why Join Us

Work at the cutting edge — your challenges help define the boundaries of what AI can and cannot do
Make a real impact — your expertise directly shapes AI safety and evaluation research
Full autonomy — work on your own schedule, from anywhere in the world
Flexible commitment — scale hours up or down based on your availability
Ongoing opportunity — strong contributors are considered for contract extensions and deeper research involvement
Build your profile — establish yourself as a contributor to frontier AI development alongside top research labs

Requirements

Fluent proficiency in English (Written & Verbal)
Reliable high-speed internet connection
Bachelor's degree or equivalent professional experience
Demonstrated expertise in Data Science

Skills & Categories

Explore other opportunities in related specializations:

Data Science Expert

Related Jobs

Statistician — Time-Series Insights Writer (AI Training)

sme_careers • Data Science

$30

Python (ML-Focused) Quality Assurance Lead (QAL)

sme_careers • Data Science

$120

Data Scientist Quality Assurance Lead (QAL)

sme_careers • Data Science

$110

Frequently Asked Questions

What is the assessment actually like?

Notoriously strict. Alignerr uses TestGorilla for role-specific timed tests — a blank coding environment for engineers, rigorous grammar and fact-checking for writers. There is almost no hand-holding. The critical catch: this is essentially a one-shot process. Fail or abandon the assessment, and you are typically locked out of that role permanently with no option to retake.

How quickly can I start earning after I pass?

Not immediately. Even after passing the assessment and completing identity verification (via Persona) and billing setup (via Deel), you may sit in a waiting pool for weeks or months. You only start earning when a project matching your specific skills launches and you are officially assigned. Do not plan around Alignerr income until you are actively on a project.

Is there a community?

Yes — and it is one of Alignerr's genuine strengths. Once assigned to a project, you are added to Slack channels where you can ask questions, get rubric clarifications from admins, and talk to other AI trainers. This is rare in AI training and makes a real difference when guidelines are ambiguous or change mid-project.

Is this traditional consulting?

Not exactly. You act as a "Teacher" for advanced AI. Instead of client deliverables, you are given complex scenarios to evaluate. You grade the AI's logic, correct its hallucinations, and provide expert-level reasoning. Your job is to train the model to think like you do.

Why is the pay so high?

This role requires deep, verified expertise. General knowledge isn't enough; the model is specifically being trained on "edge cases"—the rare, difficult, or highly technical nuances that only a senior professional would know.

What is the workload like?

This is cognitive, deep work. Unlike simple data labeling, you might spend 45-60 minutes on a single task, researching citations or verifying complex calculations. Quality is prioritized over speed.

What is the barrier to entry?

Alignerr is known for difficult technical assessments. You must pass a timed test in your specific domain (e.g., Python, Physics, or Language) before you are eligible for any paid projects.
