aitrainer.work - AI Training Jobs Platform
STEM alignerr

Python Infrastructure Engineer - Model Evaluation

Alignerr • Remote • Posted 0 days ago

Education

Any

Type

Pay Rate

$62.50/task

✅ Applying through this link gives you a verified candidate referral.

Referrals from verified candidates give your profile a visibility boost and help support our platform at no cost to you.

This position is hosted on an external talent platform. Please only apply for this position if it fits your skills and interests.

About this Role

What You'll Do

  • Design, build, and optimize high-performance Python systems supporting AI data pipelines and model evaluation workflows
  • Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
  • Build and maintain evaluation harnesses that integrate with ML inference frameworks
  • Improve reliability, performance, and safety across existing Python codebases
  • Instrument systems with observability and metrics collection to monitor reliability and model performance
  • Identify bottlenecks and edge cases in data and system behavior, and implement scalable fixes
  • Collaborate with data, research, and engineering teams to support model training and evaluation workflows
  • Participate in synchronous design reviews to iterate on architecture and implementation decisions

About the Role

What if your Python expertise could directly shape how the world's most advanced AI models are built, tested, and improved? We're looking for a senior Python engineer to design and build the data pipelines, evaluation harnesses, and annotation tooling that sit at the heart of cutting-edge AI development. This is a fully remote, flexible contract role working alongside leading AI research labs on real production systems. If you're a strong Python engineer who wants to do meaningful, high-impact work at the frontier of AI, this is the role for you.

  • Organization: Alignerr
  • Type: Hourly Contract
  • Location: Remote
  • Commitment: 20–40 hours/week

Who You Are

  • Native or fluent English speaker with clear written and verbal communication skills
  • Full-stack developer with a strong systems programming background
  • 3–5+ years of professional experience writing production-grade Python
  • Experienced building evaluation harnesses for ML models and integrating with inference frameworks
  • Solid background in observability, metrics collection, and monitoring for production systems
  • Self-motivated and reliable, able to commit 20–40 hours per week

Nice to Have

  • Prior experience with data annotation, data quality, or evaluation systems
  • Familiarity with AI/ML workflows, model training, or benchmarking pipelines
  • Experience with distributed systems or developer tooling
  • Background in MLOps or AI infrastructure

Why Join Us

  • Work directly on cutting-edge AI projects alongside leading research labs
  • Fully remote and flexible: structure your work week around your life
  • Freelance autonomy with the depth and consistency of meaningful, long-term technical work
  • Make a tangible impact on how next-generation AI models are evaluated and improved
  • Potential for ongoing work and contract extension as new projects launch

Requirements

  • Fluent proficiency in English (Written & Verbal)
  • Reliable high-speed internet connection
  • Bachelor's degree or equivalent professional experience
  • Demonstrated expertise in STEM

Eligible Languages

Fluent proficiency in English

English

Frequently Asked Questions

What is the assessment actually like?

Notoriously strict. Alignerr uses TestGorilla for role-specific timed tests: a blank coding environment for engineers, rigorous grammar and fact-checking for writers. There is almost no hand-holding. The critical catch: this is essentially a one-shot process. Fail or abandon the assessment, and you are typically locked out of that role permanently with no option to retake.

How quickly can I start earning after I pass?

Not immediately. Even after passing the assessment and completing identity verification (via Persona) and billing setup (via Deel), you may sit in a waiting pool for weeks or months. You only start earning when a project matching your specific skills launches and you are officially assigned. Do not plan around Alignerr income until you are actively on a project.

Is there a community?

Yes, and it is one of Alignerr's genuine strengths. Once assigned to a project, you are added to Slack channels where you can ask questions, get rubric clarifications from admins, and talk to other AI trainers. This is rare in AI training and makes a real difference when guidelines are ambiguous or change mid-project.

Is this just labeling data?

No. This is closer to academic research. You will likely be writing or verifying complex proofs, solving advanced equations, or checking the logic of a model's step-by-step reasoning. The goal is to teach AI systems to reason deeply in your field.

Do I need a PhD?

For the highest pay tiers in this category, a PhD (or current enrollment) is usually expected. However, the most important factor is your ability to pass the domain assessment. If you can solve the problems, the degree is secondary.

Is the work continuous?

Work in niche fields is often project-based. A specific "campaign" (e.g., training a model on Quantum Mechanics) might last for a few weeks. It is best to treat this as a high-paying fellowship or grant rather than a permanent daily job.

What is the barrier to entry?

Alignerr is known for difficult technical assessments. You must pass a timed test in your specific domain (e.g., Python, Physics, or Language) before you are eligible for any paid projects.