aitrainer.work - AI Training Jobs Platform

Robotics ML Expert — MuJoCo & Reinforcement Learning

Alignerr Remote Posted 4 days ago

Education: Any
Pay Rate: $125/task
Posted: 4d ago

What You'll Do

  • Design, develop, and iterate on MuJoCo simulation environments for robotics research and AI training
  • Implement and tune reinforcement learning algorithms (PPO, SAC, TD3, etc.) to train agents in simulated tasks
  • Define reward functions, observation spaces, and action spaces that produce robust, transferable policies
  • Debug and optimize physics simulations — contact models, actuator dynamics, and scene configurations
  • Evaluate trained policies for stability, generalization, and sim-to-real transfer potential
  • Document environment specifications, training procedures, and experimental results clearly and thoroughly
  • Collaborate asynchronously with research teams to align simulation work with broader project goals
  • Stay current with the latest advances in robot learning, simulation, and embodied AI

About the Role

What if your expertise in robotics and machine learning could directly shape how the next generation of intelligent agents learns to move, manipulate, and interact with the physical world? We're looking for Robotics ML Experts in Munich's world-class robotics community with hands-on MuJoCo experience to design, build, and refine simulation environments that train AI systems to perform real-world tasks, from locomotion and dexterous manipulation to complex multi-agent coordination. This is a fully remote, flexible contract role for experienced practitioners who live and breathe physics simulation, reinforcement learning, and robot control. If you've spent time wrangling MJCF files, tuning reward functions, and debugging contact dynamics, this role was made for you.

  • Organization: Alignerr
  • Type: Hourly Contract
  • Location: Remote
  • Commitment: 10–40 hours/week

Who You Are

  • Strong hands-on experience with MuJoCo (or MuJoCo via dm_control, Gymnasium/Gymnasium-Robotics, or similar wrappers)
  • Solid understanding of reinforcement learning theory and practical training pipelines
  • Proficient in Python and comfortable with ML frameworks such as PyTorch or JAX
  • Experienced in defining and shaping reward functions for complex robotic tasks
  • Familiar with robot kinematics, dynamics, and control fundamentals
  • Able to read and write MJCF/XML model files and understand their physics implications
  • Self-directed, detail-oriented, and comfortable working independently in an async environment
  • Strong written communicator who can document technical work clearly

Nice to Have

  • Experience with sim-to-real transfer techniques (domain randomization, system identification)
  • Familiarity with other physics simulators — Isaac Gym, PyBullet, Drake, or Genesis
  • Background in multi-agent environments or hierarchical RL
  • Published research or open-source contributions in robotics, RL, or embodied AI
  • Experience with imitation learning, model-based RL, or world models
  • Graduate-level coursework or degree in robotics, ML, computer science, or a related field

Why Join Us

  • Work on cutting-edge robotics and AI simulation projects alongside leading research labs
  • Fully remote and flexible — work when and where it suits you
  • Freelance autonomy with the structure of meaningful, milestone-driven work
  • Directly influence how AI agents learn to interact with the physical world
  • Engage with a global community of top-tier ML and robotics practitioners
  • Potential for ongoing work and contract extension as new projects launch

Requirements

  • Fluent proficiency in English (Written & Verbal)
  • Reliable high-speed internet connection
  • Bachelor's degree or equivalent professional experience
  • Demonstrated expertise in Data Science

Frequently Asked Questions

What is the assessment actually like?

Notoriously strict. Alignerr uses TestGorilla for role-specific timed tests — a blank coding environment for engineers, rigorous grammar and fact-checking for writers. There is almost no hand-holding. The critical catch: this is essentially a one-shot process. Fail or abandon the assessment, and you are typically locked out of that role permanently with no option to retake.

How quickly can I start earning after I pass?

Not immediately. Even after passing the assessment and completing identity verification (via Persona) and billing setup (via Deel), you may sit in a waiting pool for weeks or months. You only start earning when a project matching your specific skills launches and you are officially assigned. Do not plan around Alignerr income until you are actively on a project.

Is there a community?

Yes — and it is one of Alignerr's genuine strengths. Once assigned to a project, you are added to Slack channels where you can ask questions, get rubric clarifications from admins, and talk to other AI trainers. This is rare in AI training and makes a real difference when guidelines are ambiguous or change mid-project.

Is this traditional consulting?

Not exactly. You act as a "Teacher" for advanced AI. Instead of client deliverables, you are given complex scenarios to evaluate. You grade the AI's logic, correct its hallucinations, and provide expert-level reasoning. Your job is to train the model to think like you do.

Why is the pay so high?

This role requires deep, verified expertise. General knowledge isn't enough; the model is specifically being trained on "edge cases"—the rare, difficult, or highly technical nuances that only a senior professional would know.

What is the workload like?

This is cognitive, deep work. Unlike simple data labeling, you might spend 45-60 minutes on a single task, researching citations or verifying complex calculations. Quality is prioritized over speed.

What is the barrier to entry?

Alignerr is known for difficult technical assessments. You must pass a timed test in your specific domain (e.g., Python, Physics, or Language) before you are eligible for any paid projects.