Software Engineering & DevOps AI Rater/Evaluator

LILT AI

Ras Al Khaimah, UAE | AED 8,000-20,000/mo | Today

UAE | IT & Technology | Full Time

Skills Required

Excel, DevOps, ERP, Safety, English

Job Description

Overview

LILT is building a global network of domain experts to support high‑quality AI evaluation across training, benchmarking, red‑teaming, and ongoing model monitoring. We are seeking software engineering and DevOps professionals to contribute expert judgment to human‑in‑the‑loop AI evaluation workflows used by leading enterprises and hyperscalers.

This role is designed for professionals who understand how software systems, infrastructure, and development practices work in real production environments and who can apply that expertise to evaluate, assess, and improve multilingual AI systems. Your expertise will directly influence multilingual AI model quality, safety, and deployment readiness.

Track A: Software Engineering & DevOps AI Rater

Raters execute structured evaluation tasks using clearly defined rubrics and instructions.

Responsibilities

- Evaluate AI outputs related to software engineering, DevOps, and infrastructure topics
- Perform structured scoring, comparison, classification, and judgment tasks
- Assess technical correctness, completeness, security implications, and best‑practice alignment
- Identify hallucinations, incorrect code, unsafe recommendations, or misleading system guidance
- Apply domain‑specific engineering and DevOps guidelines consistently across tasks

Ideal Background

- Software engineers, site reliability engineers, DevOps engineers, or platform engineers
- Experience with production systems, CI/CD pipelines, cloud infrastructure, or distributed systems
- Strong attention to detail and comfort working with structured evaluation criteria

Track B: Software Engineering & DevOps AI Evaluator (Senior Track)

Evaluators provide higher‑level technical oversight and help shape how evaluation is performed.

Responsibilities

- Validate and refine evaluation rubrics and edge‑case handling
- Perform adjudication where raters disagree
- Conduct error analysis and qualitative reviews of model behavior
- Partner with LILT research, product, and customer teams on evaluation design
- Support red‑teaming, security review, and model readiness assessments

Ideal Background

- Senior software engineers, DevOps leads, SREs, or technical architects
- Experience defining technical standards, reviewing complex edge cases, or advising on system design and reliability
- Ability to clearly explain nuanced technical reasoning and tradeoffs

Evaluation Focus & Requirements

Types of AI evaluation work, based on project demands:

- Software engineering and infrastructure content evaluation
- Code correctness and reasoning assessment
- DevOps, CI/CD, and cloud architecture evaluation
- Security and reliability‑focused red‑teaming
- Ongoing model monitoring and regression testing

What We Look For

- Deep domain expertise in software engineering, DevOps, or infrastructure
- Strong technical judgment and ability to apply criteria consistently
- Comfort working with structured evaluation workflows
- Ability to explain reasoning clearly, especially in complex or high‑risk technical scenarios
- Reliability, professionalism, and respect for quality standards

Engagement Model

- Contract‑based, flexible participation
- Project‑based work with clear expectations and timelines
- Opportunities for recurring work based on performance and demand
- Compensation communicated upfront per project or task type

Why This Work Matters

Your expertise helps ensure that AI systems:

- Provide accurate and safe technical guidance
- Align with real‑world engineering and DevOps best practices
- Are reliable, secure, and trustworthy across languages

Language Requirements

- Native or professional fluency in one or more supported languages is required
- Supported languages span 30+ global languages
- Language‑specific nuance is assessed through screening and task‑based evaluation, not separate job descriptions
- English fluency is required for guidelines, feedback, and collaboration

LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join our global community, which thrives on innovation and excellence.
Our collective knowledge, uniqueness, and skills deliver multilingual AI and human‑verified services to Enterprises, Governments, and AI Developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community, all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy at https://lilt.com/legal/privacy.

LILT is committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision‑making and help us identify qualified candidates efficiently and objectively. All final hiring de