Description:
The work we do at Autodesk touches nearly every person on the planet. By creating software tools for making buildings, machines, and even the latest movies, we influence and empower some of the most creative people in the world. As an AI Research 3D Dataset Creation & Annotation Manager at Autodesk Research, you will work side-by-side with world-class researchers and engineers to build and label datasets to train new ML-powered product features to help our customers imagine, design, and make a better world.
You are passionate about solving problems and building things. You are excited to collaborate with researchers and engineers to bring the ML models they develop into Autodesk products.
You will report to the Senior Director of AI Research in Autodesk Research. We are a global team, located in London, San Francisco, Toronto, and remotely. For this role we support in-person, hybrid, and remote work.
Responsibilities
- Develop specialized datasets to evaluate, fine-tune, and train large 2D & 3D models for Design & Make
- Oversee relationships with labeling vendors and contractors, managing data collection budgets and projects end-to-end
- Drive data collection projects from start to finish by gathering requirements, defining success criteria, and adjusting to the dynamic requirements of AI Research
- Manage teams of AI Data Research Scientists, engineers, and labeling contractors to collect, review, and deliver high-quality data, including 3D data
- Apply a high degree of operational rigor and reporting to be excellent cross-functional partners to our legal, finance, and Trusted AI teams
- Achieve the highest possible data quality while steadily improving efficiency
- Design and write clear labeling and annotation guidelines for human labeling tasks
- Spearhead efforts to develop tooling and process improvements, optimizing for quality, throughput, and usability of datasets for AI training
- Partner with cross-functional teams to proactively identify data requirements of AI Research projects and products
You
- Are an expert in managing data labeling operations, particularly for dynamic and nondeterministic labels
- Are an exceptionally clear communicator who can anticipate the needs of our vendors, researchers, and internal and external partners
- Understand the intricacies of collecting data from human annotators/labelers for training or evaluating AI models
- Enjoy problem-solving, both technically and creatively, using your technical and non-technical skills
- Enjoy getting into the weeds of data and forming bottom-up assessments on edge cases and quality deficiencies
- Are comfortable using a combination of internal & external tools and exhibit exceptional judgment in building repeatable processes
- Are deeply adaptable to projects in varying domains and complexities, and effectively reason about systems and processes for domains even beyond your core expertise
- Have an action-oriented and deeply curious mind
- Are interested in developing new processes and finding new efficiencies by combining VLMs and other automatic labeling techniques with human labeling
- Can be an effective thought partner to our researchers, collaborating on how best to achieve their goals
- Thrive in dynamic environments and can pivot quickly when faced with changing research requirements or edge cases
Minimum Qualifications
- 3+ years of experience managing data collection projects for industry-scale AI model training, fine-tuning, and evaluation
- Experience creating & curating large-scale datasets (10^6+ data points) and solving associated technical challenges
- Experience managing cross-team data curation efforts, including coordinating and managing internal toolchains and software instrumentation for annotation
- Experience managing the people involved in annotation efforts (either in-house or third-party)
- Experience managing budget allocations for large-scale annotation
- Strong AI Research background with data-efficient ML experience
- Experience managing data labeling operations
- Familiarity with tools like AWS, SQL, parallel computing, Apache Spark, or similar for data management
- Experience with experimental design involving human subjects
- Excellent communication skills and the ability to collaborate with diverse teams
- Nice to have: Experience with models fine-tuned on human data (e.g. RLHF)
- Nice to have: The ability to debug and contribute to our annotation tools when appropriate, though this will not be the main focus of the role
- Nice to have: Knowledge of the design, manufacturing, AEC, or media & entertainment industries