Data Architect
Data Architect/Engineering Lead – ML
AI Drug Discovery Unicorn (Series B)
Remote within US
Competitive base + hefty equity

This start-up is a leader in deep learning & generative AI, merging cutting-edge models with biology research to accelerate disease treatment & drug discovery.

They are seeking an experienced, self-driven Data Architect/Engineering Lead – ML (note: final level determined through the interview process) to join their fast-growing AI/ML team. This Lead will design and develop the data model and lakehouse, working alongside ML architects & scientists to scale data curation and pipelines for ML on a modern tech stack that enables the productization of unique phenotypic, target-agnostic drug discovery platforms.

Responsibilities:
- Design, implement and support data lakehouse infrastructure using a modern cloud-service stack (AWS preferred): Python, S3, Batch, Lambda, EKS, IAM, REST (Redshift, Glue, Athena, ECR, Parquet are a plus)
- Develop ETL jobs and real-time data pipelines to source & curate data from internal and public data sources (experience with biopharma data such as imaging, omics and molecular data is a plus)
- Own the end-to-end data model (for structured and unstructured data) for ML training & inference and statistical analysis

Requirements:
- Bachelor’s degree (Master’s preferred, or equivalent years of industry experience) in computer science, engineering, analytics, mathematics, statistics or equivalent
- Deep expertise in working with large data sets, data visualization, building complex data processes, performance tuning, integrating data from disparate data stores and programmatically identifying patterns to optimize ML utilization
- 5+ years of software development experience working on large-scale, cloud-based services & data environments:
  - Python for data modeling, warehousing and ETL
  - Relational SQL and NoSQL databases (experience with large non-relational DBs/stores such as object, graph and columnar stores is a plus)
  - Automated build processes with CI/CD in the cloud, plus cluster & workload management
  - Adherence to production-environment practices (agile, regression testing, version control)
  - Familiarity with big-data and real-time data governance and ML workflow orchestration (experience with Spark, Databricks, MLflow is a plus)

Bonus qualifications:
- Experience developing software components in a start-up environment
- Experience in biotech and drug discovery
- A self-reliant problem-solver who also excels in teamwork, with strong data-driven, first-principles decision-making and superb communication skills

Remote within the USA
Interested in applying? Please click the “Easy Apply” button or email me your resume at stefani.lukic@storm3.com

Storm3 is a HealthTech recruitment firm with clients across major tech hubs in Europe, APAC and North America. To discuss open opportunities or career options, please visit our website at storm3.com and follow the Storm3 LinkedIn page for the latest jobs and intel.