IT Data Platform Engineer

Ho Chi Minh City
Posted on 19/03/2026

IT Data Platform Engineer: Job Details

KEY RESPONSIBILITIES

ML Data Platform Engineering & Administration

  • Install, configure, and maintain ML data platforms on top of Kubernetes, Object Storage, Cassandra, Postgres and related technologies
  • Monitor platform performance and optimize as needed for reliability and efficiency

Platform Configuration and Maintenance

  • Implement and manage platform configurations, ensuring adherence to best practices and security standards
  • Regularly update and patch systems to maintain security and stability

Collaborate with Cross-Functional Teams

  • Work closely with ML and data engineers and other roles to align IT needs and strategies
  • Provide expert guidance on ML data platform best practices and optimization

Troubleshoot and Resolve Technical Issues

  • Identify, diagnose, and resolve data platform problems in a timely manner
  • Escalate complex issues to upper-level support when necessary

Backup and Recovery Management

  • Implement and maintain backup and recovery strategies for data platforms, ensuring data integrity and availability

Maintain and Update Documentation

  • Create and maintain documentation related to data platform administration, configuration, and maintenance
  • Share knowledge with team members and contribute to a culture of continuous learning

Enhance Data Security and Compliance

  • Ensure data platforms adhere to security best practices and comply with relevant regulations
  • Stay up to date on industry trends and evolving security standards

Drive Continuous Improvement

  • Evaluate and implement new technologies and techniques to enhance data platform performance and administration
  • Proactively identify areas for improvement, prepare implementation plans, and secure support from management and development teams
  • Always prefer automation and a code-first approach over hard-to-reproduce manual tasks

IT Data Platform Engineer: Requirements

REQUIREMENTS

  • A degree in computer science, software engineering, information technology or related fields is preferred
  • Production experience with AI agent frameworks (such as LangChain, Agno) and the LLM stack (such as KServe, vLLM, pgvector)
  • At least 3 years of experience with data platforms (Kafka, Spark, Airflow, Flink, KServe, MLflow, Lakekeeper) and with related infrastructure installation, administration, patching, and automation (Linux, Ansible, Helm, Terraform, Kubernetes, Ceph)
  • Continuous improvement mindset covering daily operations, stability, reliability and performance of ML data platforms
  • Knowledge of key concepts such as infrastructure-as-code, templates, playbooks, code versioning with Git, CI/CD automation, high availability, and disaster recovery
  • Proficient in Linux shell and Python scripting, and in configuration using JSON and YAML files
  • Ability to troubleshoot infrastructure, analyze logs, set up and update monitoring dashboards and metrics, describe and document root causes, and attend retrospective meetings
  • Understand IT systems documentation, data flow diagrams, integration diagrams and related terminology (UML)
  • Understand major ML data platform concepts: LLM, RAG, agents, quantization, fine-tuning, Data Lake, ACID, Distributed Query, Feature Store, Data Governance, Data Catalogue, Streaming, Batch, Parquet
  • Proficient in using documentation tools (Markdown, Visio, Office 365) and communication tools (MS Teams)
  • Proficient user of change management and support ticket/service desk tools (JIRA SD)


COMPENSATION & BENEFITS

  • Fixed 13th-Month Salary and KPI Bonus
  • Premium Health Care program
  • 24/7 Accident Insurance
  • 100% Social Insurance
  • Meal + Phone Allowance
  • 15 Days of Annual Leave
  • Yearly Medical Checkup
  • Professional and Transparent Working Environment
  • Work with the Latest Financial Technology in the World

If you are referred for this position by our Employee/Recruitment Collaborator, please apply via this LINK.

If you are an internal candidate, please apply via this LINK.

Otherwise, please click the Apply button below to submit your application.