Job Description
Responsibilities

The Data Platform Global Live team is dedicated to empowering the growth of the TikTok LIVE business through big data. We support our businesses in achieving their missions by building high-quality real-time and offline data warehouses, creating various forms of efficient and data-friendly data assets, and exploring and implementing business-oriented data solutions. We provide stable and reliable data capabilities for the daily operations, analysis, and decision-making of TikTok LIVE features, as well as robust data support to enhance live performance for streamers.

We are looking for talented individuals to join our team in 2027. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite at our Company.

Successful candidates must be able to commit to an onboarding date by end of year 2027. Please state your availability and graduation date clearly in your resume.

We are building a next-generation enterprise knowledge system for the LLM era. Our goal is to enable large language models to understand, access, and operate on enterprise data, including data warehouses, documents, logs, and real-time streams. This role focuses on designing and researching a unified knowledge layer that supports query, reasoning, and execution, integrating RAG, knowledge graphs, and agent-based systems. You will work at the intersection of data infrastructure, AI systems, and knowledge modeling, and help define how AI interacts with enterprise data.

Topic Introduction

This project focuses on building a unified knowledge system for the era of large language models. It aims to enable LLMs to efficiently access and understand both structured and unstructured enterprise data, including data warehouses, documents, logs, and real-time information.
By integrating Retrieval-Augmented Generation (RAG), knowledge graphs, and agent capabilities, the project seeks to develop an intelligent system that supports querying, reasoning, and execution across key business scenarios such as analytics, decision-making, and automation.

Challenges

Deploying LLMs in enterprise scenarios presents several challenges. Heterogeneous data sources are fragmented and lack a unified modeling and access framework. Knowledge updates are often delayed, making it difficult to meet real-time requirements. In addition, LLMs may produce hallucinations due to weak grounding, requiring reliable citation and verification mechanisms. Balancing performance, cost, and latency, while designing a scalable and extensible knowledge integration and orchestration framework, remains a core challenge.

Value

This project provides foundational infrastructure for scaling LLM applications in enterprise environments, significantly improving output accuracy, interpretability, and business usability. By establishing a unified knowledge operation layer, it helps consolidate core data assets and build sustainable competitive advantages. It also accelerates the evolution of AI from conversational tools to data-driven decision agents, laying the groundwork for next-generation data + AI platforms.

What You Will Do

- Research and design unified knowledge representations for enterprise data
- Explore and build RAG-based knowledge systems with high accuracy and low latency
- Develop ontology / semantic layers to bridge data and LLM understanding
- Design knowledge ingestion and update mechanisms (batch + real-time)
- Improve LLM grounding, traceability, and reliability
- Explore agent-based reasoning and execution frameworks
- Prototype and validate new ideas, and bring them into production systems

Qualifications

Minimum Qualifications:

- Individuals who are completing or have recently completed a PhD in Software Development, Computer Science, Computer Engineering, or a related technical discipline.
- Strong programming skills in Python / Java / Scala
- Solid understanding of data systems, data modeling, or distributed systems
- Experience in at least one of the following:
  - Data engineering/backend systems
  - Machine learning/LLM systems
- Strong problem-solving skills and curiosity about new technologies

Preferred Qualifications:

- Experience with LLM, RAG, or vector databases
- Knowledge of knowledge graphs or ontology modeling
- Experience with real-time data processing (Flink, Kafka, etc.)
- Understanding of AI agents or workflow orchestration
- Experience building data platforms or knowledge systems

Job Information

【For Pay Transparency】Compensation Description (Annually)

The base salary range for t