AI Glossary: 49 Key Artificial Intelligence Terms

The rubythalib.ai AI Glossary covers the most important artificial intelligence terms with short, clear definitions, from machine learning, LLMs, and computer vision to AI implementation terms for business.

AI Fundamentals

Artificial Intelligence (AI)

Artificial Intelligence is the field of computer science focused on building machines that can perform tasks normally requiring human intelligence, such as recognizing images, understanding language, making decisions, and predicting outcomes from data.

Why it matters: AI is the umbrella over every other term in this glossary. For businesses, it means automating work and making faster, data-driven decisions.

Machine Learning (ML)

Machine Learning is a branch of AI where systems learn patterns from data instead of being programmed with explicit rules. The more relevant data a model sees, the better it generally becomes at recognizing patterns and making predictions.

Why it matters: Most practical enterprise AI, from sales forecasting to anomaly detection, is really an application of Machine Learning.

Deep Learning

Deep Learning is a type of Machine Learning that uses multi-layered artificial neural networks (deep neural networks) to learn complex patterns. It is the backbone of modern Computer Vision and language models.

Why it matters: Nearly every major AI breakthrough, from image recognition to LLMs, is powered by Deep Learning because it handles highly complex data.

Neural Network (Artificial Neural Network)

A Neural Network is a computational model inspired by the brain, made of layers of interconnected nodes (neurons). Each connection has a weight that is adjusted as the model learns from data.

Why it matters: Neural networks are the basic building block behind Deep Learning. Understanding them explains why AI models need lots of data and compute to train.
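
The idea of a weight being adjusted as the model learns can be sketched with a single artificial neuron. This is a minimal illustration, not a real network: it assumes one input, a linear output, squared-error loss, and a hand-picked learning rate. The name train_neuron and the sample data are hypothetical.

```python
def train_neuron(samples, learning_rate=0.1, epochs=200):
    """Fit y ~ weight * x + bias by gradient descent on squared error."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            prediction = weight * x + bias
            error = prediction - y
            # Nudge each parameter in the direction that shrinks the error
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias

# Toy data generated from y = 2x + 1
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
weight, bias = train_neuron(samples)
```

After training, weight and bias end up close to 2 and 1. Real networks repeat the same kind of update across millions of weights, which is why they need so much data and compute.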

Supervised Learning

Supervised Learning is an ML approach where a model is trained on labeled data, meaning data that already has the correct answers. The model learns to map inputs to the right outputs, for example a photo to an object name.

Why it matters: It is the most common approach in business projects: document classification, defect detection, and churn prediction all rely on labeled data.

Unsupervised Learning

Unsupervised Learning is an ML approach where the model finds patterns or groups within data without labeled answers, for example clustering customers by shopping behavior without predefined categories.

Why it matters: It is useful when data has no labels, such as customer segmentation or discovering hidden patterns in transactions.

Training Data

Training Data is the dataset used to train an AI model to recognize patterns. The quality and quantity of this data largely determine the model's accuracy: bad data produces a bad model.

Why it matters: Many AI projects fail not because of the algorithm, but because of training data that is dirty, scarce, or unrepresentative of real-world conditions.

Overfitting

Overfitting happens when a model learns the training data too closely, including its random noise, so it performs well on training data but poorly on new data. The model memorizes rather than understands.

Why it matters: Overfitting is a classic trap: the model looks great in internal testing but fails on real data in production.
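
The gap between training and real-world performance can be shown with a toy comparison. This sketch assumes a made-up dataset with two deliberately wrong labels, a nearest-neighbor "memorizer" as the overfitting model, and a hand-written threshold rule as the simpler model. All names and numbers are illustrative.

```python
def nearest_neighbor_predict(train, x):
    """Memorizing model: copy the label of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def threshold_predict(x):
    """Simple model: one rule that ignores individual noisy points."""
    return 1 if x >= 5 else 0

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

# True rule is "label 1 when x >= 5"; (3, 1) and (7, 0) are noisy labels
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
test = [(1.5, 0), (2.6, 0), (3.4, 0), (6.6, 1), (7.4, 1), (8.5, 1)]

def memorizer(x):
    return nearest_neighbor_predict(train, x)

train_acc_memorizer = accuracy(memorizer, train)
test_acc_memorizer = accuracy(memorizer, test)
train_acc_simple = accuracy(threshold_predict, train)
test_acc_simple = accuracy(threshold_predict, test)
```

The memorizer scores perfectly on the training set because it has also learned the two noisy labels, and those are exactly what mislead it on the new points. The threshold rule scores lower on training data but higher on the test set.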

AI Model

An AI model is the output of the training process: a file containing patterns learned from data, usable to make predictions or decisions on new input, such as guessing the category of an email.

Why it matters: The model is the core asset of an AI solution. Once trained, it needs to be deployed so real applications can use it.

Generative AI

Generative AI is a type of AI that produces new content, such as text, images, code, or audio, based on patterns learned from large amounts of data. Popular examples include ChatGPT, Claude, and Gemini.

Why it matters: Generative AI changes how teams work: drafting content, summarizing documents, and writing code become far faster.

NLP & Large Language Models

Natural Language Processing (NLP)

NLP is the field of AI that enables computers to understand, interpret, and generate human language, both text and speech. Examples include sentiment analysis, translation, and chatbots.

Why it matters: NLP underpins chatbots, customer review analysis, and document automation that many Indonesian companies request.

Large Language Model (LLM)

A Large Language Model is a very large AI model trained on massive amounts of text to understand and generate language. LLMs can answer questions, write, summarize, and reason. The models behind ChatGPT, Claude, and Gemini are examples.

Why it matters: LLMs power today's generative AI wave and underpin enterprise chatbots, internal assistants, and text-work automation.

Token

A token is the smallest piece of text an LLM processes, which can be a word, part of a word, or punctuation. Models read and generate text token by token, and LLM usage cost is usually billed per token.

Why it matters: Understanding tokens matters for managing cost and input-length limits when building LLM-based applications.
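
A rough feel for token counts and per-token cost can be sketched as follows. This is only an approximation: real LLM tokenizers split text into subword units, so the count here (words and punctuation marks) is an assumed proxy, and the price passed in is a made-up figure, not any provider's rate.

```python
import re

def rough_token_count(text):
    """Very rough proxy: count words and punctuation marks separately."""
    return len(re.findall(r"\w+|[^\w\s]", text))

def estimate_cost(text, price_per_1k_tokens):
    """Estimate cost from the rough count; the price is a hypothetical input."""
    return rough_token_count(text) / 1000 * price_per_1k_tokens

token_count = rough_token_count("Hello, world! How are you?")
cost = estimate_cost("Hello, world! How are you?", price_per_1k_tokens=2.0)
```

For precise numbers, use the tokenizer and price list published by the model provider.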

Embedding

An embedding is a representation of text, images, or other data as a list of numbers (a vector) that captures its meaning. Items with similar meaning have nearby vectors, which lets computers compare them by similarity.

Why it matters: Embeddings are the foundation of semantic search and RAG, letting systems find relevant documents by meaning rather than keywords.
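
Comparing vectors for closeness is commonly done with cosine similarity. The sketch below assumes tiny hand-made three-dimensional vectors purely for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors, not output from a real embedding model
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "invoice": [0.1, 0.2, 0.95],
}
sim_close = cosine_similarity(embeddings["cat"], embeddings["kitten"])
sim_far = cosine_similarity(embeddings["cat"], embeddings["invoice"])
```

Here sim_close is much higher than sim_far, which is the property semantic search relies on.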

Prompt

A prompt is the instruction or question given to a generative AI model to produce a response. Prompt quality strongly affects output quality: clear, specific prompts yield better answers.

Why it matters: Writing good prompts is a core skill for using LLMs effectively, both for individuals and for enterprise automation.

Prompt Engineering

Prompt Engineering is the practice of designing and refining prompts so an AI model produces accurate, consistent, and fit-for-purpose output. Techniques include giving examples, context, a role, and a desired answer format.

Why it matters: Good prompt engineering can make the same LLM deliver far higher quality with no extra training cost.
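
The techniques listed above (a role, context, examples, and a desired answer format) can be combined in a small template function. This is one possible layout, not a standard; build_prompt and the ticket-classification scenario are hypothetical.

```python
def build_prompt(role, context, examples, question, answer_format):
    """Assemble a prompt from a role, context, examples, and an answer format."""
    lines = [f"You are {role}.", "", "Context:", context, "", "Examples:"]
    for sample_input, sample_output in examples:
        lines.append(f"Input: {sample_input}")
        lines.append(f"Output: {sample_output}")
    lines += ["", f"Answer format: {answer_format}", "", f"Input: {question}", "Output:"]
    return "\n".join(lines)

prompt = build_prompt(
    role="a support ticket classifier",
    context="Tickets belong to one of: billing, technical, other.",
    examples=[
        ("I was charged twice", "billing"),
        ("The app crashes on login", "technical"),
    ],
    question="My invoice shows the wrong amount",
    answer_format="a single lowercase category name",
)
```

Keeping the template in code makes it easy to test variations and to reuse the same structure across requests.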

RAG (Retrieval-Augmented Generation)

RAG is a technique that combines an LLM with an external knowledge base. Before answering, the system first retrieves relevant documents, then composes its answer from them, making responses more accurate and able to cite sources.

Why it matters: RAG is the foundation of enterprise chatbots that answer from internal documents such as SOPs, product catalogs, or regulations, without retraining the model.
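
The retrieve-then-compose flow can be sketched without any external service. This example makes two loud simplifications: retrieval uses plain word overlap instead of embeddings, and the final LLM call is left out, so it only shows how the prompt is assembled. The documents, source names, and function names are invented for illustration.

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by how many question words they share (toy scoring)."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc["text"].lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def compose_prompt(question, retrieved):
    """Place the retrieved passages, tagged with their sources, before the question."""
    sources = "\n".join(f"[{doc['source']}] {doc['text']}" for doc in retrieved)
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {question}"

# Made-up knowledge base entries
documents = [
    {"source": "leave-policy", "text": "Employees receive 12 days of annual leave"},
    {"source": "expense-policy", "text": "Travel expenses need a manager approval"},
]
question = "How many days of annual leave do employees receive"
top_documents = retrieve(question, documents)
prompt = compose_prompt(question, top_documents)
```

The resulting prompt would then be sent to an LLM. Because each passage carries its source tag, the answer can cite where it came from.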

Fine-tuning

Fine-tuning is the process of retraining an existing model with additional, specific data so it better fits a particular task or domain, such as a company's tone of voice or an industry's terminology.

Why it matters: Fine-tuning helps when prompting and RAG are not enough, but it costs more, so RAG is usually tried first.

Hallucination

Hallucination is when a generative AI model produces information that sounds convincing but is actually wrong or fabricated. The model answers confidently even when the facts are incorrect.

Why it matters: Hallucination is a key risk of LLMs for business use. Techniques such as RAG and source verification are used to reduce it.

Context Window

A context window is the maximum amount of text (measured in tokens) an LLM can process at once, including both prompt and answer. Exceeding the limit can cause the earliest information to be cut off or forgotten.

Why it matters: The context window determines how much document or conversation history can be fed at once, which is important when designing LLM applications.
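
One common way to stay inside the limit is to keep only the most recent messages that fit a token budget. The sketch below assumes a whitespace word count as a stand-in for a real tokenizer; fit_to_window is a hypothetical helper, not part of any LLM library.

```python
def fit_to_window(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined size fits the budget."""
    kept, used = [], 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))

history = [
    "hello there",
    "please summarize the attached report",
    "make it shorter",
]
window = fit_to_window(history, max_tokens=8)
```

With a budget of 8, the two newest messages fit and the oldest one is dropped, which mirrors how early information gets cut off.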

Transformer

The Transformer is a neural network architecture underlying nearly all modern LLMs. Its key innovation, the attention mechanism, lets the model weigh which parts of the input are most relevant when processing language.

Why it matters: The Transformer is the breakthrough that enabled LLMs like GPT and Claude; the term appears in almost every modern AI technical discussion.
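
The weighing of input parts can be illustrated with scaled dot-product attention for a single query. This is a simplified sketch: it assumes one attention head, tiny hand-made vectors, and none of the learned projection matrices that a real Transformer applies.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors
    width = len(values[0])
    output = [sum(w * value[i] for w, value in zip(weights, values)) for i in range(width)]
    return weights, output

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([2.0, 0.0], keys, values)
```

Keys that point in the same direction as the query receive larger weights, so their values contribute more to the output.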

AI Agent

An AI Agent is an LLM-based system that can take steps autonomously to complete a task, such as calling tools, searching for information, and executing actions, rather than just answering a single question.

Why it matters: Agents are the next direction of AI automation: not just answering, but carrying out workflows such as replying to emails or managing data.

Vector Database

A Vector Database is a type of database that stores embeddings (vectors) and is designed to quickly find the data most similar in meaning. Examples include Pinecone, Weaviate, and pgvector (an extension for PostgreSQL).

Why it matters: Vector databases are a key component of RAG and semantic search; without them, finding relevant documents in a large collection is slow.
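
The core operation, storing vectors and returning the closest ones, can be sketched in memory. TinyVectorStore is a hypothetical class that compares the query against every stored vector; production vector databases instead use specialized indexes to stay fast at scale. The vectors are hand-made toy values.

```python
import math

class TinyVectorStore:
    """In-memory stand-in for a vector database (brute-force search)."""

    def __init__(self):
        self._items = []  # list of (item_id, vector) pairs

    def add(self, item_id, vector):
        self._items.append((item_id, vector))

    def search(self, query, top_k=2):
        """Return the ids of the top_k vectors closest to the query."""
        ranked = sorted(self._items, key=lambda item: math.dist(query, item[1]))
        return [item_id for item_id, _ in ranked[:top_k]]

store = TinyVectorStore()
store.add("refund-policy", [0.9, 0.1])
store.add("shipping-times", [0.2, 0.8])
store.add("return-form", [0.8, 0.3])
results = store.search([1.0, 0.0], top_k=2)
```

The search returns the two ids whose vectors lie nearest the query, ordered from closest to farthest.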

Sentiment Analysis

Sentiment Analysis is an NLP technique for measuring whether a text is positive, negative, or neutral. It is often used to analyze product reviews, social media comments, and customer feedback.

Why it matters: It helps companies monitor brand perception and automatically surface customer complaints at scale.
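
The simplest form of the idea is a word-list approach. The sketch below assumes two tiny hand-made word lists and is far cruder than the trained models used in practice; it is only meant to show the positive, negative, or neutral output.

```python
# Hand-made word lists for illustration only
POSITIVE_WORDS = {"good", "great", "fast", "love", "excellent"}
NEGATIVE_WORDS = {"bad", "slow", "broken", "hate", "late"}

def classify_sentiment(text):
    """Label text by counting words found in each word list."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"

labels = [
    classify_sentiment("Great product, fast delivery!"),
    classify_sentiment("Arrived late and broken."),
    classify_sentiment("It is a phone."),
]
```

A word list cannot handle negation or sarcasm, which is one reason real systems use trained NLP models instead.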

Computer Vision

Computer Vision is the field of AI that enables computers to understand the content of images and video, such as recognizing objects, reading text, or detecting motion. It digitally mimics human sight.

Why it matters: Computer Vision is widely used in Indonesian manufacturing, mining, and retail for quality inspection, workplace safety, and visitor analytics.

Object Detection

Object Detection is a Computer Vision technique for locating and marking specific objects in an image or video with bounding boxes, while also recognizing each object's type.

Why it matters: It is the core of many real solutions: counting people, detecting safety PPE, monitoring vehicles, or watching restricted areas.
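
Bounding boxes are commonly compared using intersection over union (IoU), for example to check how well a predicted box matches a reference box. The sketch assumes boxes given as (x1, y1, x2, y2) corner coordinates; the function name and sample boxes are illustrative.

```python
def intersection_over_union(box_a, box_b):
    """Overlap area divided by combined area for two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    overlap_width = max(0, min(ax2, bx2) - max(ax1, bx1))
    overlap_height = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = overlap_width * overlap_height
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union else 0.0

predicted = (0, 0, 10, 10)
reference = (5, 5, 15, 15)
score = intersection_over_union(predicted, reference)
```

A score of 1.0 means the boxes coincide, and 0.0 means they do not overlap at all.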

Image Segmentation

Image Segmentation is a Computer Vision technique that groups image pixels by object or region, letting a model mark the precise shape of objects rather than just a bounding box.

Why it matters: It is used when high precision is needed, such as measuring the area of a defect on a product or mapping regions from drone imagery.
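
Measuring a defect's area from a segmentation result can be sketched as a pixel count. This assumes the model output is a binary mask (1 for defect pixels, 0 elsewhere) and that the physical size of a pixel is known; the mask and the 0.5 mm scale are made-up values.

```python
def mask_area_mm2(mask, mm_per_pixel):
    """Area covered by mask pixels set to 1, in square millimetres."""
    pixel_count = sum(sum(row) for row in mask)
    return pixel_count * mm_per_pixel ** 2

# Toy 4x4 mask; a real mask has the same resolution as the image
defect_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
area = mask_area_mm2(defect_mask, mm_per_pixel=0.5)
```

A bounding box around the same defect would cover 6 pixels, while the mask counts only the 5 that belong to it.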

OCR (Optical Character Recognition)

OCR is technology that converts text in an image or scanned document into editable, searchable digital text. Examples include reading ID cards, invoices, license plates, or forms.

Why it matters: OCR is one of the fastest-ROI AI use cases in Indonesia: automating data entry from ID cards, invoices, and physical documents.

YOLO (You Only Look Once)

YOLO is a family of object detection models known for speed: they detect objects in a single pass over an image. This makes them well suited to real-time detection on video and CCTV cameras.

Why it matters: YOLO is often chosen for real-time solutions such as people counters and safety detection because it is light and fast.

Image Classification

Image Classification is a Computer Vision task that assigns a category to an entire image, for example deciding whether a product photo is defective or normal.

Why it matters: It is one of the simplest, fastest-to-deploy use cases and is often the starting point for Computer Vision projects in factories.

Facial Recognition

Facial Recognition is Computer Vision technology that identifies or verifies a person's identity from their face. It is used for attendance, security access, and identity verification.

Why it matters: It is widely requested for employee attendance and access control, but requires care around privacy and personal data compliance.

Anomaly Detection

Anomaly Detection is an AI technique for finding data or events that deviate from normal patterns, such as product defects, suspicious transactions, or unusual behavior in CCTV footage.

Why it matters: It is highly valuable for quality control, fraud detection, and security, because it catches problems hard for humans to spot manually.
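
A basic statistical version of this idea flags values that sit far from the average. The sketch assumes a z-score rule with a threshold of 2 standard deviations and uses invented sensor readings; production systems often use more robust methods.

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [value for value in values if abs(value - mean) / spread > threshold]

# Made-up readings with one obvious outlier
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 25.0, 10.1]
anomalies = find_anomalies(readings)
```

Only the reading of 25.0 is flagged; the rest stay within the normal band.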

MLOps & Engineering

MLOps

MLOps is the set of practices and tools for managing the AI model lifecycle in production: training, deploying, monitoring, and updating models reliably and continuously, much like DevOps for Machine Learning.

Why it matters: Without MLOps, AI models often stall at the experiment stage; MLOps is what makes models actually run stably in production.

Inference

Inference is the process of using a trained model to produce predictions on new data. If training is learning, inference is applying what was learned when the model is put to use.

Why it matters: Inference cost and speed determine the economic viability of an AI solution, especially for applications serving many users.

Model Deployment

Model Deployment is the process of putting a trained AI model into a real environment so applications or users can access it, usually via an API or an edge device.

Why it matters: Deployment is the bridge between experiment and business impact; without it, a model is just a research project that creates no value.

Latency

Latency is the time an AI system takes to respond to a request, from input arriving to answer returning. Low latency means fast responses, important for real-time applications.

Why it matters: High latency makes an app feel slow even when the model is accurate; it is often decisive for AI product adoption.
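
Latency can be measured by timing each request from start to finish. In this sketch, fake_model is a placeholder that sleeps for about 10 ms in place of real inference, and measure_latency is a hypothetical helper built on the standard library timer.

```python
import time

def measure_latency(handler, request, runs=5):
    """Return per-call latencies in milliseconds for a callable."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        handler(request)
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies

def fake_model(request):
    time.sleep(0.01)  # stand-in for real inference work
    return "ok"

latencies = measure_latency(fake_model, "hello")
slowest_ms = max(latencies)
```

Looking at the slowest calls, not just the average, is useful because users notice the occasional slow response.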

Quantization

Quantization is a technique for compressing an AI model by lowering the numerical precision of the values inside it, making the model smaller and faster, usually with only a small drop in accuracy. It is useful for running models on resource-constrained devices.

Why it matters: Quantization lets models run on edge devices or at lower compute cost, which matters for cost efficiency.
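
Lowering precision can be illustrated by mapping floating-point weights to 8-bit integers and back. This sketch assumes symmetric quantization with a single shared scale, which is only one of several schemes in use; the weights are made-up values.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    largest = max(abs(w) for w in weights)
    scale = largest / 127 if largest else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now fits in one byte instead of four. The cost is a rounding error of at most half the scale per weight, visible here in the smallest weight being rounded to zero.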

Data Pipeline

A Data Pipeline is a series of automated steps to collect, clean, transform, and move data from its sources into an AI or analytics system. A good pipeline keeps data ready-to-use and consistent.

Why it matters: Most of the effort in AI projects is actually in the data pipeline; clean, reliable data is a prerequisite for a working model.
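
The collect, clean, and transform steps can be sketched as small functions chained together. The records and field names below are invented, and collect returns hard-coded rows where a real pipeline would read from files, databases, or APIs.

```python
def collect():
    """Stand-in for reading from a real source such as a file or database."""
    return [
        {"customer": " Alice ", "amount": "120.50"},
        {"customer": "bob", "amount": ""},  # missing amount
        {"customer": "Carol", "amount": "80"},
    ]

def clean(rows):
    """Drop rows that have no amount."""
    return [row for row in rows if row["amount"].strip()]

def transform(rows):
    """Normalize names and convert amounts to numbers."""
    return [
        {"customer": row["customer"].strip().title(), "amount": float(row["amount"])}
        for row in rows
    ]

def run_pipeline():
    return transform(clean(collect()))

rows = run_pipeline()
```

Keeping each step as its own function makes it easier to test and to rerun the pipeline automatically when new data arrives.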

API (Application Programming Interface)

An API is an interface that lets one application call another application's services. In AI, models are usually served via an API so other applications can send data and receive predictions.

Why it matters: APIs are the most common way to integrate AI into existing systems, such as connecting a model to a company's internal applications.

Edge Computing / Edge AI

Edge AI means running AI models directly on devices near the data source, such as cameras, sensors, or machines, rather than in the cloud. This reduces latency and dependence on internet connectivity.

Why it matters: It matters for Indonesian factories and mines with limited connectivity, or when instant responses are needed without sending data to the cloud.

GPU (Graphics Processing Unit)

A GPU is a processor designed to compute many operations in parallel. Because Deep Learning requires massive computation, GPUs are the primary hardware for training and running AI models quickly.

Why it matters: GPU availability and cost are often a major consideration when budgeting AI projects, especially for training large models.

Business & Implementation

AI Use Case

An AI Use Case is a specific application of AI to solve one real business problem, such as detecting product defects, forecasting demand, or automating customer service.

Why it matters: Choosing the right use case, one that is high-value and technically feasible, is the most decisive first step toward successful AI adoption.

Proof of Concept (PoC)

A Proof of Concept is an early, small-scale version of an AI solution built to prove the idea is technically feasible and valuable before investing fully in production development.

Why it matters: A PoC reduces risk: a company can test whether AI truly delivers results before committing a large budget.

AI ROI (Return on Investment)

AI ROI measures the value an AI initiative produces relative to its cost, whether as cost savings, revenue gains, or time efficiency. It is calculated to judge whether the AI investment is worthwhile.

Why it matters: Management evaluates AI projects by ROI; use cases with clear, fast ROI are easier to approve and scale.
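
The standard ROI formula, net gain divided by cost, can be written as a short calculation. The figures below are placeholders in arbitrary currency units, and payback_months is an added convenience that assumes the benefit arrives evenly each month.

```python
def roi(total_benefit, total_cost):
    """Return on investment as a fraction: (benefit - cost) / cost."""
    return (total_benefit - total_cost) / total_cost

def payback_months(total_cost, monthly_benefit):
    """Months needed for cumulative benefit to cover the cost."""
    return total_cost / monthly_benefit

# Placeholder figures, not taken from any real project
project_roi = roi(total_benefit=150, total_cost=100)
months_to_payback = payback_months(total_cost=100, monthly_benefit=25)
```

An ROI of 0.5 means the initiative returned 50% more than it cost.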

Predictive Maintenance

Predictive Maintenance uses AI to predict when a machine or piece of equipment is likely to fail, based on sensor data and history, so maintenance can be done just before a breakdown occurs.

Why it matters: It is highly valuable in Indonesian manufacturing and mining because it reduces unexpected downtime and emergency repair costs.

AI Chatbot

An AI Chatbot is a program that can interact with users through text or voice conversation automatically. Modern chatbots use LLMs and are often combined with RAG to answer from a company's knowledge base.

Why it matters: It is one of the most requested use cases for customer service and internal assistants because it saves time and is available 24/7.

Automation

AI-based Automation is the use of AI to perform repetitive tasks without human intervention, such as sorting emails, extracting data from documents, or classifying support tickets.

Why it matters: Automation is the fastest source of cost savings from AI: freeing staff from repetitive work to focus on high-value tasks.

AI Readiness

AI Readiness is how prepared an organization is to adopt AI, judged by data quality, infrastructure, team skills, and clarity of use cases. Assessing it helps define a realistic first step.

Why it matters: Assessing readiness early prevents AI projects from failing because data or teams are not ready; foundation first, then advanced solutions.

Digital Transformation

Digital Transformation is the process of changing how an organization works and its business model by leveraging digital technology, including AI and data, to improve efficiency and create new value.

Why it matters: AI is one of the main engines of digital transformation today; many Indonesian companies make AI adoption part of their transformation roadmap.