Glossary of Machine Learning and AI Terms

Feel free to send me your thoughts, notes, and other references. You can find my contact details on LinkedIn.

Total terms in document: 53

Index

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A

Autoencoders

Autoencoders are neural network architectures used for unsupervised learning. They are designed to compress input data into a lower-dimensional latent space (encoding) and then reconstruct the original data from this compressed representation (decoding). The primary objective is to minimize the reconstruction error, typically measured as the difference between input and output.

Applications:

  1. Dimensionality Reduction: Acts as a non-linear alternative to Principal Component Analysis (PCA).
  2. Feature Learning: Learns compact and meaningful representations of data.
  3. Denoising: Removes noise from corrupted inputs by learning to reconstruct the clean originals.
  4. Anomaly Detection: Identifies unusual patterns by observing high reconstruction errors.

Variants:

  • Sparse Autoencoders: Encourage sparsity in the hidden units to create compressed and interpretable features.
  • Denoising Autoencoders: Add noise to inputs during training, forcing the network to learn robust features.
  • Convolutional Autoencoders: Specialize in image data, leveraging convolutional layers for spatial feature learning.
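
A minimal sketch of the encode-compress-decode idea, assuming PyTorch; the layer sizes (784 inputs, a 32-dimensional latent space) are illustrative only:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: compress the input into a lower-dimensional latent code.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: reconstruct the input from the latent code.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.rand(16, 784)           # a batch of 16 flattened 28x28 inputs
loss = nn.MSELoss()(model(x), x)  # the reconstruction error to minimize
```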



Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) are computational processing systems inspired by biological nervous systems (e.g., the human brain).


B

Base model

The original, foundational version of a large language model, which has not been fine-tuned.


BERT

Bidirectional Encoder Representations from Transformers. Developed by Google in 2018.
BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).
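
A minimal sketch of that fine-tuning setup, assuming the Hugging Face Transformers library: a pre-trained BERT is loaded with a fresh classification head on top.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Adds a randomly initialized classification layer on top of pre-trained BERT.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

inputs = tokenizer("BERT is conceptually simple.", return_tensors="pt")
print(model(**inputs).logits.shape)  # torch.Size([1, 2]): one score per class
```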



BERTScore

  • Description: Leverages contextual embeddings from BERT to evaluate semantic similarity between the generated and reference text.
  • Purpose: Captures semantic similarity more effectively than traditional n-gram-based metrics.
  • BERTScore (Precision): Range: 0 to 1. Measures how much of the generated text aligns with the reference. Example values: 0.7 (moderate), 0.85 (good), 0.95 (excellent)
  • BERTScore (Recall): Range: 0 to 1. Measures how much of the reference text is captured in the generated output. Example values: 0.6 (moderate), 0.8 (good), 0.9 (excellent)
  • BERTScore (F1): Range: 0 to 1. Harmonic mean of precision and recall; an overall measure of similarity. Example values: 0.65 (moderate), 0.82 (good), 0.9 (excellent)
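
A minimal usage sketch, assuming the bert-score Python package (pip install bert-score); the example sentences are illustrative:

```python
from bert_score import score

candidates = ["The cat sat on the mat."]
references = ["A cat was sitting on the mat."]

# Returns precision, recall, and F1 tensors, one value per candidate/reference pair.
P, R, F1 = score(candidates, references, lang="en")
print(f"P={P.mean():.3f}  R={R.mean():.3f}  F1={F1.mean():.3f}")
```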

Bias

The disproportionate favor or prejudice towards a specific item or group. AI algorithms may inherit biases from historical data or human trainers, risking perpetuation of these biases in predictions.


BLEU

BLEU (Bilingual Evaluation Understudy) is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another. Quality is considered to be the correspondence between a machine’s output and that of a human: “the closer a machine translation is to a professional human translation, the better it is” – this is the central idea behind BLEU.
BLEU was one of the first metrics to claim a high correlation with human judgements of quality, and remains one of the most popular automated and inexpensive metrics.


BLEU score

  • BLEU scores are calculated for individual translated segments (generally sentences) by comparing them with a set of good-quality reference translations. Those scores are then averaged over the whole corpus to estimate the translation’s overall quality. Neither intelligibility nor grammatical correctness is taken into account.
  • The BLEU algorithm compares consecutive phrases (n-grams) of the automatic translation with the consecutive phrases it finds in the reference translations and counts the number of matches in a weighted, position-independent fashion. A higher number of matches indicates greater similarity to the reference translation and a higher score.
    Read article “How to Compute BLEU Score” on geeksforgeeks.org
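
A minimal sketch of sentence-level BLEU, assuming NLTK (pip install nltk); smoothing is used because short sentences often have no higher-order n-gram matches:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "is", "on", "the", "mat"]]  # one or more reference token lists
candidate = ["the", "cat", "sat", "on", "the", "mat"]

bleu = sentence_bleu(reference, candidate,
                     smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {bleu:.3f}")
```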

Bokeh

Bokeh is an interactive visualization library for Python that enables the creation of dynamic and visually appealing plots, dashboards, and data applications. It provides high-performance tools to generate interactive charts, including scatter plots, bar charts, line graphs, and more, with support for handling large datasets. Bokeh seamlessly integrates with Pandas, NumPy, and other data manipulation libraries, making it a powerful choice for creating web-ready visualizations.
Example: the Bokeh “Movie explorer” demo application.

Key References:
Bokeh Homepage
Bokeh Official Documentation
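
A minimal sketch of an interactive Bokeh line chart (pip install bokeh); the data points are illustrative:

```python
from bokeh.plotting import figure, show

p = figure(title="Example line chart", x_axis_label="x", y_axis_label="y")
p.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], line_width=2)
show(p)  # renders the interactive plot in a browser
```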

C

Convolutional layer

A core component of CNNs that processes input data using filters (kernels) to produce feature maps, identifying patterns or features.


Convolutional Neural Network (CNN)

  1. CNN - a type of neural network specialized for analyzing visual data, learning features via filter optimization.
  2. CNN - similar to ANNs but optimized for image data, with layers designed for feature extraction and classification.
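
A minimal sketch of a small image classifier, assuming PyTorch; it also illustrates the convolutional layer, pooling layer, and zero-padding entries in this glossary (layer sizes are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolutional layer; padding=1 is zero-padding
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling layer: halves the feature-map size
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # classify into 10 classes
)

x = torch.rand(8, 1, 28, 28)  # a batch of 8 grayscale 28x28 images
print(model(x).shape)         # torch.Size([8, 10])
```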



D

Dataset

The training data used to teach an LLM patterns and relationships.


F

False positive

An incorrect prediction where a model identifies a condition or class that is not present.


F1 score

A performance metric for classification models combining precision and recall into a single value ranging from 0 (poor) to 1 (excellent).
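
A minimal sketch with scikit-learn; the labels are illustrative:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]  # actual labels
y_pred = [1, 0, 0, 1, 1, 1]  # model predictions

print(precision_score(y_true, y_pred))  # TP / (TP + FP) = 0.75
print(recall_score(y_true, y_pred))     # TP / (TP + FN) = 0.75
print(f1_score(y_true, y_pred))         # 2*P*R / (P + R) = 0.75
```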


Few-shot

Using a small number of examples to guide the model in performing a new task.


Fine-tuning

Adapting a pre-trained model to a specific task or domain by training it on a smaller, specialized dataset.


G

Generative AI

AI systems capable of creating new content, such as text, images, or audio.


H

Hallucination

When an LLM generates plausible but factually incorrect or nonsensical information.


HoloViz

HoloViz - high-level tools to simplify visualization in Python.

HoloViz provides:

  • High-level tools that make it easier to apply Python plotting libraries to your data.
  • A comprehensive tutorial showing how to use the available tools together to do a wide range of different tasks.
    HoloViz homepage

I

Inference

The process of using a trained model to make predictions or generate outputs.


L

LanguageTool

LanguageTool is an AI-based grammar checker that detects grammatical errors and spelling mistakes across multiple languages.
LanguageTool home page

LCS

LCS = Longest Common Subsequence
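
A minimal sketch of the classic dynamic-programming computation of the LCS length (used, for example, by ROUGE-L):

```python
def lcs_length(a: str, b: str) -> int:
    # dp[i][j] = length of the LCS of a[:i] and b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            if ca == cb:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

print(lcs_length("ABCBDAB", "BDCABA"))  # 4 (e.g., "BCAB")
```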


LoRA

Low-Rank Adaptation, a fine-tuning method that requires far fewer computational resources than full fine-tuning: instead of updating all model weights, it trains small low-rank matrices that are added to the existing weights.


M

Matplotlib

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It offers a versatile set of tools to plot data in various formats, ranging from simple line graphs to complex 3D plots. Widely used in scientific computing, data analysis, and engineering, Matplotlib provides extensive customization options for precise control over visual outputs.
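
A minimal line-plot sketch; the data values are illustrative:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.plot(x, y, marker="o", label="y = x^2")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
```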

Key References:
Matplotlib Homepage
Matplotlib Official user guide documentation


Machine Learning Operations (MLOps)

The process of deploying, monitoring, and updating machine learning models in production environments.


METEOR

METEOR = Metric for Evaluation of Translation with Explicit ORdering

  • Description: Evaluates semantic similarity by considering unigram overlaps, stemming, synonyms, and paraphrasing.
  • Score Parameter: “METEOR Score”, ref. METEOR-Score
  • Purpose: Provides a balanced metric for machine translation and summarization tasks.

METEOR Score

  • Metric: METEOR
  • Range: 0 to 1
  • Interpretation: Combines precision and recall, with semantic similarity (e.g., synonyms, stems)
  • Example Values: Higher scores indicate better matches. 0.3 (low), 0.6 (moderate), 0.85 (high)
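
A minimal sketch, assuming NLTK (which also needs its WordNet data: nltk.download("wordnet")); recent NLTK versions expect pre-tokenized input:

```python
from nltk.translate.meteor_score import meteor_score

reference = ["the", "cat", "sat", "on", "the", "mat"]
hypothesis = ["a", "cat", "was", "sitting", "on", "the", "mat"]

# Scores unigram matches with support for stems and WordNet synonyms.
print(meteor_score([reference], hypothesis))
```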

MLOps

Short for Machine Learning Operations.


N

Named Entity Recognition

Named Entity Recognition = NER

  • Named Entity Recognition seeks to extract substrings within a text that name real-world objects and to determine their type (for example, whether they refer to persons or organizations).
    Paper: A survey on recent advances in Named Entity Recognition, 2024.
  • Named Entity Recognition (NER) is a sub-task of information extraction in Natural Language Processing (NLP) that classifies named entities into predefined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and more.
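
A minimal sketch, assuming spaCy and its small English model (pip install spacy; python -m spacy download en_core_web_sm):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g., "Apple" ORG, "U.K." GPE, "$1 billion" MONEY
```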



Neural Process Family

Neural Process Family = NPF


P

Pandas

Pandas is a powerful data manipulation and analysis library for Python. It provides data structures like DataFrame and Series for efficiently handling structured data, enabling operations such as filtering, transformation, and aggregation. Widely used in data science, machine learning, and statistical analysis, Pandas simplifies working with large datasets and offers seamless integration with other libraries.
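
A minimal sketch of filtering, transformation, and aggregation; the data values are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
    "temp_c": [4.0, 6.5, 7.2, 8.1],
})

warm = df[df["temp_c"] > 5]                 # filtering
df["temp_f"] = df["temp_c"] * 9 / 5 + 32    # transformation
print(df.groupby("city")["temp_c"].mean())  # aggregation
```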



Perplexity

A measure of how well a language model predicts a sample of text, with lower scores indicating better performance.

  • Range: 1 to ∞
  • Interpretation: Measures the uncertainty of the model’s predictions; lower perplexity indicates better performance. A perplexity of 1 means perfect predictions, while higher values indicate more uncertainty and worse performance.
  • Example Values: 20 (good), 50 (average), 200 (poor)
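
A minimal sketch of the underlying computation, assuming hypothetical per-token probabilities from a language model; perplexity is the exponential of the average negative log-probability:

```python
import math

# Hypothetical probabilities a model assigned to each token in a sequence.
token_probs = [0.25, 0.10, 0.50, 0.05]

avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
print(math.exp(avg_nll))  # ~6.32; predicting every token with probability 1 gives 1.0
```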

Plotly

Plotly is a versatile data visualization platform that allows users to create interactive, web-based charts, graphs, and dashboards. It supports various programming languages, including Python, R, and JavaScript, making it popular among developers, data analysts, and scientists. Plotly’s library provides tools for producing high-quality visualizations ranging from basic plots to complex 3D and geographic maps, with seamless integration for sharing and embedding visual content.
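
A minimal sketch using Plotly Express, the library’s high-level Python interface; the data values are illustrative:

```python
import plotly.express as px

fig = px.scatter(x=[1, 2, 3, 4], y=[10, 11, 12, 13], title="Example scatter plot")
fig.show()  # renders the interactive chart in a browser or notebook
```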


Pooling layers

Layers in CNNs designed to reduce the dimensionality of feature maps, lowering computational complexity.


Precision

A metric measuring the ratio of true positives to all predicted positives in a classification model.


Pre-trained model

A model that has been trained on a dataset and may be further fine-tuned for specific tasks.


Prompt engineering

The art of crafting effective inputs to elicit desired output from an LLM.


Prompt template

A reusable, pre-formatted structure for prompts, with placeholders that are filled in to produce the final input to an LLM.


Q

Qlik

Qlik is a powerful data analytics and business intelligence platform that enables users to visualize data, generate insights, and make data-driven decisions. It offers tools for creating dynamic dashboards, performing real-time analytics, and integrating data from multiple sources. Qlik’s associative engine and AI capabilities allow users to explore data interactively and uncover hidden relationships.



R

R.A.G. (Retrieval-Augmented Generation)

A technique combining external information retrieval with the generative capabilities of an LLM for improved accuracy.


ROUGE

ROUGE (Recall-Oriented Understudy for Gisting Evaluation)

  • Score parameters:
    • ROUGE-1: Range: 0 to 1. Measures the overlap of unigrams (single words) between the generated and reference text. 0.2 (low overlap), 0.5 (moderate), 0.8 (high)
    • ROUGE-2: Range: 0 to 1. Measures the overlap of bigrams (two consecutive words) between the generated and reference text. 0.1 (low), 0.4 (moderate), 0.7 (high)
    • ROUGE-L: Range: 0 to 1. Measures the longest common subsequence (LCS) between the generated and reference text. 0.3 (low), 0.6 (moderate), 0.9 (high)
    • ROUGE-Lsum: A variant of ROUGE-L designed for summarization tasks; it applies the LCS computation at the summary level to score how well the generated summary matches the reference summary.
  • Purpose: Evaluates lexical similarity, fluency, and coherence across different levels (unigrams, bigrams, and sequences); a usage sketch follows below.
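
A minimal usage sketch, assuming Google’s rouge-score package (pip install rouge-score); the sentences are illustrative:

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(
    "the cat sat on the mat",        # reference
    "a cat was sitting on the mat",  # candidate
)
for name, s in scores.items():
    print(name, f"P={s.precision:.2f} R={s.recall:.2f} F1={s.fmeasure:.2f}")
```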



ROUGE-L

ROUGE-L = ROUGE (Recall-Oriented Understudy for Gisting Evaluation) + L (Longest Common Subsequence, LCS)
ROUGE-L captures the longest sequence of words that appear in both texts in the same order, providing insights into fluency and coherence.



S

Seaborn

Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for creating attractive and informative statistical graphics. Seaborn simplifies the process of exploring and visualizing data through built-in themes, color palettes, and functions for visualizing distributions, relationships, and categorical data.
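
A minimal sketch using one of Seaborn’s built-in demo datasets:

```python
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")  # downloads a small demo dataset
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time")
plt.show()
```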



Special Token

Reserved tokens used in LLMs for specific functions, such as defining the start or end of a response.


Softmax function

The Softmax function is a mathematical function often used in machine learning and artificial intelligence, particularly in classification tasks. It transforms a vector of raw scores (logits) into probabilities by exponentiating each score and normalizing them. The resulting values lie between 0 and 1 and sum to 1, making them interpretable as probabilities for each class in multi-class classification problems.
In the context of AI, the Softmax function is typically applied in the final layer of neural networks, especially in models like classifiers, to predict the most likely class. It ensures that the network outputs a probability distribution, which is useful for decision-making in tasks like image classification, language processing, and more.
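
A minimal, numerically stable sketch with NumPy; the logits are illustrative:

```python
import numpy as np

def softmax(logits):
    # Subtracting the max avoids overflow without changing the result.
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099]; sums to 1
```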

Key References:
Read post about “Softmax function” on notes.theomorales.com


Supervised learning

Learning through pre-labeled inputs, where the model aims to reduce classification error by predicting the correct outputs.
It is the opposite of unsupervised learning.


System prompts

Instructions defining an LLM’s behavior, role, or context in a conversation.


T

Tableau

Tableau is a leading data visualization and business intelligence tool that helps users analyze, visualize, and share insights from their data. It provides intuitive drag-and-drop features for creating interactive dashboards, reports, and charts. Tableau supports integration with various data sources and is widely used for turning complex data into actionable insights, enhancing decision-making across industries.


Token

The smallest unit of text processed by an LLM, such as a word or subword.


Tokenization

Breaking text into tokens for model processing.
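
A minimal sketch, assuming the Hugging Face Transformers library; the model name is just one common choice, and the exact subwords depend on the tokenizer:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokens = tokenizer.tokenize("Tokenization splits text into subwords.")
print(tokens)  # e.g., ['token', '##ization', 'splits', 'text', 'into', 'sub', '##words', '.']
```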


Training

Feeding a model with data to allow it to learn patterns and relationships.


U

Unsupervised learning

Learning from data without labeled outputs, often for tasks like clustering or dimensionality reduction.
It is the opposite of supervised learning.


V

Variational Autoencoders (VAEs)

Variational Autoencoders (VAEs) are a probabilistic extension of autoencoders that learn not only to compress data but also to generate new samples by modeling data distributions. VAEs use a latent space with a probabilistic structure, enabling meaningful interpolation between points in the latent space.
Features:

  1. Latent Space Regularization: Ensures the latent space follows a predefined probability distribution, commonly Gaussian.
  2. Reconstruction and Generation: Balances reconstruction accuracy with the regularization term using a loss function derived from the evidence lower bound (ELBO).
  3. Bayesian Interpretation: The encoding process approximates posterior distributions via variational inference.

Applications:

  • Data Generation: Generates novel and coherent samples (e.g., synthetic images, text).
  • Anomaly Detection: Identifies data points that deviate from the learned distribution.
  • Latent Space Manipulation: Enables interpolation and arithmetic operations in the latent space.
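
A minimal sketch of the VAE loss described above (reconstruction term plus KL regularization from the ELBO), assuming PyTorch and a Gaussian latent space parameterized by mu and logvar:

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, logvar):
    # Reconstruction term: how well the decoder reproduces the input.
    recon = F.mse_loss(x_recon, x, reduction="sum")
    # KL divergence between N(mu, sigma^2) and the standard normal prior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```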



Vega-Altair

Vega-Altair is a declarative statistical visualization library for Python, built on the Vega and Vega-Lite visualization grammars. It enables users to create elegant and interactive visualizations with concise, human-readable code. Altair leverages Pandas data structures for seamless integration and supports a wide range of chart types, including scatter plots, bar charts, heatmaps, and more. Its declarative approach simplifies the creation of complex visualizations by focusing on data relationships rather than intricate plotting details.
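
A minimal sketch of the declarative style; the data values are illustrative:

```python
import altair as alt
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [3, 1, 4, 2]})

# Declare what to plot (marks and encodings), not how to draw it.
chart = alt.Chart(df).mark_point().encode(x="x", y="y")
chart.save("chart.html")  # or display directly in a notebook
```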



Z

Zero-padding (CNN)

A technique in CNNs that pads the borders of input data with zeros, controlling output dimensionality.


Zero-shot

A model’s ability to perform tasks it was not explicitly trained for by leveraging general knowledge.


Other AI glossaries

  1. github.com/nomic-ai/gpt4all | Generative AI Terminology
  2. nhsx.github.io | AI Dictionary