Distillation of SVD

A machine learning technique that uses SVD.

SVD distillation is a technique that uses SVD to improve the accuracy and efficiency of a neural network.

What is it?

In its learning-based form, distillation works by training a neural network to predict missing singular values and singular vectors. The network is trained on a dataset that has been pre-processed with SVD and is then used to predict the singular values and vectors for a new dataset.

In the context of SVD, distillation refers to the process of approximating a given matrix with a lower-rank SVD representation.

The goal of SVD distillation is to find a low-rank approximation that captures the most important patterns and structures in the original matrix while reducing computational complexity and storage requirements.
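In standard notation (a sketch of the usual truncated-SVD formulation; the symbols below are not defined elsewhere in this wiki), an m-by-n matrix A with SVD A = UΞ£Vα΅€ is approximated by keeping only the k largest singular values:

```latex
A = U \Sigma V^{\top} \;\approx\; A_k = U_k \Sigma_k V_k^{\top} = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k
```

Here U_k and V_k hold the first k left and right singular vectors, and Ξ£_k is the diagonal matrix of the k largest singular values.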

Working process

The distillation of SVD involves selecting a subset of the largest singular values and their corresponding singular vectors to form a reduced-rank approximation of the original matrix.

By discarding the smaller singular values and their associated vectors, we can effectively compress the information in the matrix while retaining the most significant components.

The process is as follows (see the code sketch after the summary below):

  1. Perform SVD: Start by decomposing the original matrix using SVD, obtaining the singular values and the corresponding left and right singular vectors.

  2. Select significant singular values: Sort the singular values in descending order and select the top-k largest singular values. The value of k determines the desired rank of the approximation.

  3. Truncate singular values and vectors: Keep only the top-k singular values and their corresponding singular vectors from both the left and right singular vector matrices. Discard the rest.

  4. Reconstruct the approximation: Form a lower-rank approximation of the original matrix by multiplying the retained singular values, left singular vectors, and the transpose of the retained right singular vectors.

The resulting approximation matrix captures the dominant patterns and structure of the original matrix using fewer dimensions.

The level of approximation depends on the number of singular values retained, with higher-rank approximations being more faithful to the original matrix but requiring more storage and computational resources.
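A minimal sketch of these four steps using NumPy (the matrix `A` and the rank `k` below are illustrative placeholders, not values from this wiki):

```python
import numpy as np

def svd_distill(A: np.ndarray, k: int) -> np.ndarray:
    """Return a rank-k approximation of A via truncated SVD."""
    # 1. Perform SVD; NumPy returns the singular values already sorted in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # 2.-3. Select and keep only the top-k singular values and their vectors.
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    # 4. Reconstruct the lower-rank approximation.
    return U_k @ np.diag(s_k) @ Vt_k

# Illustrative usage: approximate a random 100 x 80 matrix with rank 10.
A = np.random.randn(100, 80)
A_k = svd_distill(A, k=10)
print(np.linalg.norm(A - A_k))  # Frobenius-norm approximation error
```

By the Eckart–Young theorem, this truncation is in fact the best rank-k approximation of the original matrix in the Frobenius norm.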

Applications of SVD distillation

  • Data compression

  • Dimensionality reduction

  • Recommender systems

It allows for efficient representation and analysis of large-scale datasets, while still retaining important information and patterns.

It is particularly effective for datasets that are high-dimensional and noisy.
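To see why this helps with compression: the rank-k factors need roughly k(m + n + 1) numbers instead of the m Γ— n entries of the full matrix. A quick illustrative calculation (sizes chosen only as an example, not taken from this wiki):

```python
m, n, k = 10_000, 5_000, 50        # illustrative sizes
full    = m * n                    # entries in the original matrix
reduced = k * (m + n + 1)          # entries in U_k, the k singular values, and V_k
print(full, reduced, round(full / reduced))  # 50000000 750050 67 -> roughly 67x smaller
```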

Useful tips

It is important to note that the choice of the rank, k, for the approximation is a trade-off between compression and accuracy.

A higher rank retains more information but increases computational complexity and storage requirements, while a lower rank provides greater compression but sacrifices some level of accuracy.

The appropriate rank selection depends on the specific application and the balance desired between accuracy and efficiency.
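One common heuristic for picking k (a hedged sketch of standard practice, not a recommendation made elsewhere in this wiki) is to keep the smallest rank whose singular values retain a target fraction of the total squared-singular-value "energy", e.g. 95%:

```python
import numpy as np

def choose_rank(A: np.ndarray, energy: float = 0.95) -> int:
    """Smallest k whose top-k singular values retain `energy` of the squared spectrum."""
    s = np.linalg.svd(A, compute_uv=False)       # singular values only, descending order
    retained = np.cumsum(s**2) / np.sum(s**2)    # fraction of energy kept by each rank
    return int(np.searchsorted(retained, energy)) + 1
```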
