rss-bridge 2024-11-04T16:00:00+00:00

177: Vector Databases

Programming Throwdown

Patrick Wheeler and Jason Gauci

**Intro topic:** Buying a Car

**News/Links:**

  • Cognitive Load is what Matters: https://github.com/zakirullin/cognitive-load
  • Diffusion models are Real-Time Game Engines: https://gamengen.github.io/
  • Your Company Needs Junior Devs: https://softwaredoug.com/blog/2024/09/07/your-team-needs-juniors
  • Seamless Streaming / Fish Speech / LLaMA Omni
      • Seamless: https://huggingface.co/facebook/seamless-streaming
      • Fish: https://github.com/fishaudio/fish-speech
      • LLaMA Omni: https://github.com/ictnlp/LLaMA-Omni

Book of the Show

  • Patrick: Thought Emporium YouTube (https://youtu.be/8X1_HEJk2Hw?si=T8EaHul-QMahyUvQ)
  • Jason: Novel Minds (https://www.novelminds.ai/)

Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h

Tool of the Show

  • Patrick: Escape Simulator (https://pinestudio.com/games/escape-simulator/)
  • Jason: Cursor IDE (https://www.cursor.com/)

**Topic:** Vector Databases (~54 min)

  • How computers represent data traditionally
      • ASCII values
      • RGB values
  • How traditional compression works
      • Huffman encoding (tree structure)
      • Lossy example: Fourier Transform & store coefficients
  • How embeddings are computed
      • Pairwise (contrastive) methods
      • Forward models (self-supervised)
  • Similarity metrics
  • Approximate Nearest Neighbors (ANN)
  • Sub-Linear ANN
      • Clustering
      • Space Partitioning (e.g. K-D Trees)
  • What a vector database does
      • Perform nearest-neighbors with many different similarity metrics
      • Store the vectors and the data structures to support sub-linear ANN
      • Handle updates, deletes, rebalancing/reclustering, backups/restores
  • Examples
      • pgvector: a vector-database plugin for PostgreSQL
      • Weaviate, Pinecone
      • Milvus
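
For the "Huffman encoding (tree structure)" item in the outline, here is a minimal sketch of how the code tree can be built. It is not taken from the episode; the function names (`huffman_codes`, `encode`, `decode`) and the sample string are made up for illustration.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a {symbol: bitstring} prefix code built from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, tree). A tree is either a symbol
    # (leaf) or a (left, right) tuple (internal node). The unique tie_breaker
    # keeps the heap from ever comparing two trees directly.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest trees into one.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (left, right)))
        next_id += 1
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    # Works because no code is a prefix of another.
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


sample = "abracadabra"  # made-up input
codes = huffman_codes(sample)
bits = encode(sample, codes)
```

Frequent symbols end up near the root and get short codes, which is where the lossless saving comes from.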
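
For the "Similarity metrics" item, a small sketch of three common metrics plus an exact (brute-force) nearest-neighbor search, which is the linear-time baseline that ANN methods try to beat. The toy 3-dimensional "embeddings" and all function names are assumptions for illustration only.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Assumes neither vector is all zeros.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, vectors, k=1):
    """Exact top-k by cosine similarity; scans every stored vector."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up toy embeddings, not output from any real model.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
top2 = nearest_neighbors([1.0, 0.0, 0.0], vectors, k=2)
```

Every query here touches all stored vectors, so the cost grows linearly with the collection size.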
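
For the "Sub-Linear ANN" / "Clustering" items, one way the idea can be sketched: bucket vectors by their closest centroid, then search only the few buckets nearest the query. This is a simplified illustration, not how any particular vector database implements it. The centroids are assumed to be given already (for example from k-means), and the names `build_index`, `search`, and `nprobe` are illustrative.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(vectors, centroids):
    """Group vector ids into one bucket per centroid (closest centroid wins)."""
    buckets = [[] for _ in centroids]
    for vid, vec in vectors.items():
        closest = min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        buckets[closest].append(vid)
    return buckets


def search(query, vectors, centroids, buckets, k=1, nprobe=1):
    """Approximate top-k: only scan the nprobe buckets closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in buckets[c]]
    candidates.sort(key=lambda vid: sq_dist(query, vectors[vid]))
    return candidates[:k]


# Made-up 2-D data with two obvious clusters.
vectors = {
    "a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0),
    "d": (10.0, 10.0), "e": (11.0, 10.0), "f": (10.0, 11.0),
}
centroids = [(0.3, 0.3), (10.3, 10.3)]  # assumed precomputed
buckets = build_index(vectors, centroids)
approx = search((9.0, 9.0), vectors, centroids, buckets, k=1, nprobe=1)
exact = search((9.0, 9.0), vectors, centroids, buckets, k=1, nprobe=len(centroids))
```

With `nprobe=1` only half of the vectors are scanned. Raising `nprobe` trades speed for recall, and probing every bucket degenerates to an exact search.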
