172: Transformers and Large Language Models
**Intro topic:** Is WFH actually WFC?

**News/Links:**
- Falsehoods Junior Developers Believe about Becoming Senior: https://vadimkravcenko.com/shorts/falsehoods-junior-developers-believe-about-becoming-senior/
- Pure Pursuit
  - Tutorial with Python code: https://wiki.purduesigbots.com/software/control-algorithms/basic-pure-pursuit
  - Video example: https://www.youtube.com/watch?v=qYR7mmcwT2w
- PID Without a PhD: https://www.wescottdesign.com/articles/pid/pidWithoutAPhd.pdf
- Google releases Gemma: https://blog.google/technology/developers/gemma-open-models/

**Book of the Show**
- Patrick: The Eye of the World by Robert Jordan (Wheel of Time) https://amzn.to/3uEhg6v
- Jason: How to Make a Video Game All By Yourself https://amzn.to/3UZtP7b

**Patreon Plug** https://www.patreon.com/programmingthrowdown?ty=h

**Tool of the Show**
- Patrick: Stadia Controller Wi-Fi to Bluetooth Unlock https://stadia.google.com/controller/index_en_US.html
- Jason: FUSE and SSHFS https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh

**Topic: Transformers and Large Language Models**
- How neural networks store information
- Latent variables
- Transformers
- Encoders & Decoders
- Attention Layers
- History
- RNN
- Vanishing Gradient Problem
- LSTM
- Short term (gradient explodes), Long term (gradient vanishes)
- Differentiable algebra
- Key-Query-Value
- Self Attention
- Self-Supervised Learning & Forward Models
- Human Feedback
- Reinforcement Learning from Human Feedback
- Direct Preference Optimization (Pairwise Ranking)
Programming Throwdown
Patrick Wheeler and Jason Gauci