Build a Large Language Model (From Scratch)

ISBN-13: 978-1633437166, ISBN-10: 1633437167
4.6 on Goodreads (150)
Best Seller in Artificial Intelligence Expert Systems
$47.40 with 21 percent savings
List Price: $59.99

Book details

Learn how to create, train, and tweak large language models (LLMs) by building one from the ground up!

In Build a Large Language Model (from Scratch), bestselling author Sebastian Raschka guides you step by step through creating your own LLM. Each stage is explained with clear text, diagrams, and examples. You’ll go from the initial design and creation, to pretraining on a general corpus, and on to fine-tuning for specific tasks.

Build a Large Language Model (from Scratch) teaches you how to:

• Plan and code all the parts of an LLM
• Prepare a dataset suitable for LLM training
• Fine-tune LLMs for text classification and with your own data
• Use human feedback to ensure your LLM follows instructions
• Load pretrained weights into an LLM

Build a Large Language Model (from Scratch) takes you inside the AI black box to tinker with the internal systems that power generative AI. As you work through each key stage of LLM creation, you’ll develop an in-depth understanding of how LLMs work, their limitations, and their customization methods. Your LLM can be developed on an ordinary laptop, and used as your own personal assistant.

Purchase of the print book includes a free eBook in PDF and ePub formats from Manning Publications.

About the technology

Physicist Richard P. Feynman reportedly said, “I don’t understand anything I can’t build.” Based on this same powerful principle, bestselling author Sebastian Raschka guides you step by step as you build a GPT-style LLM that you can run on your laptop. This is an engaging book that covers each stage of the process, from planning and coding to training and fine-tuning.

About the book

Build a Large Language Model (From Scratch) is a practical and eminently satisfying hands-on journey into the foundations of generative AI. Without relying on any existing LLM libraries, you’ll code a base model, evolve it into a text classifier, and ultimately create a chatbot that can follow your conversational instructions. And you’ll really understand it because you built it yourself!

What's inside

• Plan and code an LLM comparable to GPT-2
• Load pretrained weights
• Construct a complete training pipeline
• Fine-tune your LLM for text classification
• Develop LLMs that follow human instructions

About the reader

Readers need intermediate Python skills and some knowledge of machine learning. The LLM you create will run on any modern laptop and can optionally utilize GPUs.

About the author

Sebastian Raschka is a Staff Research Engineer at Lightning AI, where he works on LLM research and develops open-source software.

The technical editor on this book was David Caswell.

Table of Contents

1 Understanding large language models
2 Working with text data
3 Coding attention mechanisms
4 Implementing a GPT model from scratch to generate text
5 Pretraining on unlabeled data
6 Fine-tuning for classification
7 Fine-tuning to follow instructions
A Introduction to PyTorch
B References and further reading
C Exercise solutions
D Adding bells and whistles to the training loop
E Parameter-efficient fine-tuning with LoRA

Review

The most comprehensive book I've seen on building LLMs. Highly recommended! -- Raul Ciotescu, CTO, Netzinkubator Software

A clear, hands-on guide that empowers readers to build their own models and explore the cutting edge of AI. -- Guillermo Alcántara, Project manager, PepsiCo Global

Must-have resource for quickly getting up to speed on LLMs. Whether you're new to the field or looking to deepen your knowledge, it’s the perfect guide. -- Walter Reade, Staff Developer Relations Engineer, Kaggle/Google

A fantastic resource for diving into LLMs—a must-read for anyone eager to get hands-on! -- Dr. Vahid Mirjalili, Senior Data Scientist, FM Global

About the Author

Sebastian Raschka has been working on machine learning and AI for more than a decade. Sebastian joined Lightning AI in 2022, where he now focuses on AI and LLM research, developing open-source software, and creating educational material. Prior to that, Sebastian worked at the University of Wisconsin-Madison as an assistant professor in the Department of Statistics, focusing on deep learning and machine learning research. He has a strong passion for education and is best known for his bestselling books on machine learning using open-source software.
About the author


Sebastian Raschka, PhD is an LLM Research Engineer with over a decade of experience in artificial intelligence. His work bridges academia and industry, including roles as senior engineering staff at an AI company and a statistics professor.

As an independent researcher and industry expert, Sebastian collaborates with companies on AI solutions and serves on the Open Source Advisory Board at University of Wisconsin–Madison.

Sebastian specializes in LLMs and the development of high-performance AI systems, with a deep focus on practical, code-driven implementations.


Frequently bought together

Build a Large Language Model (From Scratch)
+
AI Engineering: Building Applications with Foundation Models
+
Hands-On Large Language Models: Language Understanding and Generation

From the Publisher

“The most understandable and comprehensive explanation of language models yet! Its unique and practical teaching style achieves a level of understanding you can’t get any other way.” -- Cameron Wolfe, Senior Scientist, Netflix

“Sebastian combines deep knowledge with practical engineering skills and a knack for making complex ideas simple. This is the guide you need!” -- Chip Huyen, author of Designing Machine Learning Systems and AI Engineering

“Definitive, up-to-date coverage. Highly recommended!” -- Dr. Vahid Mirjalili, Senior Data Scientist, FM Global

Why this book?

Build a Large Language Model (From Scratch) offers a practical, hands-on approach to understanding and constructing large language models (LLMs) from the ground up.

By guiding you through each stage—from data preparation and coding attention mechanisms to pretraining and fine-tuning—this book demystifies the inner workings of LLMs using Python and PyTorch.

Ideal for developers and machine learning enthusiasts, it empowers you to build a functional GPT-style model on a standard laptop, fostering a deeper comprehension of generative AI technologies.

About Manning

Manning helps developers and tech professionals stay ahead in a fast-moving industry with expert-led books, videos, and projects. Learning never stops, but it’s hard to keep up, so we focus on content that’s practical, clear, and trusted. As an independent publisher, we adapt quickly, from pioneering early-access books to offering DRM-free eBooks. Our series, like "In Action" and "In a Month of Lunches", reflect a commitment to making complex topics accessible.

LLMs in Production: From language models to successful products
• Customer reviews: 4.5 out of 5 stars (16)
• Price: $50.66
• Level of proficiency: Intermediate
• About the reader: For data scientists and ML engineers.
• Pages: 456

AI Agents in Action
• Customer reviews: 4.4 out of 5 stars (12)
• Price: $39.84
• Level of proficiency: Intermediate
• About the reader: For intermediate Python programmers.
• Pages: 344

Natural Language Processing in Action, Second Edition
• Customer reviews: 4.7 out of 5 stars (4)
• Price: $59.99
• Level of proficiency: Intermediate
• About the reader: For intermediate Python programmers.
• Pages: 688

Effective Conversational AI: Chatbots that work
• Customer reviews: 4.8 out of 5 stars (5)
• Price: $57.13
• Level of proficiency: Intermediate
• About the reader: For developers, engineers, and product managers.
• Pages: 328

Data Analysis with LLMs: Text, tables, images and sound (In Action)
• Customer reviews: 5.0 out of 5 stars (1)
• Price: $39.99
• Level of proficiency: Intermediate
• About the reader: For data scientists and data analysts.
• Pages: 232

Causal AI
• Customer reviews: 3.9 out of 5 stars (5)
• Price: $53.04
• Level of proficiency: Advanced
• About the reader: For data scientists and machine learning engineers.
• Pages: 520

Special features (all six titles): Includes liveBook with our built-in AI assistant.

Product information

Publisher: Manning
Publication date: October 29, 2024
Language: English
Print length: 368 pages
ISBN-10: 1633437167
ISBN-13: 978-1633437166
Item Weight: 1.35 pounds
Dimensions: 7.38 x 0.7 x 9.25 inches
Customer Reviews: 4.6 out of 5 stars (212 reviews)

Customers say

Customers find this book to be a comprehensive guide to mastering large language models, with clear explanations and well-written content. They appreciate the code, with one customer highlighting the detailed step-by-step treatment of the algorithms and another noting the inclusion of example code covering byte pair encoding. The book also receives positive feedback for its coverage of attention mechanisms, and customers consider it worth the investment.

Customers praise the book's comprehensive coverage of large language models, with one customer noting how it breaks down complex concepts into fundamental principles.

"...The language is perfect. So many concepts that I’ve struggled with for a while are laid out so clearly...."

"...You can modify the code easily and learn a lot. Imho this is very good investment for anyone who wants to learn how LLM work"

"...definitely had some good points not covered in classes!"

"I appreciated the book for its thoroughness and attention to detail...."

Customers find the book highly readable, with multiple reviews describing it as the best available on LLMs, and one customer noting its clear writing style.

"...I’ve only made it through the first two chapters but so far absolutely amazing. The language is perfect...."

"This is a very good book. I recommend to do the code exercises along reading...."

"As an Undergraduate in Intelligent Systems Engineering, this book is amazing. definitely had some good points not covered in classes!"

"...This book is extremely well written and clear, builds each component in the Transformer Architecture piece by piece, it makes me feel like I can..."

Customers appreciate the code in the book, with multiple customers noting that the explanations are simple and the code is great. One customer specifically mentions that it includes example code covering byte pair encoding.

"...I recommend to do the code exercises along reading. The author provides all the code, and it's easy to follow in notebooks to really see what is..."

"I appreciated the book for its thoroughness and attention to detail...."

"This book shows step by step all ingredients which are put together in order to build a GPT-2 model from scratch...."

"...The supplemental materials for coding prototypes, extensions, and additional deep dives for enhanced learning are readily available for any..."

Customers find the book worth every penny.

"...You can modify the code easily and learn a lot. Imho this is very good investment for anyone who wants to learn how LLM work"

"From the steps I have taken so far with this book, it is very valuable for anyone looking to start off with LLMs...."

"...(worth every penny). Especially the exercises and appendix (GitHub repo too)."

"...It's been a good time investment!"

Customers appreciate the book's coverage of attention mechanisms.

"...life-cycle by building layer-by-layer transformers, playing around with attention mechanisms, pre-training pipelines, fine-tuning for the most..."

"...The book also includes example code covering byte pair encoding, attention mechanisms, and even direct preference optimization...."

"Fantastic book for a beginner. Attention mechanism and other complex constructs explained in an easy to understand illustrated manner without..."

Top reviews from the United States

  • 5.0 out of 5 stars, Verified Purchase
    Excellent book, with great code, a must read!
    Reviewed in the United States on June 4, 2025
    Format: Kindle
    This is a very good book. I recommend doing the code exercises while reading. The author provides all the code, and it's easy to follow in notebooks to really see what is happening. You can modify the code easily and learn a lot. IMHO this is a very good investment for anyone who wants to learn how LLMs work.
    One person found this helpful
  • 5.0 out of 5 stars, Verified Purchase
    So concise
    Reviewed in the United States on May 26, 2025
    Format: Paperback
    This review may be premature because I’ve only made it through the first two chapters, but so far it is absolutely amazing. The language is perfect. So many concepts that I’ve struggled with for a while are laid out so clearly. I look forward to doing all the exercises and finishing this book. But I would just like to thank the author personally because this is a game changer for my understanding of general ML and AI concepts I struggled with in the past.
    4 people found this helpful
  • 5.0 out of 5 stars, Verified Purchase
    Excellent!
    Reviewed in the United States on January 13, 2025
    Format: Kindle
    I'm still reading the book, and have completed coding everything in Chapter 2. So far, the approach of breaking down the concepts into fundamental parts and then showing how those parts are built into more complex implementations, which can then be better understood because of the author's presentation, is perfect for how I learn.
    For the benefit of others with an NVIDIA GPU who are configuring CUDA:
    1. Find the CUDA support level of your GPU. On Windows: NVIDIA Control Panel -> System Information (at the bottom) -> Components tab. The SUPPORT LEVEL of the installed driver software is listed, not the actual software!
    2. Install MS Visual Studio (2022), which is needed by the NVIDIA CUDA software.
    3. Install the version of the NVIDIA CUDA software supported by both the info from step 1 AND PyTorch. For example, my HW supported CUDA up to 12.7, but PyTorch software support tops out at 12.4 (as of today, 1/13/2025), so I went with the 12.4 NVIDIA CUDA software.
    4. During the driver custom install (not the default simplified install), deselect NVIDIA GeForce Experience; it caused errors for me.
    5. Reboot after the NVIDIA CUDA software installation.
    6. On the NVIDIA CUDA installation page there are deviceQuery and bandwidthTest exe's that will validate that the CUDA HW/SW interface is functioning.
    7. Run the PyTorch installer. I use an Anaconda environment, so I ran the conda install command copied from the PyTorch installation web page (shown in the book) from a command line inside my conda target environment. Restart Anaconda afterward; I use VS Code and restarted both when the install was completed.
    8. The NVIDIA CUDA installation page states to install it into the conda target environment with conda install cuda -c nvidia. When the book says to run torch.cuda.is_available(), it should return True.
    I don't consider this a defect of the book. There is already enough hand-holding by the author; IMHO some work still needs to be done by the reader!!
    So far I'm getting a great appreciation/comprehension of what is behind Large Language Models. Thank you!!
    5 people found this helpful
  • 5.0 out of 5 stars, Verified Purchase
    Very Informative-- definite extra buy!
    Reviewed in the United States on May 27, 2025
    Format: Paperback
    As an Undergraduate in Intelligent Systems Engineering, this book is amazing. Definitely had some good points not covered in classes!
    One person found this helpful
  • 4.0 out of 5 stars, Verified Purchase
    I wish it was coloured printing
    Reviewed in the United States on February 24, 2025
    Format: Paperback
    I appreciated the book for its thoroughness and attention to detail. However, I believe it would benefit from being printed in color, as many images on the O'Reilly website are more vibrant and clearer when viewed in color. Additionally, enhancing the resolution of some images would improve the overall experience. For these reasons, I would rate the book 4 out of 5. With these adjustments, I think it could easily earn a perfect score of 5 out of 5.
  • 5.0 out of 5 stars, Verified Purchase
    One of the best technical books I've ever purchased
    Reviewed in the United States on March 20, 2025
    Format: Paperback
    I've bought tons of ML, DE, programming, cloud architecture books, etc...
    This book is absolutely fantastic! Especially combined with the current YouTube series published by the author (March 2025).

    Sebastian's Packt books are also excellent, but I must say this book stands on its own. This book is extremely well written and clear, and it builds each component in the Transformer Architecture piece by piece. It makes me feel like I can actually build an LLM on my own.

    At a minimum this book will help you understand the Transformer Architecture (Attention Mechanism, Feed Forward, Layer Norm, etc...) rather than importing models from Hugging Face and not really knowing what's going on in the background.

    If you are like me and are not satisfied with just building RAGs/LLM applications without understanding the model architecture, this book is for you!

    I'll keep buying from this author as long as the quality of his content is as good as this.
    10 people found this helpful
  • 5.0 out of 5 stars, Verified Purchase
    Excellent
    Reviewed in the United States on March 1, 2025
    Format: Paperback
    This book shows step by step all the ingredients that are put together in order to build a GPT-2 model from scratch. All functions are explained explicitly in Python before the equivalent PyTorch functions are used. I really liked following the book to the end.

    There is also a discussion forum about the book on GitHub, where readers can ask questions, which are promptly answered by the author.

    That said, there remain many questions about WHY the method works, and why some steps are made. E.g., why use multi-head attention: to my understanding this completely scrambles the embedding vectors, and it is like a miracle that the method works so well. But there were page limits for the book, and going deeper into this kind of question would probably have doubled the size of the book.
    One person found this helpful
  • 5.0 out of 5 stars, Verified Purchase
    Excellent book that teaches LLMs by building one
    Reviewed in the United States on March 12, 2025
    Format: Paperback
    The best way to learn something is to build it for yourself, and that is exactly what this book does for LLMs. You can get explanations of how LLMs work from a lot of sites on the Internet. What this book does uniquely (as far as I know) is combine that information with a guide for you to implement it for yourself. If you finish the book and work through the code examples and exercises, you will have a solid and up-to-date understanding of how LLMs work under the hood.
    2 people found this helpful
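
The CUDA setup steps in one of the reviews above end with `torch.cuda.is_available()` returning True. A minimal sketch of that check follows, assuming PyTorch is importable as `torch`; the helper name `describe_torch_device` is hypothetical and comes from neither the book nor PyTorch. It reports a missing install or a CPU-only setup instead of failing, which may help narrow down which step went wrong.

```python
import importlib.util


def describe_torch_device():
    """Return a short string describing the device PyTorch would use.

    Hypothetical helper for illustration; not part of the book's code.
    """
    # PyTorch may not be installed yet; report that instead of raising ImportError.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if torch.cuda.is_available():
        # A CUDA build of PyTorch found a usable driver: report the first GPU
        # and the CUDA version this PyTorch build was compiled against.
        return f"cuda ({torch.cuda.get_device_name(0)}, CUDA {torch.version.cuda})"

    # Either a CPU-only PyTorch build or a driver/toolkit version mismatch.
    return "cpu"


print(describe_torch_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, a version mismatch between the installed CUDA software and the PyTorch build is one possible cause worth checking.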

Top reviews from other countries

  • Christian W.
    5.0 out of 5 stars, Verified Purchase
    Well structured and easy to understand
    Reviewed in Germany on April 27, 2025
    LLMs are a new topic for me. I find the book really good and cleanly structured, and by the end you understand the topic. There are additional videos by the author in which he goes through the chapters again, and good supplementary material on GitHub (including Jupyter notebooks on the topic).
  • dioj3828
    5.0 out of 5 stars, Verified Purchase
    One of the best books about LLM. Must have!
    Reviewed in Canada on March 21, 2025
    This is an excellent book with clear explanations of difficult topics. I want to outline what differentiates this book from similar ones:

    1. No general stuff that is available on every AI-related website.
    2. Very clear mapping of math to the rationale behind this math.
    3. A lot of diagrams showing the mechanics of different operations.
    4. A LOT of references to useful academic papers.
    5. Great YouTube companion videos available.
  • Amazon Customer
    5.0 out of 5 stars, Verified Purchase
    Excellent book
    Reviewed in France on January 17, 2025
    Excellent book
  • Adam N.
    5.0 out of 5 stars, Verified Purchase
    Amazing book!
    Reviewed in the United Kingdom on May 5, 2025
    Truly great and exceptionally well thought out resource for learning how LLMs work. Don’t even think twice and read it, study the great examples, look up bonus materials and check out accompanying YT videos from S. Raschka. I wish all tech books were at this quality level!
  • luc beeusaert
    5.0 out of 5 stars, Verified Purchase
    Good
    Reviewed in Belgium on January 2, 2025
    Very good for learning to build your first LLM yourself.
