The rules of probability

At the simplest level, a model, be it machine learning or a more classical method such as linear regression, is a mathematical description of how a target variable changes in response to variation in a predictive variable; that relationship could be a linear slope or any of a number of more complex mathematical transformations. In the task of modeling, we usually think of separating the variables in our dataset into two broad classes:

  • Independent data, by which we primarily mean inputs to a model, is often denoted by X. For example, if we are trying to predict the grades of school students on an end-of-year exam based on their characteristics, we could think of several kinds of features:
    • Categorical: If there are six schools in a district, the school that a student attends could be represented by a six-element one-hot vector for each student. The elements are all 0, except for one that is 1, indicating which of the six schools they are enrolled in (see the encoding sketch after this list).
    • Continuous: The student heights or average prior test scores can be represented as continuous real numbers.
    • Ordinal: The rank of the student in their class is not meant to be an absolute quantity (like their height) but rather a measure of relative difference.
  • Dependent variables, conversely, are the outputs of our models and are denoted by the letter Y. Note that, in some cases, Y is a “label” that can be used to condition a generative output, such as in a conditional GAN. It can be categorical, continuous, or ordinal, and could be an individual element or multidimensional matrix (tensor) for each element of the dataset.
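
As a hypothetical illustration of these feature types, the sketch below encodes a single student record as a model input X. The school names, field names, and values are all invented for illustration, and the tensor layout is just one reasonable choice.

```python
import torch

# Hypothetical student record; school names, fields, and values are invented.
schools = ["North", "South", "East", "West", "Central", "Hillside"]
student = {"school": "East", "height_cm": 162.0, "prior_score": 78.5, "class_rank": 12}

# Categorical: one-hot encode the school as a six-element vector (all 0s, a single 1).
school_onehot = torch.zeros(len(schools))
school_onehot[schools.index(student["school"])] = 1.0

# Continuous: real-valued measurements can be used directly (often standardized in practice).
continuous = torch.tensor([student["height_cm"], student["prior_score"]])

# Ordinal: class rank carries relative order rather than absolute magnitude.
ordinal = torch.tensor([float(student["class_rank"])])

# Concatenate everything into a single input vector X for this student.
x = torch.cat([school_onehot, continuous, ordinal])
print(x.shape)  # torch.Size([9])
```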

How can we describe the data in our model using statistics? In other words, how can we quantitatively describe what values we are likely to see, how frequently, and which values are more likely to appear together than others? One way is by asking how likely we are to observe a particular value in the data, or the probability of that value. For example, if we ask what the probability is of observing a roll of four on a six-sided die, the answer is that, on average, we would observe a four once every six rolls. We write this as follows:

P(X = 4) = 1/6 ≈ 16.67%

Here, P denotes “probability of.” What defines the allowed probability values for a particular dataset? If we imagine the set of all possible values of a dataset (such as all faces of a die), then a probability maps each value to a number between 0 and 1. The minimum is 0 because we cannot have a negative chance of seeing a result; the most unlikely outcome is one we would never see, or 0% probability, such as rolling a seven on a six-sided die. Similarly, we cannot have a greater than 100% probability of observing a result, represented by the value 1; an outcome with probability 1 is absolutely certain. The set of values associated with a dataset may be discrete (such as the faces of a die) or drawn from an infinite set of potential values (such as variations in height or weight). In either case, however, the probabilities assigned to these values have to follow certain rules, the probability axioms described by the mathematician Andrey Kolmogorov in 1933:

  1. The probability of an observation (a die roll, a particular height) is a non-negative, finite number between 0 and 1.
  2. The probability of at least one of the observations in the space of all possible observations occurring is 1.
  3. The probability that any one of a set of distinct, mutually exclusive events occurs (such as the rolls 1–6 on a die) is the sum of the probabilities of the individual events.
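
As a minimal sketch (the number of simulated rolls and the use of a fair die are arbitrary choices for illustration), these three rules can be checked empirically:

```python
import torch

torch.manual_seed(0)
n_rolls = 100_000
# Simulate rolls of a fair six-sided die as integers 1..6.
rolls = torch.randint(1, 7, (n_rolls,))

# Empirical probability of each face (index 0 corresponds to face 1).
probs = torch.bincount(rolls, minlength=7)[1:].float() / n_rolls

# Rule 1: every probability is a finite number between 0 and 1.
print(((probs >= 0) & (probs <= 1)).all())          # tensor(True)

# Rule 2: the probability that some face comes up is 1.
print(probs.sum())                                   # tensor(1.0000)

# Rule 3: mutually exclusive events add, e.g. P(1 or 4) = P(1) + P(4).
p_1_or_4 = ((rolls == 1) | (rolls == 4)).float().mean()
print(torch.isclose(p_1_or_4, probs[0] + probs[3]))  # tensor(True)

# And the empirical P(X = 4) comes out close to 1/6 ≈ 0.1667.
print(probs[3])
```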

While these rules might seem abstract, we will see in Chapter 3 that they have direct relevance to developing neural network models. For example, an application of rule 1 is the softmax function used to predict target classes, which generates a probability between 0 and 1 for each particular outcome. If our model is asked to classify whether an image contains a cat, dog, or horse, each potential class receives a probability between 0 and 1 as the output of a softmax function applied to a deep neural network that performs nonlinear, multi-layer transformations on the input pixels of the image we are trying to classify. Rule 3 is what allows us to normalize these outcomes so that they sum to 1, under the guarantee that they are mutually exclusive predictions (in other words, a real-world image logically cannot be classified as both a dog and a cat, but rather a dog or a cat, so the probabilities of these two outcomes are additive). Finally, the second rule provides the theoretical guarantee that we can generate data at all using these models.
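
As a small sketch (the logits below are made up for three hypothetical classes), a softmax output obeys rule 1, with each value between 0 and 1, and sums to 1 across the mutually exclusive classes:

```python
import torch
import torch.nn.functional as F

# Made-up raw scores (logits) for the hypothetical classes cat, dog, and horse.
logits = torch.tensor([2.0, 0.5, -1.0])

# Softmax maps the logits to values in [0, 1] (rule 1) that sum to 1
# across the mutually exclusive classes (rule 3).
probs = F.softmax(logits, dim=0)
print(probs)        # approximately tensor([0.7856, 0.1753, 0.0391])
print(probs.sum())  # tensor(1.)
```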

However, in the context of machine learning and modeling, we are not usually interested in just the probability of observing a piece of input data, X; we instead want to know the conditional probability of an outcome Y given the data X. Said another way, we want to know how likely a label for a set of data is, based on that data. We write this as the probability of Y given X, or the probability of Y conditional on X:

P(Y|X)

Another question we could ask about Y and X is how likely they are to occur together—their joint probability—which can be expressed using the preceding conditional probability expression as:

P(X, Y) = P(Y|X)P(X) = P(X|Y)P(Y)

This formula expresses the joint probability of X and Y. If X and Y are completely independent of one another, this is simply their product:

P(X|Y)P(Y) = P(Y|X)P(X) = P(X)P(Y)
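
As a quick numeric check (the joint probabilities below are invented for two binary variables), the following sketch verifies that both factorizations recover the same joint distribution:

```python
import torch

# Invented joint probabilities P(X, Y) for two binary variables,
# laid out as joint[x, y]; the numbers are arbitrary but sum to 1.
joint = torch.tensor([[0.10, 0.30],
                      [0.20, 0.40]])

# Marginals, obtained by summing out the other variable.
p_x = joint.sum(dim=1)  # P(X)
p_y = joint.sum(dim=0)  # P(Y)

# Conditionals, from the definition of conditional probability.
p_y_given_x = joint / p_x.unsqueeze(1)  # P(Y|X) = P(X, Y) / P(X)
p_x_given_y = joint / p_y.unsqueeze(0)  # P(X|Y) = P(X, Y) / P(Y)

# Both factorizations recover the same joint distribution.
print(torch.allclose(p_y_given_x * p_x.unsqueeze(1), joint))  # True
print(torch.allclose(p_x_given_y * p_y.unsqueeze(0), joint))  # True
```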

You will see that these expressions become important in our discussion of complementary priors in Chapter 4, and the ability of restricted Boltzmann machines to simulate independent data samples. They are also important as building blocks of Bayes’ theorem, which we describe next.
