| author | G. Jay Kerns <[email protected]> | 2020-12-14 23:12:26 -0500 |
|---|---|---|
| committer | G. Jay Kerns <[email protected]> | 2020-12-14 23:12:26 -0500 |
| commit | 95a3c042776587d7779c82da7763ed39b7046a79 (patch) | |
| tree | 1cbc7dc6f94f0ec3df01cf132afa72bb3f28b766 | |
| parent | 9b4054c8fa72c64eba362977c04e00acd4509d93 (diff) | |
| -rw-r--r-- | pkg/vignettes/IPSUR.Rnw | 188 |
1 files changed, 122 insertions, 66 deletions
diff --git a/pkg/vignettes/IPSUR.Rnw b/pkg/vignettes/IPSUR.Rnw index 0a40b5a..e9875ba 100644 --- a/pkg/vignettes/IPSUR.Rnw +++ b/pkg/vignettes/IPSUR.Rnw @@ -69,8 +69,9 @@ %\geometry{verbose,tmargin=1in,bmargin=1in,outer=1.25in,inner=0.75in} %\geometry{paperwidth=6in,paperheight=9in, margin=0.55in} %\geometry{paperwidth=6in,paperheight=9in, margin=0.75in} -\usepackage[paperwidth=6in,paperheight=9in,tmargin=0.75in,bmargin=0.8in,outer=0.5in,inner=0.75in]{geometry} +%\usepackage[paperwidth=6in,paperheight=9in,tmargin=0.75in,bmargin=0.8in,outer=0.5in,inner=0.75in]{geometry} %\geometry{paperwidth=6in,paperheight=9in,tmargin=0.75in,bmargin=0.8in,outer=0.5in,inner=0.75in} +\usepackage[paperwidth=7.25in,paperheight=10.25in,tmargin=0.75in,bmargin=0.8in,outer=0.5in,inner=0.75in]{geometry} %\geometry{paperwidth=7in,paperheight=10in,tmargin=0.75in,bmargin=0.75in,outer=0.5in,inner=0.75in} \pagestyle{headings} \setcounter{secnumdepth}{2} @@ -2418,10 +2419,7 @@ some hidden structure to the data. \subsubsection{How to do it with \textsf{R}} -The quickest way to visually identify outliers is with a boxplot, -described above. Another way is with the \texttt{boxplot.stats} function. - - +The quickest way to visually identify outliers is with a boxplot, described above. Another way is with the \texttt{boxplot.stats} function. \begin{example}[Lengths of Major North American Rivers] We will look for potential outliers in the \texttt{rivers} data. @@ -2552,21 +2550,49 @@ output, of course, but it was Farquar and Farquar in who rightly said that ``Getting information from a table is like extracting sunbeams from a cucumber.'' We try visual displays to learn about the data. 
-<<>>= +<<echo=TRUE, eval=FALSE>>= +barplot(A, legend.text = TRUE, args.legend = list(x="topleft")) +barplot(A, legend.text = TRUE, beside = TRUE, args.legend = list(x="topleft")) +@ + + +<<barplots-Titanic, echo=FALSE, fig=TRUE, include=FALSE, height=4,width=6.5>>= par(mfrow=c(1,2)) barplot(A, legend.text = TRUE, args.legend = list(x="topleft")) barplot(A, legend.text = TRUE, beside = TRUE, args.legend = list(x="topleft")) par(mfrow=c(1,1)) @ +\begin{figure} +\begin{center} +\includegraphics{IPSUR-barplots-Titanic} +\end{center} +\vspace{-0.5in} +\caption[Bar plots of the \texttt{Titanic} data.]{{\small Bar plots of the \texttt{Titanic} data.}} +\label{fig:barplots-Titanic} +\end{figure} + +<<echo=TRUE, eval=FALSE>>= +spineplot(A) +mosaicplot(A) +@ + -<<>>= +<<spineplot-Titanic, echo=FALSE, fig=TRUE, include=FALSE, height=4,width=6.5>>= par(mfrow=c(1,2)) spineplot(A) mosaicplot(A) par(mfrow=c(1,1)) @ +\begin{figure} +\begin{center} +\includegraphics{IPSUR-spineplot-Titanic} +\end{center} +\vspace{-0.5in} +\caption[Spine and Mosaic plots of the \texttt{Titanic} data.]{{\small Spine and Mosaic plots of the \texttt{Titanic} data.}} +\label{fig:spineplot-Titanic} +\end{figure} \subsubsection{Quantitative versus Quantitative} @@ -2574,7 +2600,7 @@ par(mfrow=c(1,1)) Two quantitative variables are usually displayed with some sort of scatter plot. \begin{example}[Reaction Velocity of an Enzymatic Reaction] -The \texttt{Puromycin} data records the reaction velocity (\texttt{rate}) versus substrate concentration (\texttt{conc}) in an enzymatic reaction involving untreated cells or cells treated with Puromycin. A scatterplot of the two variables is in Figure BLANK. We see that rate increases as concentration increases, and in a nonlinear fashion. +The \texttt{Puromycin} data records the reaction velocity (\texttt{rate}) versus substrate concentration (\texttt{conc}) in an enzymatic reaction involving untreated cells or cells treated with Puromycin. 
A scatterplot of the two variables is in Figure~ref{}. We see that rate increases as concentration increases, and in a nonlinear fashion. \end{example} \begin{example}[The Joyner–Boore Attenuation Data] @@ -2634,7 +2660,7 @@ par(mfrow=c(1,1)) \subsubsection{Qualitative versus Qualitative} -We will talk about this more in Section \@ref{sec:comparing-data-sets}. +We will talk about this more in Section~\ref{sec:comparing-data-sets}. \subsection{Multivariate Data} \label{sub:multivariate-data} @@ -2642,12 +2668,14 @@ Multivariate Data Display \begin{itemize} \item Multi-Way Tables. You can do this with \texttt{table}, or in \textsf{R} - Commander by following \texttt{Statistics} \(\triangleright\) \texttt{Contingency - Tables} \(\triangleright\) \texttt{Multi-way Tables}. -\item Scatterplot matrix. used for displaying pairwise scatterplots - simultaneously. Again, look for linear association and correlation. +Commander by following \texttt{Statistics} \(\triangleright\) \texttt{Contingency Tables} \(\triangleright\) \texttt{Multi-way Tables}. + +\item Scatterplot matrix. used for displaying pairwise scatterplots simultaneously. Again, look for linear association and correlation. + \item 3D Scatterplot. See Figure~\ref{fig:3D-scatterplot-trees} + \item \texttt{plot(state.region, state.division)} + \item \texttt{barplot(table(state.division,state.region), legend.text=TRUE)} \end{itemize} @@ -2665,6 +2693,7 @@ z mosaicplot(z, main = "Relation between eye color and sex") @ + \section{Comparing Populations} \label{sec:comparing-data-sets} Sometimes we have data from two or more groups (or populations) and we @@ -2944,7 +2973,7 @@ Case in point: in 2015, the Planned Parenthood organization was the target of a highly publicized investigation after a series of videos were released to the media that allegedly showed Planned Parenthood executives discussing potential illicit sale of fetal tissues. 
The President of Planned Parenthood, Cecile Richards, testified before a House -Oversight Committee as part of the investigation. Rep. Jason Chaffetz +Oversight Committee as part of the investigation. At the time, Rep. Jason Chaffetz (R-UT) was the chair of the committee, and near the end of his inquiry, he presented the following graph to Richards and asked her questions about it. (See Figure~\ref{fig:PPb}.) @@ -3174,15 +3203,26 @@ and descriptive statistics. - -<<echo = FALSE>>= +<<histexr, echo=FALSE, fig=TRUE, include=FALSE, height=3.5,width=5>>= m = sample(25:95, size = 1) x = rexp(500, rate = m) hist(x, main = "", breaks = 15, xlab="") @ +\begin{figure} +\begin{center} +\includegraphics{IPSUR-histexr} +\end{center} +\caption{{\small Some data of interest to a researcher.}} +\label{fig:histexr} +\end{figure} + + +<<echo = FALSE>>= +@ + \begin{Exercise}[] -The data graphed above represent measurements on some quantitative continuous +The data graphed in Figure~\ref{fig:histexr} represent measurements on some quantitative continuous variable of interest to a researcher. \begin{enumerate} @@ -9214,18 +9254,24 @@ f_{Y|x}(y|x)=\frac{f_{X,Y}(x,y)}{f_{X}(x)},\quad y\in S_{Y}. \end{equation} We define \(f_{X|y}\) in a similar fashion. -BLANK - - - \begin{example}[] Let the joint PMF of \(X\) and \(Y\) be given by \[ -f_{X,Y}(x,y) = x + y,\ 0\leq x \leq 1, \ 0 \leq y \leq 1. +f_{X,Y}(x,y) = x + y,\ 0 \leq x \leq 1, \ 0 \leq y \leq 1. \] Then the marginal PDF of \(X\) is: +\[ +f_{X}(x) = \int_{0}^{1} \left(x + y\right)\,\mathrm{d}y =\left. xy + y^{2} \right|_{x=0}^{1} = x + 1, +\] +for any \(0 < x < 1\). Let's fix \(x = 0.5\). Then the conditional PDF of $Y$ given that $X = 0.5$ will be +\[ +f_{Y|0.5}(y \vert 0.5) = \frac{f_{X,Y}(0.5, y)}{f_{X}(0.5)} = \frac{0.5 + y}{0.5 + 1} = \frac{2}{3}\left(0.5 + y\right), +\] +for all $0< y < 1$. 
+ \end{example} + \subsection{Bayesian Connection} Conditional distributions play a fundamental role in Bayesian @@ -9886,15 +9932,9 @@ Theorem~\ref{thm:mvnorm-dist-matrix-prod}. \subsubsection{How to do it with \textsf{R}} \label{sub:bivariate-transf-r} -It is possible to do the computations above in \textsf{R} with the -\texttt{Ryacas} package. The package is an interface to the open-source -computer algebra system, ``Yacas''. The user installs Yacas, then -employs \texttt{Ryacas} to submit commands to Yacas, after which the output -is displayed in the \textsf{R} console. +It is possible to do the computations above in \textsf{R} with the \texttt{Ryacas} package. The package is an interface to the open-source computer algebra system, ``Yacas''. The user installs Yacas, then employs \texttt{Ryacas} to submit commands to Yacas, after which the output is displayed in the \textsf{R} console. -There are not yet any examples of Yacas in this book, but there are -online materials to help the interested reader: see -\url{http://code.google.com/p/ryacas/} to get +There are not yet any examples of Yacas in this book, but there are online materials to help the interested reader: see \url{http://r-cas.github.io/ryacas/} to get started. \section{Remarks for the Multivariate Case} \label{sec:remarks-for-the-multivariate} @@ -10478,7 +10518,7 @@ consult Casella and Berger \cite{Casella2002}, or Hogg \textit{et al} \end{proof} -\subsection{The Distribution of Student's \(t\) Statistic} \label{sub:students-t-distribution} +\subsection{The Distribution of Student's \(t\) Statistic} \label{sub:student-t-distribution} \begin{prop}[] @@ -11213,6 +11253,7 @@ curve(x^3*(1-x)^4, 0, 1, add = TRUE) \begin{center} \includegraphics{IPSUR-fishing-part-two} \end{center} +\vspace{-0.35in} \caption[Assorted likelihood functions for fishing, part two.]{{\small Assorted likelihood functions for fishing, part two. 
Three graphs are shown of \(L\) when \(\sum x_{i}\) equals 3, 4, and 5, respectively, from left to right. We pick an \(L\) that matches the observed data and then maximize \(L\) as a function of \(p\). If \(\sum x_{i}=4\), then the maximum appears to occur somewhere around \(p \approx 0.6\).}} \label{fig:fishing-part-two} \end{figure} @@ -11272,6 +11313,7 @@ text(mle, mleobj/4, substitute(hat(theta)==a, list(a=round(mle, 4))), cex = 1.3, \begin{center} \includegraphics{IPSUR-species-mle} \end{center} +\vspace{-0.35in} \caption[Species maximum likelihood.]{{\small Species maximum likelihood. Here we see that \(\hat{\theta} \approx \Sexpr{round(mean(dat),2)}\) is the @@ -11385,9 +11427,9 @@ where \(\hat{\mu}=\overline{X}\) and We of course know from \@ref{pro:mean-sd-xbar} that \(\hat{\mu}\) is unbiased. What about \(\hat{\sigma^{2}}\)? Let us check: \begin{eqnarray*} -\mathbb{E}\,\hat{\sigma^{2}} & = & \mathbb{E}\,\frac{n-1}{n}S^{2}\\ - & = & \mathbb{E}\left(\frac{\sigma^{2}}{n}\frac{(n-1)S^{2}}{\sigma^{2}}\right)\\ - & = & \frac{\sigma^{2}}{n}\mathbb{E}\ \mathsf{chisq}(\mathtt{df}=n-1)\\ +\mathbb{E}\,\hat{\sigma^{2}} & = & \mathbb{E}\,\frac{n-1}{n}S^{2},\\ + & = & \mathbb{E}\left(\frac{\sigma^{2}}{n}\frac{(n-1)S^{2}}{\sigma^{2}}\right),\\ + & = & \frac{\sigma^{2}}{n}\mathbb{E}\ \mathsf{chisq}(\mathtt{df}=n-1),\\ & = & \frac{\sigma^{2}}{n}(n-1), \end{eqnarray*} from which we may conclude two things: @@ -19866,7 +19908,8 @@ also has a \texttt{write.matrix} function. \texttt{stack} -\chapter{Mathematical Machinery} \label{cha:mathematical-machinery} +\chapter{Mathematical Machinery} +\label{cha:mathematical-machinery} This appendix houses many of the standard definitions and theorems that are used at some point during the narrative. It is targeted for @@ -19880,8 +19923,8 @@ Folland \cite{Folland1999}, or Carothers \cite{Carothers2000}), or Measure Theory (Billingsley \cite{Billingsley1995}, Ash \cite{Ash2000}, Resnick \cite{Resnick1999}) for details. 
-\section{Set Algebra} \label{sec:the-algebra-of} - +\section{Set Algebra} +\label{sec:the-algebra-of} We denote sets by capital letters, \(A\), \(B\), \(C\), \textit{etc}. The letter \(S\) is reserved for the sample space, also known as the @@ -19911,7 +19954,6 @@ Complement & $A^{c}$ & in $S$ but not in $A$ & \texttt{setdiff( \label{tab:set-operations} \end{table} - \subsection{Identities and Properties} \begin{enumerate} @@ -19994,12 +20036,20 @@ if \(f'(a)\) exists. It is \textit{differentiable on an open interval} In the table that follows, \(f\) and \(g\) are differentiable functions and \(c\) is a constant. -(ref:tab-differentiation-rules) - -Table: Differentiation rules. - -| \(\frac{\mathrm{d}}{\mathrm{d} x}c=0\) | \(\frac{\mathrm{d}}{\mathrm{d} x}x^{n}=nx^{n-1}\) | \((cf)'=cf'\) | -| \((f\pm g)'=f'\pm g'\) | \((fg)'=f'g+fg'\) | \(\left(\frac{f}{g}\right)'=\frac{f'g-fg'}{g^{2}}\) | +\begin{table}[H] +\begin{centering} +\begin{tabular}{|c|c|c|} +\hline + & & \tabularnewline +$\frac{\mathrm{d}}{\mathrm{d} x}c=0$ & $\frac{\mathrm{d}}{\mathrm{d} x}x^{n}=nx^{n-1}$ & $(cf)'=cf'$\tabularnewline + & & \tabularnewline +$(f\pm g)'=f'\pm g'$ & $(fg)'=f'g+fg'$ & $\left(\frac{f}{g}\right)'=\frac{f'g-fg'}{g^{2}}$\tabularnewline + & & \tabularnewline +\hline +\end{tabular} +\par\end{centering} +\caption{Differentiation rules\textbf{\label{tab:differentiation-rules}}} +\end{table} @@ -20011,13 +20061,20 @@ differentiable and \(F'(x) = f'[ g(x) ] \cdot g'(x)\). \subsubsection{Useful Derivatives} -(ref:tab-useful-derivatives) - -Table: Some derivatives. 
- -| \(\frac{\mathrm{d}}{\mathrm{d} x}\mathrm{e}^{x}=\mathrm{e}^{x}\) | \(\frac{\mathrm{d}}{\mathrm{d} x}\ln x=x^{-1}\) | \(\frac{\mathrm{d}}{\mathrm{d} x}\sin x=\cos x\) | -| \(\frac{\mathrm{d}}{\mathrm{d} x}\cos x=-\sin x\) | \(\frac{\mathrm{d}}{\mathrm{d} x}\tan x=\sec^{2}x\) | \(\frac{\mathrm{d}}{\mathrm{d} x}\tan^{-1}x=(1+x^{2})^{-1}\) | -| | | | +\begin{table}[H] +\begin{centering} +\begin{tabular}{|c|c|c|} +\hline + & & \tabularnewline +$\frac{\mathrm{d}}{\mathrm{d} x}\mathrm{e}^{x}=\mathrm{e}^{x}$ & $\frac{\mathrm{d}}{\mathrm{d} x}\ln x=x^{-1}$ & $\frac{\mathrm{d}}{\mathrm{d} x}\sin x=\cos x$\tabularnewline + & & \tabularnewline +$\frac{\mathrm{d}}{\mathrm{d} x}\cos x=-\sin x$ & $\frac{\mathrm{d}}{\mathrm{d} x}\tan x=\sec^{2}x$ & $\frac{\mathrm{d}}{\mathrm{d} x}\tan^{-1}x=(1+x^{2})^{-1}$$ $\tabularnewline + & & \tabularnewline +\hline +\end{tabular} +\par\end{centering} +\caption{Some derivatives\textbf{\label{tab:useful-derivatives}}} +\end{table} \subsection{Optimization} @@ -20027,7 +20084,6 @@ which \(f'(x^{\ast})=0\) or for which \(f'(x^{\ast})\) does not exist. \end{defn} - \begin{thm}[First Derivative Test] \label{thm:first-derivative-test} If \(f\) is differentiable and if @@ -20101,22 +20157,26 @@ If \(g\) is a differentiable function whose range is the interval \subsubsection{Useful Integrals} -(ref:tab-useful-integrals) - -Table: Some integrals (constants of integration omitted). 
- -| \(\int x^{n}\,\mathrm{d} x=x^{n+1}/(n+1),\ n \neq - 1\) | \(\int\mathrm{e}^{x}\,\mathrm{d} x=\mathrm{e}^{x}\) | \(\int x^{-1}\,\mathrm{d} x=\ln \mathrm{abs}(x) \) | -| \(\int\tan x\:\mathrm{d} x=\ln \mathrm{abs}(\sec x)\) | \(\int a^{x}\,\mathrm{d} x=a^{x}/\ln a\) | \(\int(x^{2}+1)^{-1}\,\mathrm{d} x=\tan^{-1}x\) | - +\begin{table}[H] +\begin{centering} +\begin{tabular}{|c|c|c|} +\hline + & & \tabularnewline +$\int x^{n}\,\mathrm{d} x=x^{n+1}/(n+1),\ n\neq-1$ & $\int\mathrm{e}^{x}\,\mathrm{d} x=\mathrm{e}^{x}$ & $\int x^{-1}\,\mathrm{d} x=\ln|x|$\tabularnewline + & & \tabularnewline +$\int\tan x\:\mathrm{d} x=\ln|\sec x|$ & $\int a^{x}\,\mathrm{d} x=a^{x}/\ln a$ & $\int(x^{2}+1)^{-1}\,\mathrm{d} x=\tan^{-1}x$\tabularnewline + & & \tabularnewline +\hline +\end{tabular} +\par\end{centering} +\caption{Some integrals (constants of integration omitted)\textbf{\label{tab:useful-integrals}}} +\end{table} \subsubsection{Integration by Parts} \begin{equation} \int u\:\mathrm{d} v=uv-\int v\:\mathrm{d} u \end{equation} - - - \begin{thm}[L'H\^ opital's Rule] Suppose \(f\) and \(g\) are differentiable and \(g'(x)\neq0\) near \(a\), except possibly at \(a\). Suppose that the limit @@ -20390,8 +20450,6 @@ compute the determinant; the final result is independent of the \(j\) chosen. \end{defn} - - \begin{fact}[] The determinant of the \(2\times2\) matrix \begin{equation} @@ -20400,8 +20458,6 @@ c & d\end{bmatrix}\quad \mbox{is} \quad |\mathbf{A}|=ad-bc. \end{equation} \end{fact} - - \begin{fact}[] A square matrix \(\mathbf{A}\) is nonsingular if and only if \(\mathrm{det}(\mathbf{A})\neq0\). |
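
The conditional-density example added in the hunk at `@@ -9214,18 +9254,24 @@` can be checked by direct integration. What follows is a reader's sketch rather than part of the commit. It assumes only what the hunk states: the joint density is \(f_{X,Y}(x,y) = x + y\) on the unit square. With the antiderivative evaluated in \(y\) and the factor \(1/2\) kept, the marginal appears to be \(x + 1/2\) rather than the \(x + 1\) in the added lines, and the conditional density at \(x = 0.5\) changes accordingly.

```latex
% Marginal density of X: integrate the joint density over y.
\[
f_{X}(x) = \int_{0}^{1} (x + y)\,\mathrm{d}y
         = \left. xy + \frac{y^{2}}{2} \right|_{y=0}^{1}
         = x + \frac{1}{2}, \quad 0 < x < 1.
\]
% Conditional density of Y given X = 0.5.
\[
f_{Y|0.5}(y \vert 0.5) = \frac{f_{X,Y}(0.5,\, y)}{f_{X}(0.5)}
                       = \frac{0.5 + y}{0.5 + 0.5}
                       = 0.5 + y, \quad 0 < y < 1.
\]
% Sanity check: both densities integrate to one.
\[
\int_{0}^{1} \left(x + \tfrac{1}{2}\right)\mathrm{d}x = 1,
\qquad
\int_{0}^{1} \left(0.5 + y\right)\mathrm{d}y = 1.
\]
```

The last display is the useful test: a candidate density that does not integrate to 1 over its support cannot be correct, and \(\frac{2}{3}(0.5 + y)\) integrates to \(2/3\) on \((0,1)\).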
