1. Derivatives (Differentiation)
The derivative measures the rate of change of a function. It's the foundation of backpropagation in neural networks.
Basic Derivative Rules
| Function | Derivative |
|---|---|
| c (constant) | 0 |
| x | 1 |
| x^n | n·x^(n-1) |
| e^x | e^x |
| a^x | a^x · ln(a) |
| ln(x) | 1/x |
| log_a(x) | 1/(x·ln(a)) |
| sin(x) | cos(x) |
| cos(x) | -sin(x) |
| tan(x) | sec²(x) |
| cot(x) | -csc²(x) |
| sec(x) | sec(x)·tan(x) |
| csc(x) | -csc(x)·cot(x) |
Differentiation Rules
Sum Rule: d/dx[f + g] = f' + g'
Difference Rule: d/dx[f - g] = f' - g'
Product Rule: d/dx[f·g] = f'·g + f·g'
Quotient Rule: d/dx[f/g] = (f'·g - f·g')/g²
Chain Rule: d/dx[f(g(x))] = f'(g(x))·g'(x)
Power Rule: d/dx[x^n] = n·x^(n-1)
Examples
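As a quick sanity check, here is a minimal sketch that reproduces a few of the rules above symbolically with sympy (assuming sympy is installed; not part of the original notes):

```python
# Verify a few derivative rules symbolically with sympy.
import sympy as sp

x = sp.symbols('x')

print(sp.diff(x**3, x))              # power rule: 3*x**2
print(sp.diff(sp.exp(x), x))         # e^x -> e^x
print(sp.diff(sp.log(x), x))         # ln(x) -> 1/x
print(sp.diff(sp.sin(x) * x**2, x))  # product rule: x**2*cos(x) + 2*x*sin(x)
print(sp.diff(sp.sin(x**2), x))      # chain rule: 2*x*cos(x**2)
```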
2. Integrals (Integration)
Integration is the reverse of differentiation. It finds the area under a curve and is used in probability distributions and optimization.
Basic Integral Rules
| Function | Integral |
|---|---|
| k (constant) | kx + C |
| x^n (n ≠ -1) | x^(n+1)/(n+1) + C |
| 1/x | ln\|x\| + C |
| e^x | e^x + C |
| a^x | a^x/ln(a) + C |
| sin(x) | -cos(x) + C |
| cos(x) | sin(x) + C |
| sec²(x) | tan(x) + C |
| csc²(x) | -cot(x) + C |
| sec(x)·tan(x) | sec(x) + C |
| 1/√(1-x²) | arcsin(x) + C |
| 1/(1+x²) | arctan(x) + C |
Integration Techniques
Substitution: ∫ f(g(x))·g'(x)dx = ∫ f(u)du, where u = g(x)
Integration by Parts: ∫ u·dv = uv - ∫ v·du
Examples
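A short sympy sketch (assuming sympy is installed) verifying a few table entries, including one substitution and one by-parts integral; note that sympy omits the constant of integration:

```python
# Verify a few antiderivatives symbolically with sympy (the "+ C" is omitted by sympy).
import sympy as sp

x = sp.symbols('x')

print(sp.integrate(x**3, x))              # x**4/4
print(sp.integrate(1/x, x))               # log(x)
print(sp.integrate(sp.exp(x), x))         # exp(x)
print(sp.integrate(1/(1 + x**2), x))      # atan(x)
print(sp.integrate(2*x*sp.cos(x**2), x))  # sin(x**2)       (substitution, u = x**2)
print(sp.integrate(x*sp.exp(x), x))       # (x - 1)*exp(x)  (integration by parts)
```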
3. Limits
Limits describe the behavior of a function as its input approaches a particular point. Essential for understanding derivatives and continuity.
Basic Limits
lim(x→0) sin(x)/x = 1
lim(x→0) (1 - cos(x))/x = 0
lim(x→0) (e^x - 1)/x = 1
lim(x→∞) (1 + 1/x)^x = e
L'Hôpital's Rule
For indeterminate forms (0/0 or ∞/∞):
lim(x→a) f(x)/g(x) = lim(x→a) f'(x)/g'(x), provided the limit on the right exists
Example:
lim(x→0) sin(x)/x = lim(x→0) cos(x)/1 = 1
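The limits above can be checked directly with sympy (assuming it is installed):

```python
# Evaluate the basic limits symbolically with sympy.
import sympy as sp

x = sp.symbols('x')

print(sp.limit(sp.sin(x) / x, x, 0))        # 1
print(sp.limit((1 - sp.cos(x)) / x, x, 0))  # 0
print(sp.limit((1 + 1/x)**x, x, sp.oo))     # E
```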
4. Chain Rule
⚡ CRITICAL FOR DEEP LEARNING
The chain rule is the mathematical foundation of backpropagation in neural networks. It allows us to compute gradients through composed functions.
Single Variable
If y = f(u) and u = g(x): dy/dx = (dy/du)·(du/dx)
Multivariable (Partial Derivatives)
If z = f(x, y) with x = x(t) and y = y(t): dz/dt = (∂z/∂x)·(dx/dt) + (∂z/∂y)·(dy/dt)
Neural Networks Example
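A minimal, illustrative numpy sketch showing the chain rule as backpropagation through a single sigmoid neuron (the values x, w, b and the squared-error loss are made up for the example):

```python
# Forward pass y = sigmoid(w*x + b), then dL/dw and dL/db via the chain rule.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, w, b, y_true = 2.0, 0.5, 0.1, 1.0

# Forward pass
z = w * x + b              # linear step
y = sigmoid(z)             # activation
L = (y - y_true) ** 2      # squared-error loss

# Backward pass (chain rule)
dL_dy = 2 * (y - y_true)   # dL/dy
dy_dz = y * (1 - y)        # sigmoid'(z) = sigmoid(z)*(1 - sigmoid(z))
dz_dw = x                  # d(w*x + b)/dw
dz_db = 1.0                # d(w*x + b)/db

dL_dw = dL_dy * dy_dz * dz_dw   # dL/dw = dL/dy * dy/dz * dz/dw
dL_db = dL_dy * dy_dz * dz_db

# Numerical check with a finite difference
eps = 1e-6
L_plus = (sigmoid((w + eps) * x + b) - y_true) ** 2
print(dL_dw, (L_plus - L) / eps)  # the two values should closely agree
```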
5. Partial Derivatives
For functions with multiple variables, partial derivatives measure the rate of change with respect to one variable while keeping others constant.
Example
f(x, y) = x²y + y³
∂f/∂x = 2xy (treat y as a constant)
∂f/∂y = x² + 3y² (treat x as a constant)
6. Gradient
⚡ CORE OF MACHINE LEARNING
The gradient is a vector containing all partial derivatives. It points in the direction of steepest ascent and is used in gradient descent optimization.
Definition
∇f = [∂f/∂x_1, ∂f/∂x_2, ..., ∂f/∂x_n]
Example
f(x, y) = x² + y² → ∇f = [2x, 2y]
Gradient Descent (How ML Models Learn)
θ_new = θ_old - α·∇L(θ_old)
where α = learning rate, L = loss function
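A minimal sketch of this update rule on a made-up one-parameter loss L(θ) = (θ - 3)², using plain Python:

```python
# Gradient descent on a toy loss L(theta) = (theta - 3)**2, whose gradient is 2*(theta - 3).
theta = 0.0   # initial parameter
alpha = 0.1   # learning rate

for step in range(50):
    grad = 2 * (theta - 3)        # dL/dtheta
    theta = theta - alpha * grad  # update rule: theta <- theta - alpha * grad

print(theta)  # approaches 3, the minimizer of L
```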
7. Common Derivatives for ML/DL
Activation Functions
| Function | Derivative |
|---|---|
| Sigmoid: σ(x) = 1/(1+e^-x) | σ(x)·(1-σ(x)) |
| Tanh: tanh(x) | 1 - tanh²(x) |
| ReLU: max(0, x) | 0 if x<0, 1 if x>0 (undefined at x=0; 0 is used in practice) |
| Leaky ReLU: max(αx, x), small α | α if x<0, 1 if x>0 |
Loss Functions
| Loss | Derivative (w.r.t. ŷ) |
|---|---|
| MSE: (ŷ-y)² | 2(ŷ-y) |
| Cross-Entropy: -y·log(ŷ) | -y/ŷ |
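A small numpy sketch (numpy assumed installed) that spot-checks the sigmoid and MSE derivatives above against finite differences:

```python
# Compare analytic derivatives from the tables with numerical estimates.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = 0.7
eps = 1e-6

# Sigmoid derivative: sigma'(x) = sigma(x) * (1 - sigma(x))
analytic = sigmoid(x) * (1 - sigmoid(x))
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
print(analytic, numeric)  # should agree to ~6 decimal places

# MSE derivative w.r.t. the prediction: d/d(y_hat) (y_hat - y)**2 = 2*(y_hat - y)
y_hat, y = 0.8, 1.0
print(2 * (y_hat - y))    # -0.4
```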
8. Taylor Series
Taylor series approximates any smooth function as an infinite sum of polynomials. Used in numerical methods and approximations.
Common Series (around x=0)
e^x = 1 + x + x^2/2! + x^3/3! + ...
sin(x) = x - x^3/3! + x^5/5! - ...
cos(x) = 1 - x^2/2! + x^4/4! - ...
ln(1+x) = x - x^2/2 + x^3/3 - ... (for -1 < x ≤ 1)
1/(1-x) = 1 + x + x^2 + x^3 + ... (for |x| < 1)
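A short Python sketch (standard library only) showing how truncating the e^x series gives a better approximation as more terms are added:

```python
# Approximate e^x by truncating its Taylor series around 0.
import math

def exp_taylor(x, n_terms=10):
    # e^x ≈ sum_{k=0}^{n_terms-1} x^k / k!
    return sum(x**k / math.factorial(k) for k in range(n_terms))

x = 1.5
print(exp_taylor(x, 5), exp_taylor(x, 10), math.exp(x))
# more terms -> closer to math.exp(x)
```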
9. Integration Tricks
U-Substitution
Set u = g(x), du = g'(x)dx, so ∫ f(g(x))·g'(x)dx = ∫ f(u)du
Example: ∫ 2x·cos(x²)dx with u = x² gives ∫ cos(u)du = sin(x²) + C
Integration by Parts (ILATE)
Choose u by the ILATE order: Inverse trig, Logarithmic, Algebraic, Trigonometric, Exponential; then apply ∫ u·dv = uv - ∫ v·du
10. Definite Integrals
Fundamental Theorem of Calculus
∫_a^b f(x)dx = F(b) - F(a), where F'(x) = f(x)
Properties
∫_a^b f(x)dx = -∫_b^a f(x)dx
∫_a^b f(x)dx = ∫_a^c f(x)dx + ∫_c^b f(x)dx
∫_a^b [f(x) + g(x)]dx = ∫_a^b f(x)dx + ∫_a^b g(x)dx
∫_a^b k·f(x)dx = k·∫_a^b f(x)dx
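A quick check of the theorem on f(x) = x² over [0, 3], where F(x) = x³/3 gives F(3) - F(0) = 9 (sympy and scipy assumed installed):

```python
# Verify a definite integral symbolically and numerically.
import sympy as sp
from scipy.integrate import quad

x = sp.symbols('x')
print(sp.integrate(x**2, (x, 0, 3)))  # 9

value, err = quad(lambda t: t**2, 0, 3)
print(value)                          # ~9.0
```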
11. Multivariable Calculus
Gradient
∇f = [∂f/∂x, ∂f/∂y, ∂f/∂z]
Direction of steepest ascent
Directional Derivative
D_u f = ∇f · u
Rate of change of f in the direction of the unit vector u
Hessian Matrix
H = [ ∂²f/∂x²    ∂²f/∂x∂y ]
    [ ∂²f/∂y∂x   ∂²f/∂y²  ]
Second derivatives
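Both objects can be computed symbolically; here is a short sympy sketch (sympy assumed installed) for the example function f(x, y) = x²y + y³ used earlier:

```python
# Gradient and Hessian of f(x, y) = x**2*y + y**3 with sympy.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y + y**3

grad = [sp.diff(f, v) for v in (x, y)]
print(grad)               # [2*x*y, x**2 + 3*y**2]

H = sp.hessian(f, (x, y))
print(H)                  # Matrix([[2*y, 2*x], [2*x, 6*y]])
```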
12. Quick Reference for Neural Networks
Backpropagation Chain Rule
For a layer z = w·x + b with activation a = f(z) and loss L:
∂L/∂w = (∂L/∂a)·(∂a/∂z)·(∂z/∂w)
Gradients flow backward layer by layer, each step multiplying by a local derivative.
Gradient of Common Operations
Matrix Multiplication: y = Wx → ∂L/∂W = (∂L/∂y)·x^T, ∂L/∂x = W^T·(∂L/∂y)
Element-wise: y = x ⊙ w → ∂L/∂x = (∂L/∂y) ⊙ w, ∂L/∂w = (∂L/∂y) ⊙ x
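A minimal numpy sketch (the shapes and the toy loss are made up for illustration) checking the y = Wx gradients above against a finite difference:

```python
# Backprop through y = W @ x: dL/dW = (dL/dy) x^T and dL/dx = W^T (dL/dy).
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
x = rng.normal(size=(4,))

y = W @ x
L = 0.5 * np.sum(y**2)      # toy loss, so dL/dy = y

dL_dy = y
dL_dW = np.outer(dL_dy, x)  # (dL/dy) x^T
dL_dx = W.T @ dL_dy         # W^T (dL/dy)

# Finite-difference check for one entry of W
eps = 1e-6
W2 = W.copy(); W2[1, 2] += eps
L2 = 0.5 * np.sum((W2 @ x)**2)
print(dL_dW[1, 2], (L2 - L) / eps)  # should closely agree
```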
Quick Tips for Exams
Practice Problems
Pro Tips
For ML/DL
Focus on Chain Rule, Partial Derivatives, and Gradient - they're the foundation of backpropagation.
For Exams
Memorize the basic derivative and integral tables. Practice identifying which rule to use.
Practice
Do problems daily - calculus needs muscle memory. Start simple, then increase complexity.