/ blog

# Logistic Regression

Supervised learning used for classification and especially binary classification.

Logistic (sigmoid) function for yes/no threshold:

P^i = o(z^i)

A(z) = 1/(1+e^-z)

z^i = W * X + b

There is also softmax function for multiple categorical. Also SVM and OVR.

You fit logistic regression by maximizing the likelihood function. This means you are maximizing the probability of classifying all categories correctly. You can then determine the coefficient values of the parameters.

Use gradient decent to maximize parameters by comparing predicted value to real value. Learning rate alpha is hyper parameter.

Newton's method has less iterations for same learning rate, but takes more memory.

# Binary classification

This can be expressed in a table called a confusion matrix.

Can measure accuracy, true positive rate, true negative rate, positive predictive value, false positive rate, false negative rate, F1: harmonic mean of precision and recall, AUC/ROC, and cross entropy.