To demonstrate logistic regression I'll use the Titanic dataset from the Seaborn library. I'll load the dataset, carry out some basic preprocessing, and then build and evaluate a logistic regression model.

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split

# Load the Titanic dataset
titanic = sns.load_dataset('titanic')

# Drop rows with missing values for simplicity
titanic.dropna(subset=['age', 'embarked'], inplace=True)

# Convert categorical columns to numerical values
titanic['sex'] = titanic['sex'].map({'male': 0, 'female': 1})
titanic['embarked'] = titanic['embarked'].map({'C': 0, 'Q': 1, 'S': 2})

# Select features and target variable
X = titanic[['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked']]
y = titanic['survived']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

## Training the Model

```python
from sklearn.linear_model import LogisticRegression

# Build the logistic regression model
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train, y_train)

# Make predictions on the test set
y_pred = log_reg.predict(X_test)
```

During training, the model adjusts its parameters (coefficients and intercept) based on the training data. The goal is to find the coefficients that minimize the error between the predicted probabilities and the actual labels.
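The error being minimized here is the log-loss (binary cross-entropy). As a minimal sketch, using toy labels and probabilities rather than the Titanic model, it can be computed by hand and checked against scikit-learn:

```python
import numpy as np
from sklearn.metrics import log_loss

# Toy example: true labels and predicted probabilities of class 1
y_true = np.array([0, 1, 1, 0])
y_prob = np.array([0.1, 0.8, 0.6, 0.3])

# Binary cross-entropy: -mean(y*log(p) + (1-y)*log(1-p))
manual = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

print(round(manual, 6))                   # 0.299001
print(round(log_loss(y_true, y_prob), 6))  # 0.299001
```

Lower values mean the predicted probabilities sit closer to the true labels; training searches for the coefficients that drive this quantity down.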

For binary classification (e.g., predicting survival on the Titanic dataset), logistic regression computes a decision boundary from the learned coefficients. This boundary separates the feature space into regions where one class is more likely than the other.
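Concretely, the predicted probability is the sigmoid of a linear combination of the features, and the decision boundary is where that linear combination equals zero. A minimal sketch on synthetic data (the dataset and variable names here are illustrative, not part of the Titanic code above):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Small synthetic binary classification problem
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Manually reproduce predict_proba: sigmoid(X @ coef + intercept)
z = X @ model.coef_.ravel() + model.intercept_[0]
manual_probs = 1 / (1 + np.exp(-z))

# Matches scikit-learn's probability of class 1
assert np.allclose(manual_probs, model.predict_proba(X)[:, 1])

# The decision boundary is z == 0, i.e. probability 0.5
preds = (manual_probs >= 0.5).astype(int)
assert np.array_equal(preds, model.predict(X))
```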

## Model Evaluation

```python
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

print(f"Accuracy: {accuracy}")
print("Confusion Matrix:")
print(conf_matrix)
print("Classification Report:")
print(class_report)
```

```
Accuracy: 0.6293706293706294
Confusion Matrix:
[[73 14]
 [39 17]]
Classification Report:
              precision    recall  f1-score   support

           0       0.65      0.84      0.73        87
           1       0.55      0.30      0.39        56

    accuracy                           0.63       143
   macro avg       0.60      0.57      0.56       143
weighted avg       0.61      0.63      0.60       143
```

**Accuracy:** The model achieved an accuracy of 0.63, meaning it correctly predicted roughly 63% of instances overall.

**Confusion Matrix:**

- The model correctly predicted 73 instances of class 0 (did not survive) and 17 instances of class 1 (survived).
- However, it misclassified 39 actual class-1 instances as class 0 (false negatives) and 14 actual class-0 instances as class 1 (false positives).

**Class 0 (did not survive):**

- Precision: 65%
- Out of all instances predicted as class 0, 65% were actually class 0.
- Recall: 84%
- Of all instances that were actually class 0, the model correctly identified 84%.

**Class 1 (survived):**

- Precision: 55%
- Out of all instances predicted as class 1, 55% were actually class 1.
- Recall: 30%
- Of all instances that were actually class 1, the model correctly identified 30%.
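These per-class figures follow directly from the confusion matrix. As a quick check, precision and recall can be recomputed by hand from the matrix entries:

```python
import numpy as np

# Confusion matrix from the evaluation above: rows = actual, columns = predicted
cm = np.array([[73, 14],
               [39, 17]])

# Precision: correct predictions of a class / all predictions of that class (column sum)
precision_0 = cm[0, 0] / cm[:, 0].sum()  # 73 / (73 + 39)
precision_1 = cm[1, 1] / cm[:, 1].sum()  # 17 / (14 + 17)

# Recall: correct predictions of a class / all actual members of that class (row sum)
recall_0 = cm[0, 0] / cm[0, :].sum()     # 73 / (73 + 14)
recall_1 = cm[1, 1] / cm[1, :].sum()     # 17 / (39 + 17)

print(round(precision_0, 2), round(recall_0, 2))  # 0.65 0.84
print(round(precision_1, 2), round(recall_1, 2))  # 0.55 0.3
```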

The ROC AUC (Receiver Operating Characteristic Area Under the Curve) is an important metric for assessing binary classification models like logistic regression. It evaluates the model's ability to distinguish between classes by plotting the true positive rate (sensitivity) against the false positive rate (1 - specificity) at various threshold settings. A higher ROC AUC score, ranging from 0 to 1, indicates better model performance, with 1 representing perfect discrimination and 0.5 indicating random guessing.

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Calculate probabilities for the ROC curve
y_probs = log_reg.predict_proba(X_test)[:, 1]

# Calculate ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_probs)

# Calculate AUC score
auc = roc_auc_score(y_test, y_probs)
print(f"AUC: {auc}")

# Plot ROC curve
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='blue', lw=2, label=f'ROC Curve (AUC = {auc:.2f})')
plt.plot([0, 1], [0, 1], color='gray', linestyle='--', lw=1)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc='lower right')
plt.grid(True)
plt.show()
```

`AUC: 0.5697865353037767`