NeoGuardianAI

Protecting Users from Phishing Attacks

NeoGuardianAI is a sophisticated machine learning model designed to identify and flag potentially dangerous phishing URLs. With the rise in cyber threats, this tool serves as a digital guardian, helping users navigate the web safely.

The model analyzes various features of a URL to determine if it's legitimate or a phishing attempt, providing real-time protection against cyber threats.

Key Features

High accuracy (96.31%) in detecting phishing URLs
Analyzes multiple URL characteristics for comprehensive detection
Accessible through Hugging Face Spaces and API
User-friendly interface for easy URL checking

How the Model Was Generated

Development Process

1. Data Collection

The model was trained on the pirocheto/phishing-url dataset from Hugging Face, containing thousands of labeled URLs.

2. Feature Engineering

Extracted over 30 features from each URL, including length metrics, domain characteristics, special character counts, and suspicious patterns.

3. Model Selection

After evaluating multiple algorithms, XGBoost was selected for its superior performance in classification tasks and ability to handle complex feature relationships.

4. Training & Optimization

The model was trained with carefully tuned hyperparameters including max depth, learning rate, and regularization to prevent overfitting.

5. Evaluation & Deployment

After rigorous testing and validation, the model was deployed to Hugging Face Hub for public access and integrated into a Gradio web interface.

How NeoGuardianAI Works

URL Analysis Process

When a URL is submitted, NeoGuardianAI performs a comprehensive analysis:

Extracts features from the URL structure
Normalizes and scales the features
Passes the processed data through the XGBoost model
Generates a prediction with confidence score
Returns a user-friendly result indicating safety status
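
The last two steps can be sketched roughly as follows. This is an illustration rather than the deployed code: the helper name format_verdict, the 0.5 threshold, and the label strings are all assumptions, since the page only states that a prediction and a confidence score are returned.

```python
def format_verdict(phishing_probability, threshold=0.5):
    """Turn a model probability into a label plus a confidence score.

    Hypothetical helper: the threshold and label strings are assumptions.
    """
    if not 0.0 <= phishing_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    is_phishing = phishing_probability >= threshold
    # Confidence is the probability of whichever class was predicted.
    confidence = phishing_probability if is_phishing else 1.0 - phishing_probability
    return {
        "label": "phishing" if is_phishing else "legitimate",
        "confidence": round(confidence, 4),
    }


print(format_verdict(0.93))
print(format_verdict(0.08))
```

Reporting the probability of the predicted class, rather than always the phishing probability, keeps the score readable for both outcomes.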

Key Features Analyzed

URL length and structure
Domain age and registration information
Presence of suspicious keywords
Special character frequency and distribution
TLD (Top-Level Domain) reputation
Presence of IP addresses in URL
Redirection patterns
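
A few of these features can be computed from the URL string alone. The sketch below is a minimal illustration using only the Python standard library; the function name extract_url_features, the feature names, and the keyword list are assumptions and not the project's actual feature set. Domain age, TLD reputation, and redirection patterns need external lookups (for example WHOIS data or an HTTP request) and are left out.

```python
import ipaddress
import re
from urllib.parse import urlparse

# Assumed keyword list, for illustration only.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update")


def extract_url_features(url):
    """Compute a few lexical features from a URL string (hypothetical subset)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # Check whether the host is a literal IP address instead of a domain name.
    try:
        ipaddress.ip_address(host)
        has_ip = True
    except ValueError:
        has_ip = False

    lowered = url.lower()
    return {
        "url_length": len(url),
        "hostname_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_special_chars": len(re.findall(r"[^A-Za-z0-9]", url)),
        "has_ip_address": int(has_ip),
        "uses_https": int(parsed.scheme == "https"),
        "num_suspicious_keywords": sum(kw in lowered for kw in SUSPICIOUS_KEYWORDS),
        # Last label of the hostname; empty for IP hosts or dotless names.
        "tld": host.rsplit(".", 1)[-1] if "." in host and not has_ip else "",
    }


features = extract_url_features("http://192.168.0.1/secure-login/verify.php")
print(features)
```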

Model Architecture

XGBoost Classifier

A gradient boosting framework that builds an ensemble of decision trees to create a highly accurate prediction model.

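from xgboost import XGBClassifier

model = XGBClassifier(
    max_depth=5,                  # maximum depth of each tree
    learning_rate=0.1,            # shrinkage applied to each tree's contribution
    n_estimators=100,             # number of boosting rounds (trees)
    subsample=0.8,                # fraction of training rows sampled per tree
    colsample_bytree=0.8,         # fraction of features sampled per tree
    gamma=0.1,                    # minimum loss reduction required to split a node
    objective='binary:logistic',  # binary classification with probability output
    eval_metric='logloss'         # metric reported during evaluation
)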

Feature Processing

StandardScaler is used to normalize features, ensuring all inputs have similar scale for optimal model performance.
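
To make the scaling step concrete, the sketch below reproduces the z-score transform, z = (x - mean) / std, in plain Python. It is an illustration of the idea, not the project's preprocessing code: the function names and the toy feature rows are assumptions. Like scikit-learn's StandardScaler, it uses the population standard deviation and leaves constant columns unscaled.

```python
from statistics import fmean, pstdev


def fit_standardizer(rows):
    """Learn per-column mean and standard deviation from training rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation; a constant column gets scale 1.0
    # so that dividing by it is safe.
    scales = [pstdev(col) or 1.0 for col in columns]
    return means, scales


def standardize(row, means, scales):
    """Apply z = (x - mean) / std to one feature row."""
    return [(x - m) / s for x, m, s in zip(row, means, scales)]


# Toy feature rows (assumed): [url_length, num_dots, has_ip_address]
train = [[20.0, 1.0, 0.0], [80.0, 4.0, 0.0], [50.0, 1.0, 0.0]]
means, scales = fit_standardizer(train)
print(standardize([50.0, 2.0, 0.0], means, scales))
```

The mean and scale are learned once from the training data and then reused unchanged for every URL scored later, so a new input is measured against the same reference as the training set.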

Decision Process

The model combines multiple decision trees, with each new tree correcting errors made by previous trees, resulting in high accuracy predictions.
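
The idea of each tree correcting the errors of the previous ones can be shown with a toy boosting loop. This is a deliberately simplified sketch and not XGBoost itself: it uses one-split trees (stumps) and squared error, whereas the configuration above lists the binary:logistic objective, and XGBoost additionally applies regularization and second-order gradient information. All names and the toy data are assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on xs that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        error = sum((r - left_value) ** 2 for r in left) + sum(
            (r - right_value) ** 2 for r in right
        )
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def predict(stumps, learning_rate, x):
    """Sum the scaled outputs of all stumps for one input."""
    return sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )


def boost(xs, ys, n_rounds=20, learning_rate=0.1):
    """Fit stumps one at a time, each on the errors left by the ensemble so far."""
    stumps, errors = [], []
    for _ in range(n_rounds):
        residuals = [y - predict(stumps, learning_rate, x) for x, y in zip(xs, ys)]
        # Mean squared error of the ensemble before this round's stump is added.
        errors.append(sum(r * r for r in residuals) / len(residuals))
        stumps.append(fit_stump(xs, residuals))
    return stumps, errors


# Toy data: one feature (e.g. URL length), label 1 = phishing.
xs = [12, 18, 25, 40, 55, 70, 90, 120]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
stumps, errors = boost(xs, ys)
print(f"error before round 1: {errors[0]:.3f}, before round 20: {errors[-1]:.3f}")
```

Because every stump is fitted to the residuals of the current ensemble and scaled by the learning rate, the training error shrinks a little with each round instead of being fitted in one step.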

Model Performance

Accuracy

96.31%

Precision

96.00%

Recall

96.66%

F1 Score

96.33%

Performance Analysis

NeoGuardianAI achieves exceptional performance across all key metrics, making it highly reliable for phishing URL detection:

Balanced Performance

The close values of precision (96.00%) and recall (96.66%) indicate that the model is well balanced: it keeps both false positives and false negatives low.

Real-World Effectiveness

The high F1 score (96.33%), the harmonic mean of precision and recall, suggests the model is effective in real-world scenarios where both precision and recall are important.
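
For reference, the four metrics follow directly from confusion-matrix counts. The sketch below shows the standard formulas; the function name and the example counts are hypothetical and are not the project's evaluation results. The final lines check that the reported F1 score is consistent with the reported precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, chosen only to illustrate the formulas.
print(classification_metrics(tp=90, fp=5, fn=10, tn=95))

# Consistency check using the precision and recall reported above.
precision, recall = 0.9600, 0.9666
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.2%}")
```

With precision 96.00% and recall 96.66%, the harmonic mean rounds to the 96.33% F1 score listed above.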

Comparison to Industry Standards

NeoGuardianAI's performance exceeds many commercial phishing detection solutions, which typically achieve 85-90% accuracy.

How to Use NeoGuardianAI

Web Interface

The easiest way to use NeoGuardianAI is through the Hugging Face Spaces web interface:

Visit the NeoGuardianAI Space
Enter a URL in the input field
Click "Check URL"
View the prediction result and confidence score

API Integration

For developers, NeoGuardianAI can be integrated into applications using the Hugging Face Inference API:

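import requests

# Inference API endpoint for the model and the authorization header it expects.
API_URL = "https://api-inference.huggingface.co/models/Devishetty100/neoguardianai"
headers = {"Authorization": "Bearer YOUR_API_TOKEN"}

def query(url):
    """Send a URL to the model endpoint and return the decoded JSON response."""
    payload = {"inputs": url}
    # A timeout keeps the call from hanging if the endpoint is slow to respond.
    response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
    # Raise on HTTP errors (e.g. an invalid token) instead of parsing an error body.
    response.raise_for_status()
    return response.json()

result = query("https://example.com")
print(result)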

Replace YOUR_API_TOKEN with your Hugging Face API token. The API returns a prediction and confidence score for the provided URL.